Skip to content

Conversation

@britaniar
Copy link
Collaborator

@britaniar britaniar commented Apr 22, 2025

Description of your changes

Fixes #

I have: skipped creating an empty work object when no resources have been selected.

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

The CRP controller relies exclusively on work status to populate the "applied" condition on CRP. There is a special case that no resource is selected. Currently, the work generator creates an empty work object to accommodate this behavior.

New UX outcome:

Name:         crp
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  placement.kubernetes-fleet.io/v1
Kind:         ClusterResourcePlacement
Metadata:
  Creation Timestamp:  2025-05-06T20:13:13Z
  Finalizers:
    kubernetes-fleet.io/crp-cleanup
    kubernetes-fleet.io/scheduler-cleanup
  Generation:        2
  Resource Version:  19474938
  UID:               e3de5d2e-38be-4ca4-a879-a33ca2c069ff
Spec:
  Resource Selectors:
    Group:
    Kind:                  Namespace
    Name:                  doesNotExist
    Version:               v1
  Revision History Limit:  15
  Strategy:
    Rolling Update:
      Max Surge:                   25%
      Max Unavailable:             25%
      Unavailable Period Seconds:  60
    Type:                          RollingUpdate
Status:
  Conditions:
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                found all cluster needed as specified by the scheduling policy, found 1 cluster(s)
    Observed Generation:    2
    Reason:                 SchedulingPolicyFulfilled
    Status:                 True
    Type:                   ClusterResourcePlacementScheduled
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                All 1 cluster(s) start rolling out the latest resource
    Observed Generation:    2
    Reason:                 RolloutStarted
    Status:                 True
    Type:                   ClusterResourcePlacementRolloutStarted
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                No override rules are configured for the selected resources
    Observed Generation:    2
    Reason:                 NoOverrideSpecified
    Status:                 True
    Type:                   ClusterResourcePlacementOverridden
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                Works(s) are succcesfully created or updated in 1 target cluster(s)' namespaces
    Observed Generation:    2
    Reason:                 WorkSynchronized
    Status:                 True
    Type:                   ClusterResourcePlacementWorkSynchronized
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                Previous resources are deleted. No new resources are selected for the placement
    Observed Generation:    2
    Reason:                 ApplySucceeded
    Status:                 True
    Type:                   ClusterResourcePlacementApplied
    Last Transition Time:   2025-05-06T20:13:33Z
    Message:                No new resources are selected for the placement. Please update crp to select new resources
    Observed Generation:    2
    Reason:                 NoResourcesSelected
    Status:                 True
    Type:                   ClusterResourcePlacementAvailable
  Observed Resource Index:  0
  Placement Statuses:
    Cluster Name:  member1
    Conditions:
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               Successfully scheduled resources for placement in "member1" (affinity score: 0, topology spread score: 0): picked by scheduling policy
      Observed Generation:   2
      Reason:                Scheduled
      Status:                True
      Type:                  Scheduled
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               Detected the new changes on the resources and started the rollout process
      Observed Generation:   2
      Reason:                RolloutStarted
      Status:                True
      Type:                  RolloutStarted
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               No override rules are configured for the selected resources
      Observed Generation:   2
      Reason:                NoOverrideSpecified
      Status:                True
      Type:                  Overridden
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               All of the works are synchronized to the latest
      Observed Generation:   2
      Reason:                AllWorkSynced
      Status:                True
      Type:                  WorkSynchronized
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               Work has been deleted. No new work objects need to be applied.
      Observed Generation:   2
      Reason:                AllWorkHaveBeenApplied
      Status:                True
      Type:                  Applied
      Last Transition Time:  2025-05-06T20:13:33Z
      Message:               No resources selected. 
      Observed Generation:   2
      Reason:                NoResourcesSelected
      Status:                True
      Type:                  Available
Events:
  Type    Reason                        Age   From                                   Message
  ----    ------                        ----  ----                                   -------
  Normal  PlacementRolloutStarted       2s    cluster-resource-placement-controller  Started rolling out the latest resources
  Normal  PlacementOverriddenSucceeded  2s    cluster-resource-placement-controller  Placement has been successfully overridden
  Normal  PlacementWorkSynchronized     2s    cluster-resource-placement-controller  Work(s) have been created or updated successfully for the selected cluster(s)
  Normal  PlacementApplied              2s    cluster-resource-placement-controller  Resources have been applied to the selected cluster(s)
  Normal  PlacementAvailable            2s    cluster-resource-placement-controller  Resources are available on the selected cluster(s)
  Normal  PlacementRolloutCompleted     2s    cluster-resource-placement-controller  Placement has finished the rollout process and reached the desired status

@britaniar britaniar marked this pull request as ready for review April 23, 2025 00:44
}
if len(simpleManifests) == 0 {
klog.V(2).InfoS("the snapshot contains no resource to apply either because of override or enveloped resources", "snapshot", klog.KObj(snapshot))
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit confused: from the comments, it seems that crp always depends on work to populate the status of the placement. And now, for CRP w/o any resources, we do not generate a work obj, what's gonna happen on CRP status?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRP Status will still be available. We are thinking of updating UX that will change CRP status.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without a work, the binding status will not be updated thus CRP reconcile not triggered, I am not entirely sure how it will work

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the changes I implemented, workUpdated will be false here.
Since it is false, we will set the binding status.
Within the function setBindingStatus, we check to see the apply strategy and set the conditions here.
Since there are no works, the default for the condition will be set which is True. This is where we can update the UX interface to show whether the resources were selected.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see two scenarios where workUpdated can be true in this case:

  1. With an older version of hub-agent without your change, an empty work will be generated and after upgrading the image with your change, that empty work object will be deleted by code below on line 546-562 and workUpdated becomes true.
  2. When there are multiple resourceSnapshots generated or there's a configmap-wrapped resource, there will be other work obj generated and thus workUpdated also becomes true.
    Could you please verify these two cases and add tests?

@codecov
Copy link

codecov bot commented Apr 29, 2025

Codecov Report

Attention: Patch coverage is 79.51807% with 17 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pkg/controllers/workgenerator/controller.go 76.38% 15 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

}
if len(simpleManifests) == 0 {
klog.V(2).InfoS("the snapshot contains no resource to apply either because of override or enveloped resources", "snapshot", klog.KObj(snapshot))
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see two scenarios where workUpdated can be true in this case:

  1. With an older version of hub-agent without your change, an empty work will be generated and after upgrading the image with your change, that empty work object will be deleted by code below on line 546-562 and workUpdated becomes true.
  2. When there are multiple resourceSnapshots generated or there's a configmap-wrapped resource, there will be other work obj generated and thus workUpdated also becomes true.
    Could you please verify these two cases and add tests?

Copy link
Collaborator

@zhiying-lin zhiying-lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@britaniar could you please update the PR description to show me the final crp status condition when no resources are selected? We need to agree on the user experience/api first . I remember ryan mentioned that cx is confused by the applied/available true status. btw, wantong is working on updating the CRP conditions and code may have conflicts.

@britaniar britaniar force-pushed the skipWorkObject branch 5 times, most recently from 83e637e to 3664bbb Compare May 2, 2025 00:38
case rpsSetCondTypeCounter[i][condition.FalseConditionStatus] > 0:
// There is at least one False condition of the given type being set on the per cluster placement statuses.
crp.SetConditions(i.FalseClusterResourcePlacementCondition(crp.Generation, rpsSetCondTypeCounter[i][condition.FalseConditionStatus]))
if len(crp.Status.SelectedResources) == 0 && i == condition.AppliedCondition {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current behavior is when no resource is selected, the work obj still gets created, and all clusters as well as the crp shows Applied/Available to true. We are changing this behavior. Is this what we want?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we want to change it from it being Applied/Available is true because the cx gets confused.

Copy link
Collaborator

@zhiying-lin zhiying-lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i recall that why we treat the applied & available originally as true. There will be a case, cx first applied a namespace, and then he/she wants to delete this namespace in all members but keeps the crp.

CRP still needs to apply the resources (delete the resources). I'm thinking of changing the applied & available reason to something else, instead of marking them as false, and it also simplifies our alert.

@britaniar britaniar force-pushed the skipWorkObject branch 2 times, most recently from 59dbd2b to ff84430 Compare May 6, 2025 20:06
britaniar added 2 commits May 6, 2025 15:16
Signed-off-by: Britania Rodriguez Reyes <britaniar@microsoft.com>
Signed-off-by: Britania Rodriguez Reyes <britaniar@microsoft.com>
keep available/applied condition true and update CRP status message/reason

Signed-off-by: Britania Rodriguez Reyes <britaniar@microsoft.com>
Copy link
Collaborator

@zhiying-lin zhiying-lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's split the PRs into 6 smaller PRs and add tests to catch the regression:

  1. Fix the bug to track the deleting works, set the binding applied condition is false when work is still in the deleting state. https://github.com/kubefleet-dev/kubefleet/blob/main/pkg/controllers/workgenerator/controller.go#L1010
  2. Test to see what's the behavior our drift/reportDiff feature when no resource are selected. Add tests if missing.
  3. Add e2e tests when resource is not selected
  4. update code to not create work when resource is not selected
  5. update the work binding condition message & reason when resource is not selected
  6. update the crp condition message & reason when resource is not selected

}
case condition.AppliedCondition:
if len(crp.Status.SelectedResources) == 0 {
cond.Message = "Previous resources are deleted. No new resources are selected for the placement"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

``

Suggested change
cond.Message = "Previous resources are deleted. No new resources are selected for the placement"
cond.Message = "No new resources are selected for the placement and deleted applied resources if exists"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change the reason to "NoResourcesSelectedReason" too?

case condition.AvailableCondition:
if len(crp.Status.SelectedResources) == 0 {
cond.Reason = condition.NoResourcesSelectedReason
cond.Message = "No new resources are selected for the placement. Please update crp to select new resources"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cond.Message = "No new resources are selected for the placement. Please update crp to select new resources"
cond.Message = "No new resources are applied for the placement"

cx can create the resources instead of updating crp.

if syncErr == nil {
// generate and apply the workUpdated works if we have all the works
overrideSucceeded, workUpdated, syncErr = r.syncAllWork(ctx, &resourceBinding, works, &cluster)
overrideSucceeded, workUpdated, resourcesSelected, syncErr = r.syncAllWork(ctx, &resourceBinding, works, &cluster)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please test drift detection/ report diff feature when no resources are selected? i'm not sure if we reply on this empty work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants