
Conversation

britaniar (Collaborator) commented Nov 5, 2025

Description of your changes

Fixes #

I have:

  • Updated the UpdateRunSpec API to make ResourceSnapshotIndex optional in staged update runs.

  • Updated the UpdateRunStatus API to record ResourceSnapshotIndexUsed.

  • Updated the initialization stage of the staged update run to look up the specific resource snapshot when ResourceSnapshotIndex is defined; otherwise it falls back to the latest resource snapshot (see the sketch after this checklist).

  • Added an integration test case where relevant.

  • Updated e2e and integration tests to include the resource snapshot index used in the update run status checks.

  • Ran `make reviewable` to ensure this PR is ready for review.
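For context, a minimal, self-contained sketch of the fallback behavior described in the checklist above. The function and names below (`resolveSnapshotIndex`, the integer index list) are illustrative stand-ins, not KubeFleet's actual API:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// resolveSnapshotIndex sketches the initialization behavior: when the
// requested index is empty (unset in the spec), fall back to the latest
// existing snapshot index; when set, require that exact snapshot to exist.
func resolveSnapshotIndex(requested string, existing []int) (int, error) {
	if len(existing) == 0 {
		return 0, fmt.Errorf("no resource snapshots found")
	}
	sort.Ints(existing)
	if requested == "" {
		// Index not specified: use the latest resource snapshot.
		return existing[len(existing)-1], nil
	}
	idx, err := strconv.Atoi(requested)
	if err != nil {
		return 0, fmt.Errorf("invalid resourceSnapshotIndex %q: %w", requested, err)
	}
	for _, e := range existing {
		if e == idx {
			return idx, nil
		}
	}
	return 0, fmt.Errorf("resource snapshot with index %d not found", idx)
}

func main() {
	got, _ := resolveSnapshotIndex("", []int{0, 1, 2})
	fmt.Println(got) // 2: the latest snapshot wins when the index is unset
}
```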

How has this code been tested

  • Integration test
  • e2e test

Special notes for your reviewer

codecov bot commented Nov 6, 2025

Codecov Report

❌ Patch coverage is 80.48780% with 8 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/controllers/updaterun/initialization.go | 80.00% | 6 Missing and 2 partials ⚠️ |


britaniar marked this pull request as ready for review November 6, 2025 00:37
Arvindthiru (Collaborator) commented Nov 6, 2025

We also need to account for the resourceSnapshotIndex access here: https://github.com/britaniar/kubefleet/blob/bff950842f01cab0eb0041401ed4ef7134581848/pkg/controllers/updaterun/execution.go#L99.

I think it's better to populate the ResourceSnapshotIndex field in the spec via a defaulter from the latest resource snapshot instead. Ideally, once set, ResourceSnapshotIndex should not change in the middle of an updateRun.

Edit: after offline discussion, it's better to store resourceSnapshotName as a field on the updateRun status instead; we don't want to change the spec since it's a user-facing object.
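A hedged sketch of what recording this on status could look like; the field name follows the ResourceSnapshotIndexUsed mentioned in the PR description, and the surrounding type is illustrative, not KubeFleet's actual API:

```go
// Illustrative only; the real update run status type in KubeFleet differs.
type UpdateRunStatus struct {
	// ResourceSnapshotIndexUsed records which resource snapshot the run
	// resolved at initialization (explicit from the spec, or the latest
	// snapshot as a fallback). Keeping it in status pins the run to one
	// snapshot without mutating the user-facing spec.
	ResourceSnapshotIndexUsed string `json:"resourceSnapshotIndexUsed,omitempty"`
}
```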

britaniar force-pushed the updateStagedUpdateRunResourceIndex branch from e34c1db to 1c44de9 November 7, 2025 18:21
Arvindthiru (Collaborator) commented Nov 11, 2025

Can we add a test where we have multiple resource snapshots and the index is not specified, and ensure the latest resource snapshot is picked for the updateRun?
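A self-contained sketch of that scenario, exercising the fallback helper sketched earlier in this thread (`resolveSnapshotIndex` is illustrative, not a real KubeFleet function):

```go
package main

import "testing"

// TestLatestSnapshotPickedWhenIndexUnset covers the scenario above:
// multiple resource snapshots exist, no index is specified, and the
// latest snapshot must be selected.
func TestLatestSnapshotPickedWhenIndexUnset(t *testing.T) {
	existing := []int{0, 1, 2} // three snapshot generations
	got, err := resolveSnapshotIndex("", existing)
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if got != 2 {
		t.Errorf("want latest index 2, got %d", got)
	}
}
```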

britaniar force-pushed the updateStagedUpdateRunResourceIndex branch from 7a60a5b to 8e4673e November 12, 2025 02:15
Arvindthiru previously approved these changes Nov 12, 2025
…ex is unset

Signed-off-by: Britania Rodriguez Reyes <britaniar@microsoft.com>
britaniar force-pushed the updateStagedUpdateRunResourceIndex branch from 2051ffa to 32bd918 November 13, 2025 19:47
```go
})
Expect(k8sClient.Status().Update(ctx, policySnapshot)).Should(Succeed(), "failed to update the policy snapshot condition")

By("Creating the member clusters")
```

Do you need to recreate the member clusters and resource bindings?


It("Should rollout resources to all the members and complete the staged update run successfully", func() {
surSucceededActual := stagedUpdateRunStatusSucceededActual(updateRunNames[0], testNamespace, policySnapshotIndex1st, len(allMemberClusters), defaultApplyStrategy, &strategy.Spec, [][]string{{allMemberClusterNames[1]}, {allMemberClusterNames[0], allMemberClusterNames[2]}}, nil, nil, nil)
surSucceededActual := stagedUpdateRunStatusSucceededActual(updateRunNames[0], testNamespace, resourceSnapshotIndex1st, policySnapshotIndex1st, len(allMemberClusters), defaultApplyStrategy, &strategy.Spec, [][]string{{allMemberClusterNames[1]}, {allMemberClusterNames[0], allMemberClusterNames[2]}}, nil, nil, nil)

We need an e2e for not specifying the resourceIndex.

```go
err := fmt.Errorf("no latest resourceSnapshots found for placement `%s`. This might be a transient state, need retry", placementKey)
klog.ErrorS(err, "No latest resourceSnapshots found for placement. This might be transient, need retry", "placement", placementKey, "updateRun", updateRunRef)
// retryable error.
return resourceSnapshotObjs, fmt.Errorf("%w: %s", errInitializedFailed, err.Error())
```

Retriable errors should NOT be wrapped with errInitializedFailed.
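A minimal sketch of the distinction (the sentinel name matches the snippet above; the control flow is illustrative): returning the plain error leaves it retriable, while wrapping with errInitializedFailed would let errors.Is mark initialization as permanently failed.

```go
package main

import (
	"errors"
	"fmt"
)

// errInitializedFailed mirrors the sentinel in the snippet above.
var errInitializedFailed = errors.New("updateRun initialization failed")

// fetchSnapshots simulates the lookup: missing snapshots may be transient,
// so the error is returned unwrapped to signal "retry", not "abort".
func fetchSnapshots(found bool) error {
	if !found {
		return fmt.Errorf("no latest resourceSnapshots found; transient, retry")
	}
	return nil
}

func main() {
	err := fetchSnapshots(false)
	// The caller only aborts the run when the sentinel is present.
	fmt.Println("permanent failure?", errors.Is(err, errInitializedFailed)) // false: retry
}
```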

```diff
 // The index represents a group of resource snapshots that includes all the resources a ResourcePlacement selected.
-// +kubebuilder:validation:Required
 // +kubebuilder:validation:XValidation:rule="self == oldSelf",message="resourceSnapshotIndex is immutable"
+// +kubebuilder:validation:Optional
```

Did you test this in an e2e environment? Please paste your test updateRun object (including both spec and status) in the PR description.
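For reference, a hedged sketch of how the spec field might read with the markers from the snippet above; the struct and JSON tag are illustrative, not the exact KubeFleet definition:

```go
// Illustrative only; the real staged update run spec type in KubeFleet differs.
type UpdateRunSpec struct {
	// ResourceSnapshotIndex is optional; when unset, the controller resolves
	// the latest resource snapshot at initialization and records the choice
	// in status. The CEL rule keeps the field immutable once it is set.
	// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="resourceSnapshotIndex is immutable"
	// +kubebuilder:validation:Optional
	ResourceSnapshotIndex string `json:"resourceSnapshotIndex,omitempty"`
}
```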
