Skip to content

Conversation

@michaelawyu
Copy link
Collaborator

Description of your changes

This PR sets up the work applier to abandon the reconciliation loop when the main context exits in a proper way so that partially process manifests will not trigger error unexpectedly.

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

N/A

Special notes for your reviewer

N/A

Signed-off-by: michaelawyu <chenyu1@microsoft.com>
// The workqueue.ParallelizeUntil utility does not return errors even if its context has been
// cancelled (and some availability checks might not be completed yet); to catch such
// premature termination, the work applier checks for context cancellation directly.
if err := ctx.Err(); err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do this check in framework.go and preprocess.go?

If this is applicable everywhere, maybe do this check and return error from

func (p *parallelizer) ParallelizeUntil(ctx context.Context, pieces int, doWork workqueue.DoWorkPieceFunc, operation string)

whenever ctx is cancelled?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, ParallelizeUtil is an interface we own. We can probably return error from there.

Copy link
Collaborator

@weng271190436 weng271190436 Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason that I raised this originally is "can we make it easier for the next person using ParallelizeUntil so that they won't forget to check for context cancellation if we can encapsulate the handling inside the parallelizer package." But it seems that there are some nuances so okay with this approach.

For example, in framework.go we are already handling the context cancellation error separately

Copy link
Collaborator Author

@michaelawyu michaelawyu Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Wei and Wantong! Yeah, actually that was my first attempt as well. The concern I had was that, there are cases where we run ParallelizeUntil in its own child context (primarily for the reason that the child context can be cancelled on its own to terminate the parallelization promptly); and if we do the checking with our wrapper, e.g.,

func (p *parallelizer) ParallelizeUntil(ctx context.Context, pieces int, doWork workqueue.DoWorkPieceFunc, operation string) {
	...
	workqueue.ParallelizeUntil(ctx, p.numOfWorkers, pieces, doWorkWithLogs)

        if err := ctx.Done(); err != nil {...}
}

two possibilities exist:

a) the child context is cancelled willingly by us, and the method returns an error;
b) the parent context gets cancelled by factors outside our control, which in turn cancels the child context, and we also receive an error

The complication is that, we couldn't really tell between the two possibilities by looking at the error, so we still need to inspect the parent context at the caller level to find out why things fail, which kind of defeats the purpose of putting the logic inside the wrapper. This is why in this PR the check is done at the caller level instead at the moment. If there's any concern or things that I've missed, please let me know 🙏

weng271190436
weng271190436 previously approved these changes Nov 20, 2025
Signed-off-by: michaelawyu <chenyu1@microsoft.com>
weng271190436
weng271190436 previously approved these changes Nov 26, 2025
@codecov
Copy link

codecov bot commented Nov 26, 2025

Codecov Report

❌ Patch coverage is 31.81818% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pkg/controllers/workapplier/controller.go 25.00% 4 Missing and 2 partials ⚠️
pkg/controllers/workapplier/process.go 33.33% 4 Missing and 2 partials ⚠️
...kg/controllers/workapplier/availability_tracker.go 40.00% 2 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

Signed-off-by: michaelawyu <chenyu1@microsoft.com>
@michaelawyu michaelawyu merged commit 3881da5 into kubefleet-dev:main Nov 27, 2025
13 of 15 checks passed
@michaelawyu michaelawyu deleted the fix/work-applier-abandon-when-context-exits branch November 27, 2025 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants