-
Notifications
You must be signed in to change notification settings - Fork 61
feat: added backoff for job worker #814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: added backoff for job worker #814
Conversation
nloding
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cloned and tested locally, everything works as advertised! great work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @LennartKleymann, I like it 🙇🏼 Can you please resolve the merge conflicts and also apply the backoff supplier to the job streaming? I think then we can also remove the constants we had before (added here #812) wdyt?
After that we should be fine to merge
| /// </summary> | ||
| /// <param name="backoffSupplier">The supplier used to compute the next retry delay in ms.</param> | ||
| /// <returns>The builder for this worker.</returns> | ||
| IJobWorkerBuilderStep3 BackoffSupplier(IBackoffSupplier backoffSupplier); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏼
| this.random = random ?? new Random(); | ||
| } | ||
|
|
||
| public long SupplyRetryDelay(long currentRetryDelay) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔧 Could make a note that this implementation is based on the Java client version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added summary and remarks to the ExponentialBackoffSupplier.cs
Client/Impl/Worker/JobWorker.cs
Outdated
| catch (Exception ex) | ||
| { | ||
| logger?.LogWarning(ex, "Backoff supplier failed; falling back to default backoff."); | ||
| var defaultSupplier = new ExponentialBackoffBuilderImpl().Build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔧 we could build this once as constant to always have it available wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved initialization to the constructor to have one default supplier per JobWorker.
If you think of a static instance somewhere can you give me a hint where you would see it?
…-client-csharp into 813-added-backoff-to-job-worker
| { | ||
| LogRpcException(rpcException); | ||
| await Task.Delay(pollInterval, cancellationToken); | ||
| await Backoff(cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed in my commit to backoff now every RpcException in Stream and Poll Jobs
|
@ChrisKujawa can you review again? |
closes #813
Feature: Configurable Exponential Backoff for Job Worker
This PR adds a configurable exponential backoff strategy for job polling retries in the C# client. This new feature helps prevent overwhelming the gateway and aligns the C# client with the behavior of the Java client.
Motivation
When the gateway is under heavy load and returns a
RESOURCE_EXHAUSTEDgRPC error, the current worker's aggressive retry behavior can create a "thundering herd" problem. By adding an exponential backoff, the worker can space out its retries, giving the gateway time to recover and improving overall system resilience.Changes & Usage
IBackoffSupplierandIExponentialBackoffBuilder: These new APIs provide a fluent builder to configure the backoff policy.BackoffSupplier()method on the job worker builder to supply a custom backoff strategy.RESOURCE_EXHAUSTEDerrors and is reset to the initial polling interval after a successful job activation.Backwards Compatibility
Existing worker configurations will continue to function without errors. However, the default polling behavior for these workers is now an exponential backoff with standard values, which differs from the previous fixed retry mechanism. The API remains backward compatible.
Testing
Unit tests have been added to verify the backoff behavior, including the handling of jitter and the monotonic increase of the delay.