-
Notifications
You must be signed in to change notification settings - Fork 187
Retry & Backoff #364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Retry & Backoff #364
Conversation
phausler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally this is an interesting idea of a set of algorithms to include, I think there are some ergonomics improvements that can be made and generally some API cleanup but it is an interesting set of additions.
|
@phausler Very valid and good points raised. Thank you for that. Exactly what I was hoping for. |
|
@phausler Fully embracing stateful backoffs was a good call. I adapted the MR according to your feedback. Would you mind taking a look again? Thanks. The only thing I am still unhappy with is that with this approach we have to copy the RNG. Can't come up with a way around this unfortunately. |
| clock: ClockType, | ||
| isolation: isolated (any Actor)? = #isolation, | ||
| operation: () async throws(ErrorType) -> sending Result, | ||
| strategy: (ErrorType) -> RetryStrategy<ClockType.Instant.Duration> = { _ in .backoff(.zero) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we make RetryStrategy hold this closure directly and become generic itself? That allows for easier sharing of strategies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is an interesting use case. I will investigate.
This comment was marked as outdated.
This comment was marked as outdated.
| self.factor = factor | ||
| } | ||
| @inlinable mutating func nextDuration() -> Duration { | ||
| defer { current *= factor } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this multiplication can overflow when retrying many times. I just saw it happen with .exponential(factor: 2, initial: .seconds(5)). This should check for overflows and cap the value to the max of the duration type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobergj Although this can overflow at some point it is not realistic, in my opinion.
The last duration before the multiplication (with factor 2 and initial 5 seconds) overflows is 1.46 trillion years, if I did the math correct.
It is generally a good idea to cap exponential backoff at a maximum which makes sense for your use case.
Would you mind sharing what you were trying to do? Maybe I should add a recommendation to the docs of exponential backoff that it should be combined with a .max modifier...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the incomplete report. On iOS < 18 I am doing this:
Backoff
.exponential(factor: 2, initial: .seconds(5))
.maximum(.seconds(120))
It turns out the .maximum is part of the issue. Yes, with a pure exponential backoff, you may have to wait a trillion years. But with this setup the exponential keeps growing, while the waiting time is much shorter due to the .maximum. This overflows after about 65 calls to nextDuration, which is after a total retry time of just over an hour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bobergj Yikes, now I get it. Thanks for the report.
I'll either implement what you suggested or the max-modifier should stop calling its base backoff strategy once maximum was reached.
No docs nor a pitch nor unit tests, yet. But I am looking for early feedback on the interface of the proposed algorithm.
Usage would look something like this:
(Of course nothing in this algorithm is limited to networking, in fact I have used this multiple times with Bluetooth implementations, but network code is more widely used, so I used this in my examples...)
Retry on an
AsyncSequencecould be a future direction, although I have not yet found a use case for this.Thank you.