Skip to content

Conversation

@ph1ps
Copy link

@ph1ps ph1ps commented Sep 5, 2025

No docs nor a pitch nor unit tests, yet. But I am looking for early feedback on the interface of the proposed algorithm.
Usage would look something like this:

// Example of when backoff is calculated by a client
let rng = SystemRandomNumberGenerator() // or a seeded rng for unit tests
var backoff = Backoff
  .exponential(factor: 2, initial: .seconds(5))
  .fullJitter(using: rng)
  .minimum(.seconds(2))
  .maximum(.seconds(100))

let response = try await retry(maxAttempts: 5) {
  try await URLSession.shared.data(from: url)
} strategy: { error in
  return .backoff(backoff.nextDuration())
}
// Example of when backoff is returned by a server
let response = try await retry(maxAttempts: 5) {
  let (data, response) = try await URLSession.shared.data(from: url)
  if
    let response = response as? HTTPURLResponse,
    response.statusCode == 429,
    let retryAfter = response.value(forHTTPHeaderField: "Retry-After")
  {
   throw TooManyRequestsError(retryAfter: Double(retryAfter)!)
  }
  return (data, response)
} strategy: { error in
  if let error = error as? TooManyRequestsError {
    return .backoff(.seconds(error.retryAfter))
  } else {
    return .stop
  }
}

(Of course nothing in this algorithm is limited to networking, in fact I have used this multiple times with Bluetooth implementations, but network code is more widely used, so I used this in my examples...)

Retry on an AsyncSequence could be a future direction, although I have not yet found a use case for this.

Thank you.

Copy link
Member

@phausler phausler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this is an interesting idea of a set of algorithms to include, I think there are some ergonomics improvements that can be made and generally some API cleanup but it is an interesting set of additions.

@ph1ps
Copy link
Author

ph1ps commented Sep 8, 2025

@phausler Very valid and good points raised. Thank you for that. Exactly what I was hoping for.

@ph1ps
Copy link
Author

ph1ps commented Sep 11, 2025

@phausler Fully embracing stateful backoffs was a good call. I adapted the MR according to your feedback. Would you mind taking a look again? Thanks.

The only thing I am still unhappy with is that with this approach we have to copy the RNG. Can't come up with a way around this unfortunately.

clock: ClockType,
isolation: isolated (any Actor)? = #isolation,
operation: () async throws(ErrorType) -> sending Result,
strategy: (ErrorType) -> RetryStrategy<ClockType.Instant.Duration> = { _ in .backoff(.zero) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we make RetryStrategy hold this closure directly and become generic itself? That allows for easier sharing of strategies.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an interesting use case. I will investigate.

@ph1ps

This comment was marked as outdated.

@ph1ps ph1ps marked this pull request as ready for review September 26, 2025 04:12
self.factor = factor
}
@inlinable mutating func nextDuration() -> Duration {
defer { current *= factor }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this multiplication can overflow when retrying many times. I just saw it happen with .exponential(factor: 2, initial: .seconds(5)). This should check for overflows and cap the value to the max of the duration type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bobergj Although this can overflow at some point it is not realistic, in my opinion.

The last duration before the multiplication (with factor 2 and initial 5 seconds) overflows is 1.46 trillion years, if I did the math correct.

It is generally a good idea to cap exponential backoff at a maximum which makes sense for your use case.

Would you mind sharing what you were trying to do? Maybe I should add a recommendation to the docs of exponential backoff that it should be combined with a .max modifier...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the incomplete report. On iOS < 18 I am doing this:

     Backoff
         .exponential(factor: 2, initial: .seconds(5))
         .maximum(.seconds(120))

It turns out the .maximum is part of the issue. Yes, with a pure exponential backoff, you may have to wait a trillion years. But with this setup the exponential keeps growing, while the waiting time is much shorter due to the .maximum. This overflows after about 65 calls to nextDuration, which is after a total retry time of just over an hour.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bobergj Yikes, now I get it. Thanks for the report.

I'll either implement what you suggested or the max-modifier should stop calling its base backoff strategy once maximum was reached.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants