-
Notifications
You must be signed in to change notification settings - Fork 135
fix: Retry multiplexed session failures #4280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @sakthivelmanii, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a crucial retry mechanism for multiplexed session creation failures in the Spanner client library. Previously, a single failure during session initialization would cause the library to stall, re-throwing the same error for all subsequent requests. The new logic ensures that session creation is retried upon encountering transient errors (excluding UNIMPLEMENTED errors), significantly improving the client's resilience and preventing service disruption. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a retry mechanism for multiplexed session creation failures, which is a valuable improvement to prevent the client library from stalling. The use of a lock to manage retries and prevent concurrent session creation attempts is a good approach.
I've identified a critical issue where a change intended to be non-blocking has become blocking, which could impact performance and behavior. I've also included a suggestion to improve the robustness of the locking mechanism by using a try-finally block, which is a standard best practice.
The added tests effectively cover the new retry logic for both single and parallel requests. Addressing the identified issues will make this a solid contribution.
| ApiFuture<SessionReference> getCurrentSessionReferenceFuture() { | ||
| return ApiFutures.immediateFuture(getCurrentSessionReference()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of getCurrentSessionReferenceFuture() makes it a blocking method because it directly calls the blocking method getCurrentSessionReference(). This negates the purpose of DelayedMultiplexedSessionTransaction, which relies on a non-blocking future to defer session acquisition. The current implementation will cause createDelayedMultiplexedSessionTransaction() to block until a session is available, which is a significant change in behavior.
The method should be implemented asynchronously to chain the retry logic without blocking. Here's a suggested implementation that preserves the non-blocking nature:
ApiFuture<SessionReference> getCurrentSessionReferenceFuture() {
return ApiFutures.catchingAsync(
this.multiplexedSessionReference.get(),
Throwable.class,
(throwable) -> {
sessionCreationLock.lock();
try {
if (isMultiplexedSessionsSupported()
&& retryingSessionCreation.compareAndSet(false, true)) {
SettableApiFuture<SessionReference> settableApiFuture = SettableApiFuture.create();
asyncCreateMultiplexedSession(settableApiFuture);
multiplexedSessionReference.set(settableApiFuture);
}
} finally {
sessionCreationLock.unlock();
}
return this.multiplexedSessionReference.get();
},
MoreExecutors.directExecutor());
}This change ensures that DelayedMultiplexedSessionTransaction remains non-blocking and that the retry logic is correctly applied to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even current logic will ensure that only DelayedMultiplexedSessionTransaction remains non blocking during getDatabaseClient creation.
...e-cloud-spanner/src/main/java/com/google/cloud/spanner/MultiplexedSessionDatabaseClient.java
Outdated
Show resolved
Hide resolved
75bc04d to
968a849
Compare
|
/gemini review |
f4741b0 to
40c06d0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a retry mechanism for multiplexed session creation failures, which is a valuable improvement for the library's resilience. The changes correctly identify UNIMPLEMENTED errors as non-retryable and use a ReentrantLock to prevent multiple concurrent session creation retries. New test cases have been added to validate this behavior, including scenarios with parallel requests and DEADLINE_EXCEEDED errors. However, there are a couple of areas that could be improved for clarity and efficiency.
| * This flag is set to true if create session RPC is in progress. This flag prevents application | ||
| * from firing two requests concurrently | ||
| */ | ||
| private final AtomicBoolean retryingSessionCreation = new AtomicBoolean(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retryingSessionCreation flag is initialized to true here, but it's immediately set to false in both onSessionReady and onSessionCreateFailure callbacks. Additionally, the compareAndSet(false, true) logic in maybeRetrySessionCreation expects the flag to be false to initiate a new retry.
Initializing it to false would align better with its intended state (not retrying initially) and the subsequent logic.
| private final AtomicBoolean retryingSessionCreation = new AtomicBoolean(true); | |
| private final AtomicBoolean retryingSessionCreation = new AtomicBoolean(false); |
40c06d0 to
05e4618
Compare
Description:
Currently when multiplexed session fails with any error, we are storing the exception in the session reference and re-throwing that error to all the subsequent requests. This will cause the library to stall since no further requests will be processed successfully. It's a general expectation that all RPC requests are expected due to CPU, Network, GFE and other factors.
Proposed solution:
If session creation failed with an any error, it should retry the error. So we will be checking the following before initiating the retry.
Old Behaviour:
Updated behaviour:
State - 1
State - 2
State - 3