-
Notifications
You must be signed in to change notification settings - Fork 91
feat: Add Redis lock strategy #4617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feat/nonce-ordering-with-locks
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (65.91%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## feat/nonce-ordering-with-locks #4617 +/- ##
=================================================================
Coverage ? 94.11%
=================================================================
Files ? 133
Lines ? 21232
Branches ? 1719
=================================================================
Hits ? 19982
Misses ? 1229
Partials ? 21
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
packages/relay/src/lib/services/lockService/RedisLockStrategy.ts
Outdated
Show resolved
Hide resolved
packages/relay/src/lib/services/lockService/RedisLockStrategy.ts
Outdated
Show resolved
Hide resolved
packages/relay/tests/lib/services/lockService/RedisLockStrategy.spec.ts
Outdated
Show resolved
Hide resolved
packages/relay/tests/lib/services/lockService/RedisLockStrategy.spec.ts
Outdated
Show resolved
Hide resolved
packages/relay/tests/lib/services/lockService/RedisLockStrategy.spec.ts
Outdated
Show resolved
Hide resolved
quiet-node
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some concerns and comments
docs/configuration.md
Outdated
| | `USE_ASYNC_TX_PROCESSING` | "true" | Set to `true` to enable `eth_sendRawTransaction` to return the transaction hash immediately after passing all prechecks, while processing the transaction asynchronously in the background. | | ||
| | `LOCK_MAX_HOLD_MS` | "30000" | Maximum time in milliseconds a lock can be held before automatic force-release. This TTL prevents deadlocks when transaction processing hangs or crashes. Default is 30 seconds. | | ||
| | `LOCK_QUEUE_POLL_INTERVAL_MS` | "50" | Polling interval in milliseconds for Redis lock queue checks. Lower values reduce latency but increase Redis load. Default is 50ms. | | ||
| | `LOCK_REDIS_PREFIX` | "lock" | Redis key prefix for lock-related keys. Useful for namespace isolation in shared Redis instances. Default is "lock". | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this as a configuration? I think a hardcoded value inside Redis Lock strategy is good enough, right?
packages/relay/src/lib/services/lockService/RedisLockStrategy.ts
Outdated
Show resolved
Hide resolved
| async acquireLock(address: string): Promise<string> { | ||
| const normalizedAddress = this.normalizeAddress(address); | ||
| const sessionKey = randomUUID(); | ||
| const lockKey = this.getLockKey(normalizedAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we’ve had this conversation before, right? Not sure when but anywho, requiring the key to be normalized before being passed in adds unnecessary complexity and risks producing different types of keys for different operations, since it allows unnormalized parameters to be passed. It also leads to repeated calls to this.normalizeAddress() at the call sites.
Would it make sense to handle normalization inside getLockKey? This way, whenever we call this.getLockKey() with any type of parameter, it would automatically be normalized, ensuring that getLockKey() reliably produces a lock key that is always consistent and accurate, regardless of the case sensitivity of the input.
| const normalizedAddress = this.normalizeAddress(address); | ||
| const sessionKey = randomUUID(); | ||
| const lockKey = this.getLockKey(normalizedAddress); | ||
| const queueKey = this.getQueueKey(normalizedAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for getQueueKey, I think it would be more intuitive to have normalization inside getQueueKey.
packages/relay/src/lib/services/lockService/RedisLockStrategy.ts
Outdated
Show resolved
Hide resolved
| }); | ||
|
|
||
| if (result === 1) { | ||
| this.logger.info(`Lock released: address=${normalizedAddress}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for this log would it make more sense to use debug?
|
|
||
| const redisError = new RedisCacheError(error); | ||
| this.logger.error(redisError, `Failed to acquire lock: address=${normalizedAddress}, sessionKey=${sessionKey}`); | ||
| throw redisError; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also on Tuesday the team decided to go with Fail Open approach so we don't need to throw here but rather return null and ignore.
| } | ||
| } | ||
| } catch (error) { | ||
| const redisError = new RedisCacheError(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not wrap it into RedisCacheError so we can avoid confusion
| * @param sessionKey - The session key to remove. | ||
| * @param address - The address (for logging). | ||
| */ | ||
| private async cleanupFromQueue(queueKey: string, sessionKey: string, address: string): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we repurpose this and make it usable for both removing it after successful acquisition or fail acquisition? rPop(queueKey) and lRem(queueKey, 1, sessionKey) technically work the same, right? If it makes sense to repurpose for both successful and fail cases we can have it triggered in the finally() block of the acquireLock() method, it'll be a little cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're not quite the same and I think it's better to keep it as it is, rPop removes the last element in list (which is the first in the queue) while lRem removes the first occurrence of a specific value in the list searching from head to tail. For the success case we use rPop because we know that we're at the end of the list. For failed acquisition we use lRem because we might be anywhere in the queue. If an error occurs during the polling loop, we could be in the middle of the queue, not necessarily at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah I understand that. I meant we should use lRem for both cases, not rPop. Since sessionKeys are unique, rPop removes the last item in the queue, which is the current sessionKey, and lrem(queueKey, sessionKey, 1) would remove that same sessionKey as well.
So we can update the method to use lRem, and then call it only once in the finally block because we want to remove the sessionKey regardless of lock acquisistion status.
This makes the code cleaner while keeping the behavior exactly the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understood you now, yes that's a cleaner approach. I will apply it
| this.maxLockHoldMs = ConfigService.get('LOCK_MAX_HOLD_MS' as any) as number; | ||
| this.pollIntervalMs = ConfigService.get('LOCK_QUEUE_POLL_INTERVAL_MS' as any) as number; | ||
| this.keyPrefix = ConfigService.get('LOCK_REDIS_PREFIX' as any) as string; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add as any to these calls? ConfigService should automatically define the type based on what we put for these configs in GlobalConfig object, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Signed-off-by: Simeon Nakov <simeon.nakov@limechain.tech>
Signed-off-by: Simeon Nakov <simeon.nakov@limechain.tech>
643c24f to
11b8763
Compare
Signed-off-by: Simeon Nakov <simeon.nakov@limechain.tech>
|
your comments have been addressed @quiet-node @konstantinabl |
Signed-off-by: Simeon Nakov <simeon.nakov@limechain.tech>
quiet-node
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking better left some more items
| return new RedisLockStrategy(redisClient, logger.child({ name: 'redis-lock-strategy' })); | ||
| } | ||
|
|
||
| return new LocalLockStrategy(logger); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be updated in the next follow up PR but can we add a name for this logger?
| private readonly redisClient: RedisClientType; | ||
| private readonly logger: Logger; | ||
| private readonly maxLockHoldMs: number; | ||
| private readonly pollIntervalMs = 50; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I made a misleading comment earlier. I meant we just need the keyPrefix to be a hardcoded value, but pollIntervalMs can still be configurable, right? Also can address in next follow up PR
| return null; | ||
| } finally { | ||
| // Always remove from queue if we joined it (whether success or failure) | ||
| if (joinedQueue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non blocking but looks like we don't need joinedQueue, right? lRem(queueKey, 1, sessionKey) will remove sessionKey if it's there or it will ignore if it's not there, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case that lPush fails we would try to remove from the queue something that's guaranteed to not be there which would be an extra network call to Redis. It's a very small optimization but I suggest we keep it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah that's fair! Thanks for the explanation
| private async removeFromQueue(queueKey: string, sessionKey: string, address: string): Promise<void> { | ||
| try { | ||
| await this.redisClient.lRem(queueKey, 1, sessionKey); | ||
| this.logger.trace(`Removed from queue: address=${address}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's still add the log level guard check to make sure JS doesn't build the string if not necessary
| ); | ||
|
|
||
| if (result === 1) { | ||
| this.logger.debug(`Lock released: address=${address}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's still add the log level guard check to make sure JS doesn't build the string if not necessary
| const acquisitionDuration = Date.now() - startTime; | ||
| const queueLength = await this.redisClient.lLen(queueKey); | ||
|
|
||
| this.logger.debug( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's still add the log level guard check to make sure JS doesn't build the string if not necessary
| await this.redisClient.lRem(queueKey, 1, sessionKey); | ||
| this.logger.trace(`Removed from queue: address=${address}, sessionKey=${sessionKey}`); | ||
| } catch (error) { | ||
| this.logger.error(error, `Failed to remove from queue: address=${address}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this as warn level, not an error? I don't think it's critical enough to be an error yeah?
| } | ||
| } | ||
| } catch (error) { | ||
| this.logger.error(error, `Failed to release lock: address=${address}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this only warn, not an error? I don't think it's critical enough to be an error yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would likely be a Redis error due to a operational failure. I suggest we keep it as error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah okay that works too
| protected generateSessionKey(): string { | ||
| return randomUUID(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think randomUUID() itself is clean and compact enough, and it doesn't need a wrapper method for it. Should we just directly use it to avoid indirection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crypto.randomUUID cannot be stubbed directly in the tests and results in the following error TypeError: Descriptor for property randomUUID is non-configurable and non-writable. This wrapper was added as a workaround to address that limitation. Let me know if you have a better approach for this scenario
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see thanks
Signed-off-by: Simeon Nakov <simeon.nakov@limechain.tech>
quiet-node
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Great work!
Description
This PR introduces the class that will implement locks in Redis in order to support the new Nonce Ordering feature
Related issue(s)
Fixes #4594
Changes from original design (optional)
N/A
Additional work needed (optional)
N/A
Checklist