-
Notifications
You must be signed in to change notification settings - Fork 136
fix(runtime): evict runners on SIGTERM, add periodic shutdown progress messages #3550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
More templates
@rivetkit/cloudflare-workers
@rivetkit/db
@rivetkit/framework-base
@rivetkit/next-js
@rivetkit/react
rivetkit
@rivetkit/sql-loader
@rivetkit/engine-runner
@rivetkit/engine-runner-protocol
commit: |
Code ReviewThank you for this PR! This adds important shutdown observability and graceful termination handling. Here is my feedback: Strengths
Issues and Suggestions1. Missing SIGTERM handling in tunnel_to_ws_task.rsThe ws_to_tunnel_task.rs now handles SIGTERM and sends the GoingAway error, but tunnel_to_ws_task.rs does not have the same logic. This creates an asymmetry in how the two directions handle graceful shutdown. Suggestion: Add the same SIGTERM handling to tunnel_to_ws_task.rs for consistency. The runner should evict gracefully from both directions. 2. Missing newline in JSON error fileThe file engine/artifacts/errors/ws.going_away.json is missing a trailing newline, which violates common conventions and may cause issues with some tools. Suggestion: Add a newline at the end of the JSON file. 3. Memory ordering is correctYou are using Ordering::Release for the store and Ordering::Acquire for the load on hyper_shutdown. This is correct for ensuring visibility of the write to readers. Good job! 4. Consider extracting the shutdown interval constantBoth gasoline/worker.rs and guard-core/server.rs define SHUTDOWN_PROGRESS_INTERVAL as Duration::from_secs(7). Suggestion: If you expect this to be consistent across services, consider extracting it to a shared configuration or runtime module. Test CoverageI did not see any tests added for the new shutdown progress logging, the GoingAway error path in websocket handling, or the remaining_tasks method. Suggestion: Consider adding integration tests that verify SIGTERM triggers graceful shutdown with progress messages and WebSocket connections receive the proper GoingAway close code. Minor Observations
Priority Fixes
Overall, this is a solid improvement to shutdown observability! The main concern is the missing SIGTERM handling in the tunnel-to-ws direction. Review generated with Claude Code |
5d5591a to
ef06c0e
Compare
PR Review: fix(runtime): evict runners on SIGTERM, add periodic shutdown progress messagesThis PR improves graceful shutdown behavior by evicting runners when receiving SIGTERM signals and adding periodic progress logging during shutdown. Overall, the implementation is solid and follows good practices. Here are my detailed findings: ✅ Strengths
🔍 Code Quality Observationsguard-core/src/server.rs (lines 255-311)Good:
Minor suggestion:
gasoline/src/worker.rs (lines 301-314)Good:
Observation:
pegboard-runner/src/ws_to_tunnel_task.rs (lines 57-106)Good:
Note:
🐛 Potential Issues1. Memory Ordering Consistency (Minor)In
This is correct, but for documentation: Release/Acquire pairing ensures proper synchronization. Good job here. 2. Task Group Counting (Observation)In
Suggestion: Add a comment in 3. Progress Interval Initialization PatternAll three files use: let mut progress_interval = tokio::time::interval(SHUTDOWN_PROGRESS_INTERVAL);
progress_interval.tick().await;The immediate 🔒 Security ConsiderationsNo security concerns identified. The changes are focused on observability and graceful shutdown, which actually improve reliability. ⚡ Performance Considerations
🧪 Testing RecommendationsConsider adding tests or manual verification for:
📝 Minor Style/Convention NotesPer CLAUDE.md conventions:
🎯 Suggestions for ImprovementOptional Enhancement 1: Shutdown MetricsConsider emitting metrics for:
Example in let shutdown_elapsed = shutdown_start.elapsed();
metrics::SHUTDOWN_DURATION.record(shutdown_elapsed.as_secs_f64(), &[]);Optional Enhancement 2: Configurable Progress IntervalThe 7-second interval is hardcoded. Consider making it configurable in ✨ SummaryThis is a well-implemented PR that improves observability during shutdown. The code quality is high, follows repository conventions, and adds valuable operational visibility. The main suggestions are minor enhancements rather than fixes for bugs. Recommendation: ✅ Approve with minor suggestions Great work on improving the graceful shutdown experience! |
ef06c0e to
f5a5102
Compare
f5a5102 to
5439e53
Compare
Merge activity
|

No description provided.