Skip to content

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

@vercel
Copy link

vercel bot commented Dec 3, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
rivetkit-serverless Ready Ready Preview Comment Dec 4, 2025 8:54pm
3 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Dec 4, 2025 8:54pm
rivet-inspector Ignored Ignored Preview Dec 4, 2025 8:54pm
rivet-site Ignored Ignored Preview Dec 4, 2025 8:54pm

@vercel vercel bot temporarily deployed to Preview – rivetkit-serverless December 3, 2025 01:58 Inactive
Copy link
Contributor Author

MasterPtato commented Dec 3, 2025


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Dec 3, 2025

PR Review: Fix Runner Config Creation

Overview

This PR addresses a critical issue in runner config creation and deletion by adding graceful handling for missing runner pool workflows and implementing a backfill mechanism.


✅ Strengths

  1. Proper Error Handling: The addition of .graceful_not_found() prevents crashes when the runner pool workflow doesn't exist yet, which is essential for robustness.

  2. Smart Backfill Logic: In upsert.rs:182-192, when a signal fails to find the workflow, it automatically creates one. This ensures the system self-heals and maintains consistency.

  3. Good Logging: The debug log at delete.rs:57 provides useful information for troubleshooting without being too noisy.

  4. Log Level Adjustment: Changing from debug to trace for GC logs in chunking.rs:106 is appropriate since this is routine housekeeping noise.


🔍 Code Quality

Follows Repository Conventions: ✅

  • Structured logging with tracing: namespace_id=?input.namespace_id, name=%input.name
  • Lowercase log messages: "no runner pool workflow to bump"
  • Proper use of Result types and error handling

Code Patterns: ✅

  • Consistent with existing workflow signal patterns
  • Proper use of tags for workflow dispatch
  • Uses .unique() to prevent duplicate workflows

🤔 Considerations & Questions

1. Race Condition Potential

In upsert.rs:172-192, there's a potential race condition:

let res = ctx.signal(/* ... */).graceful_not_found().send().await?;

// Backfill
if res.is_none() {
    ctx.workflow(/* ... */).unique().dispatch().await?;
}

Question: Between checking res.is_none() and dispatching the workflow, could another concurrent upsert create the workflow, potentially causing duplicate workflows? The .unique() should handle this, but worth verifying the uniqueness key includes both namespace_id and runner_name.

Recommendation: Verify in workflows/runner_pool.rs that the workflow's unique key is based on both namespace and runner name to prevent race conditions.

2. Asymmetry Between Delete and Upsert

  • Delete (delete.rs:46-59): Only logs when workflow not found
  • Upsert (upsert.rs:171-193): Creates workflow when not found

Question: Should delete also attempt to clean up or verify workflow state? Or is the current asymmetry intentional because deletion doesn't require an active workflow?

This seems reasonable on second thought - deletion is a terminal operation, so logging is sufficient.

3. Error vs None Semantics

graceful_not_found() returns None for not-found workflows. Consider:

  • What if the signal fails for other reasons (network, permissions, etc.)?
  • Does .send().await? propagate those errors correctly while only converting not-found to None?

This is likely correct, but worth confirming the error handling semantics of graceful_not_found().


🧪 Testing Recommendations

  1. Test workflow backfill: Verify that upserting a config when no pool workflow exists creates the workflow correctly
  2. Test concurrent upserts: Ensure .unique() prevents duplicate workflows when multiple upserts race
  3. Test delete with missing workflow: Confirm deletion succeeds gracefully when workflow doesn't exist
  4. Test config transitions: Verify Normal → Serverless config changes trigger proper workflow creation

🐛 Potential Issues

None identified - The changes appear sound and follow the repository's patterns correctly.


📊 Security & Performance

  • Security: ✅ No concerns
  • Performance: ✅ Minimal impact - adds one conditional check and potential workflow dispatch
  • Resource Management: ✅ The backfill pattern prevents orphaned configs without workflows

📝 Minor Suggestions

  1. Consider adding a comment explaining the backfill logic at upsert.rs:181-182:
// Backfill: If the workflow doesn't exist (e.g., after system restart or migration),
// create it to ensure the pool is managed
if res.is_none() {
  1. Log consistency: Consider using the same log level (debug) for both delete and upsert cases for easier grepping:
    • delete.rs:57: tracing::debug!(...)
    • Could add similar log in upsert.rs when backfilling

✨ Summary

Overall Assessment: ✅ Approve

This is a solid fix that adds necessary resilience to the runner config system. The backfill mechanism is a smart solution that makes the system self-healing. The code follows repository conventions and integrates well with the existing workflow patterns.

Key Improvements:

  • Graceful handling of missing workflows
  • Self-healing backfill mechanism
  • Better observability with appropriate logging
  • Reduced log noise for routine operations

Recommendation: Approve and merge after confirming the workflow uniqueness semantics handle the potential race condition correctly.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Dec 3, 2025

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3580

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3580

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3580

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3580

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3580

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3580

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3580

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3580

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3580

commit: a7e1a7c

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants