Feat: sandbox and sub-agents stability updates #253

AshishKumar4 · 2025-11-25T03:58:59Z

Summary

This PR improves stability of the sandbox integration and sub-agent operations by upgrading the sandbox SDK version, adding null-check guards for LLM inference results, and refining agent prompts for better code generation quality.

Changes

Sandbox Updates

SandboxDockerfile, package.json, bun.lock: Upgraded @cloudflare/sandbox from 0.4.14 to 0.5.2
worker/services/sandbox/sandboxTypes.ts: Added disabled field to TemplateInfo and TemplateDetails schemas to support template filtering
worker/services/sandbox/BaseSandboxService.ts: Implemented disabled field population in template listing and loading
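
The disabled handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in BaseSandboxService.ts: the RawTemplate and TemplateInfo shapes are simplified stand-ins for the real schemas, and the helper names and template names are hypothetical. It only covers the disabled flag, not the "minimal" template exclusion.

```typescript
// Hypothetical, simplified shapes; the real schemas live in sandboxTypes.ts.
interface RawTemplate {
    name: string;
    description: string;
    disabled?: boolean; // may be absent in older catalog entries
}

interface TemplateInfo {
    name: string;
    description: string;
    disabled: boolean; // always present after normalization
}

// Normalize a raw catalog entry, defaulting a missing flag to false.
function toTemplateInfo(t: RawTemplate): TemplateInfo {
    return {
        name: t.name,
        description: t.description,
        disabled: t.disabled ?? false,
    };
}

// Keep only templates that may be offered for selection.
function selectableTemplates(templates: TemplateInfo[]): TemplateInfo[] {
    return templates.filter((t) => !t.disabled);
}

// Example catalog with made-up template names.
const catalog: RawTemplate[] = [
    { name: 'react-app', description: 'React starter' },
    { name: 'experimental', description: 'Work in progress', disabled: true },
];
const selectable = selectableTemplates(catalog.map(toTemplateInfo));
console.log(selectable.map((t) => t.name)); // logs only the enabled template names
```

Defaulting with ?? rather than || keeps an explicit false intact and only fills in the value when the field is null or undefined.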

Agent Stability Improvements

worker/agents/assistants/projectsetup.ts: Added null result validation after inference calls to prevent silent failures
worker/agents/operations/PhaseGeneration.ts: Added null-check guard and updated phase planning prompt to prioritize runtime error fixing
worker/agents/operations/PostPhaseCodeFixer.ts: Added null result validation for code fixer operation
worker/agents/operations/UserConversationProcessor.ts: Added null-check guards for conversation processing and summarization
worker/agents/planning/blueprint.ts: Added null result validation and emphasized proportional blueprint complexity
worker/agents/planning/templateSelector.ts: Added null-check validation and filtered out disabled/minimal templates from selection
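
A null-check guard of the kind listed above might look like the sketch below. The PR adds its checks inline in each operation; the requireResult helper, the InferenceError class, and the operation names used here are assumptions made for illustration only.

```typescript
// Hypothetical helper; the PR's actual guards are written inline per operation.
class InferenceError extends Error {
    constructor(operation: string) {
        super(`Inference returned no result for operation: ${operation}`);
        this.name = 'InferenceError';
    }
}

// Return the result unchanged, or throw a descriptive error when it is missing.
function requireResult<T>(result: T | null | undefined, operation: string): T {
    if (result === null || result === undefined) {
        throw new InferenceError(operation);
    }
    return result;
}

// A present result passes through with its type narrowed to non-null.
const blueprint = requireResult({ title: 'Todo app' }, 'blueprint');
console.log(blueprint.title);

// A null result surfaces as an explicit error instead of failing silently.
let guardMessage = '';
try {
    requireResult(null, 'phaseGeneration');
} catch (err) {
    guardMessage = (err as Error).message;
    console.error(guardMessage);
}
```

Checking for both null and undefined (rather than general falsiness) avoids rejecting legitimate results such as an empty string or 0.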

Code Generation Quality

worker/agents/core/simpleGeneratorAgent.ts:
- Simplified user model config loading (removed unnecessary filtering)
- Added static analysis cache invalidation after deep debugger fixes
- Added sandbox deployment after file generation for debugging consistency
worker/agents/assistants/codeDebugger.ts: Switched from full to lite React render loop prevention prompt
worker/agents/prompts.ts:
- Added worker/core-utils.ts to the "DO NOT TOUCH" files list
- Added Cloudflare Workers/Durable Objects install warning to common pitfalls
- Enhanced render loop detection to include "infinite loop" pattern

Motivation

This PR addresses several stability issues encountered in production:

Null inference results: Multiple operations could fail silently when LLM inference returned null (due to rate limits, network issues, or provider errors). Added explicit null checks with error messages to surface these failures clearly.
Sandbox compatibility: Upgraded to sandbox SDK 0.5.2 to leverage latest stability fixes and features.
Template management: Added disabled field to prevent broken or experimental templates from being selected by users.
Cache consistency: Deep debugger fixes weren't invalidating static analysis cache, causing stale error reports.
Deployment consistency: Files generated by debugger weren't being deployed to sandbox immediately, requiring manual refresh.
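
The cache consistency problem can be illustrated with a small sketch. Everything in it is hypothetical (the StaticAnalysisCache class, the file map, and the stand-in "analysis"); it only shows why a cached report goes stale after a fix unless the cache is invalidated.

```typescript
// All names here are hypothetical; this is not the agent's actual cache.
type AnalysisReport = { issues: string[] };

class StaticAnalysisCache {
    private cached: AnalysisReport | null = null;
    private readonly analyze: () => AnalysisReport;

    constructor(analyze: () => AnalysisReport) {
        this.analyze = analyze;
    }

    // Return the cached report, running the analysis only on a cache miss.
    get(): AnalysisReport {
        if (this.cached === null) {
            this.cached = this.analyze();
        }
        return this.cached;
    }

    // Drop the cached report so the next get() re-analyzes the current files.
    invalidate(): void {
        this.cached = null;
    }
}

// Stand-in for generated files: a path maps to a "broken" or "fixed" marker.
const files: Record<string, string> = { 'src/App.tsx': 'broken' };
const analysisCache = new StaticAnalysisCache(() => ({
    issues: Object.keys(files).filter((path) => files[path] === 'broken'),
}));

const before = analysisCache.get().issues.length; // analysis runs, one issue found
files['src/App.tsx'] = 'fixed';                    // the debugger applies a fix
const stale = analysisCache.get().issues.length;  // cached report still lists the issue
analysisCache.invalidate();                        // the step this PR adds after fixes
const fresh = analysisCache.get().issues.length;  // re-analysis sees the fixed file
console.log({ before, stale, fresh });
```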

Testing

To verify these changes:

Test agent operations under rate-limited conditions (null inference results should throw clear errors)
Verify template selection excludes disabled and minimal templates
Test deep debugger flow: confirm fixes are deployed immediately and cache is invalidated
Verify project setup handles missing dependencies gracefully
Check that phase generation prioritizes runtime errors correctly

Related Issues

This PR addresses stability concerns raised in:

Issue #216 (Deep_Debug Fix Reverts Itself and Breaks Functionality in a Loop) - cache invalidation
Issue #246 (Debugger won't stop and reports completion incorrectly) - improved error handling
Issue #201 (Dependencies are imported but could not be resolved) - better dependency warnings

This PR description was automatically generated by Claude Code.

github-actions · 2025-11-25T03:59:16Z

Claude encountered an error

Code Review in Progress

Tasks:

Get PR diff and understand changes
Check for related issues
Generate and update PR description
Review code quality (bugs, types, architecture, performance)
Review security (injections, auth, data handling)
Post inline comments for critical issues
Post final review summary

PR Description Updated:
Added comprehensive description with summary of changes, motivation, testing guidance, and links to related issues (#195, #246, #216, #201).

Review Progress:
Conducting detailed code quality and security analysis...

Add null checks after inference calls in blueprint, template selection, phase generation, user conversation processing, project setup, and code fixer operations. Switch from REACT_RENDER_LOOP_PREVENTION to REACT_RENDER_LOOP_PREVENTION_LITE in code debugger and phase generation. Add explicit warning against installing cloudflare:workers/durable-objects dependencies in multiple prompts. Update phase generation to prioritize runtime

github-actions · 2025-11-25T03:59:57Z

Claude encountered an error

Code Quality & Security Review

Recommendation: APPROVE with one minor suggestion

Code Quality

Medium Priority Issues:

Type safety concern in userModelConfigs assignment - The removal of filtering logic may introduce type inconsistencies. Please verify the type definitions align.

Positive Changes:

Excellent addition of null-check guards across all inference operations (projectsetup.ts, PhaseGeneration.ts, PostPhaseCodeFixer.ts, UserConversationProcessor.ts, blueprint.ts, templateSelector.ts) - prevents silent failures
Cache invalidation after deep debugger fixes (simpleGeneratorAgent.ts:1698) - addresses issue #216 (Deep_Debug Fix Reverts Itself and Breaks Functionality in a Loop)
Immediate sandbox deployment after file generation (simpleGeneratorAgent.ts:1763) - improves debugging consistency
Template filtering improvements with disabled field - better template management
Enhanced prompt guidance for runtime error prioritization and Cloudflare Workers warnings

Security

No vulnerabilities found

This PR focuses on stability improvements in the agent/sandbox layer. All changes are internal to the worker runtime:

No user input validation changes
No authentication/authorization changes
No SQL/injection risks introduced
Sandbox version upgrade (0.4.14 → 0.5.2) follows standard dependency management
All error messages appropriately sanitize sensitive information

Summary

This is a solid stability improvement PR that addresses real production issues. The null-check guards are essential for surfacing LLM inference failures clearly rather than failing silently. The cache invalidation and deployment consistency fixes directly address reported issues #216 and #246.

The only concern is the simplified userModelConfigs assignment which may need type verification, but this is a minor issue that can be validated in testing.

Approval recommendation: The changes improve error handling, fix known bugs, and maintain code quality standards. The type safety concern is worth checking but doesn't block approval.

Tasks:

Get PR diff and understand changes
Check for related issues
Generate and update PR description
Review code quality (bugs, types, architecture, performance)
Review security (injections, auth, data handling)
Post inline comments for critical issues
Post final review summary

github-actions · 2025-11-25T04:02:21Z

worker/services/sandbox/BaseSandboxService.ts

+                    disabled: t.disabled ?? false,
                    projectType: t.projectType || 'app',
                    renderMode: t.renderMode,
                    slideDirectory: t.slideDirectory,


Critical Bug: Missing disabled field in template mapping

The TemplateInfoSchema in sandboxTypes.ts (line 116) requires a disabled field:

disabled: z.boolean(),

However, this mapping on lines 90-98 does not include the disabled field. This will cause Zod validation failures when the response is validated against TemplateInfoSchema.

Fix:

Suggested change

-                    slideDirectory: t.slideDirectory,
+                    name: t.name,
+                    language: t.language,
+                    frameworks: t.frameworks || [],
+                    description: t.description,
+                    disabled: t.disabled ?? false,
+                    projectType: t.projectType || 'app',
+                    renderMode: t.renderMode,
+                    slideDirectory: t.slideDirectory,

Note: The PR correctly adds disabled at line 95 of the diff (in the template details section), but the mapping at lines 90-98 was missed.

github-actions · 2025-11-25T04:02:48Z

worker/agents/core/simpleGeneratorAgent.ts

                ...this.state.inferenceContext,
-                userModelConfigs,
+                userModelConfigs: userConfigsRecord,
            },


Medium Priority - Type Safety: The removed filtering logic appears to have served a purpose in extracting clean ModelConfig objects by removing metadata fields (isUserOverride, userConfigId).

The new code directly assigns userConfigsRecord (which contains these extra fields) to userModelConfigs. This could cause issues if:

The InferenceContext.userModelConfigs type expects Record<string, ModelConfig> without metadata fields

Downstream code doesn't expect these additional fields

Recommendation: Verify that the type definition for InferenceContext.userModelConfigs accepts the full record structure returned by getUserModelConfigs(), or restore the filtering logic to maintain type safety.

github-actions · 2025-11-25T04:03:04Z

worker/agents/core/simpleGeneratorAgent.ts

            inferenceContext: {
                ...this.state.inferenceContext,
-                userModelConfigs,
+                userModelConfigs: userConfigsRecord,


Type Mismatch: UserModelConfigWithMetadata vs ModelConfig

The getUserModelConfigs() method returns Record<AgentActionKey, UserModelConfigWithMetadata>, which includes extra metadata fields (isUserOverride, userConfigId).

However, InferenceContext.userModelConfigs is typed as Record<AgentActionKey, ModelConfig> (see worker/agents/inferutils/config.types.ts:359).

The previous code (commit 7b90c82) explicitly filtered out these metadata fields:

    const userModelConfigs: Record<string, ModelConfig> = {};
    for (const [actionKey, mergedConfig] of Object.entries(userConfigsRecord)) {
        if (mergedConfig.isUserOverride) {
            const { isUserOverride, userConfigId, ...modelConfig } = mergedConfig;
            userModelConfigs[actionKey] = modelConfig;
        }
    }

Issue: While TypeScript will allow this assignment (since UserModelConfigWithMetadata extends ModelConfig), downstream code may not expect these extra fields. The old code also only included configs where isUserOverride === true, but the new code includes all configs.

Recommendation: Either:

Keep the old filtering logic if the isUserOverride check was intentional, or

Update the InferenceContext type to accept UserModelConfigWithMetadata if the extra fields are acceptable

Clarification needed: Was the isUserOverride filter intentional, or should all configs be included regardless of override status?
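
For reference, the first option (keeping the filter) could be packaged as a small function along these lines. The ModelConfig and UserModelConfigWithMetadata interfaces below are simplified assumptions, with made-up fields such as name and temperature; the real definitions are in worker/agents/inferutils/config.types.ts and may differ. The extractUserOverrides name is also hypothetical.

```typescript
// Hypothetical, simplified types; field names other than the metadata are made up.
interface ModelConfig {
    name: string;
    temperature?: number;
}

interface UserModelConfigWithMetadata extends ModelConfig {
    isUserOverride: boolean;
    userConfigId?: string;
}

// Keep only user overrides and strip the metadata fields from each entry.
function extractUserOverrides(
    record: Record<string, UserModelConfigWithMetadata>,
): Record<string, ModelConfig> {
    const overrides: Record<string, ModelConfig> = {};
    for (const actionKey of Object.keys(record)) {
        // Object rest collects every field except the two metadata fields.
        const { isUserOverride, userConfigId, ...modelConfig } = record[actionKey];
        if (isUserOverride) {
            overrides[actionKey] = modelConfig;
        }
    }
    return overrides;
}

// Example record with made-up action keys and model names.
const userConfigsExample: Record<string, UserModelConfigWithMetadata> = {
    blueprint: { name: 'model-a', temperature: 0.2, isUserOverride: true, userConfigId: 'cfg-1' },
    phaseGeneration: { name: 'model-b', isUserOverride: false },
};
const overridesOnly = extractUserOverrides(userConfigsExample);
console.log(Object.keys(overridesOnly)); // only the overridden action keys remain
```

Because UserModelConfigWithMetadata extends ModelConfig, assigning the unfiltered record type-checks; the filter matters only for runtime behavior, which is the point raised in the review comment.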

feat: upgraded to sandbox sdk version 0.5.2

9f9cc7a

AshishKumar4 marked this pull request as ready for review November 25, 2025 03:59

github-actions bot added the dependencies, worker, and agent labels Nov 25, 2025

github-actions bot added the size/M label Nov 25, 2025

AshishKumar4 force-pushed the feat/stability branch from 744b687 to bf20035 November 25, 2025 03:59

github-actions bot reviewed Nov 25, 2025
