Skip to content

Conversation

@ammar-agent
Copy link
Collaborator

@ammar-agent ammar-agent commented Oct 11, 2025

Overview

Implements pluggable runtime abstraction layer with both LocalRuntime and SSHRuntime implementations. Includes comprehensive integration testing for file editing tools across both runtimes.

Changes

Runtime Abstraction

  • Runtime interface with core methods: exec(), readFile(), writeFile(), stat(), workspace lifecycle
  • LocalRuntime implementation using Node.js APIs (spawn, fs/promises)
  • SSHRuntime implementation using SSH connections with:
    • Remote command execution via SSH
    • File operations over SSH with proper shell escaping
    • Git clone-based workspace creation with rsync/scp fallback
    • Remote init hook execution
  • RuntimeError class for consistent error handling

File Tools Integration

  • Refactored file_read.ts, file_edit_replace.ts, file_edit_insert.ts to use runtime
  • Updated bash.ts to support both local and SSH execution
  • All file operations now work transparently across local and remote environments

SSH Runtime Bug Fixes

  • Fixed shell escaping in SSHRuntime (was using JSON.stringify causing nested quote issues)
  • Added escapeShellArg() helper using single-quote escaping for reliable shell argument handling
  • Fixed quoting issues in file operations (cat, mkdir, mv commands)

Integration Testing

  • New test suite: runtimeFileEditing.test.ts (6 tests, ~42s)
    • Matrix testing across LocalRuntime and SSHRuntime
    • Tests file_read, file_edit_replace_string, file_edit_insert via AI
    • All file operations driven through AI (no direct filesystem access)
    • Uses Haiku 4.5 for speed, toolPolicy to isolate file tools
  • Updated: workspaceInitHook.test.ts to expect workspace creation logs
    • Init events now emitted for all workspaces (with or without .cmux/init hook)
    • Tests verify workspace creation steps are logged properly

Workspace Init Events

  • Workspace creation now emits init events (git worktree creation, rsync progress, etc.)
  • Provides visibility into workspace setup even without init hooks
  • LocalRuntime logs: "Creating git worktree...", "Worktree created successfully"
  • SSHRuntime logs: "Preparing remote workspace...", "Syncing project files...", etc.

Architecture Benefits

Easy to test: Runtime implementations isolated, matrix testing verifies both paths work identically.

Easy to extend: Adding new runtime = implement interface. Tools automatically work.

Battle-tested: SSH runtime validated with real AI file operations, not just unit tests.

Testing

✅ 6 runtime file editing tests pass (LocalRuntime + SSHRuntime)
✅ 5 workspace init hook tests pass
✅ All 40+ integration tests pass
✅ Type checking passes
✅ Static checks pass

Next Steps

Foundation ready for:

  • UI for configuring runtime per workspace
  • Additional runtime implementations (Docker, etc.)
  • Extended test coverage for edge cases

Generated with cmux

@ammar-agent
Copy link
Collaborator Author

✅ Runtime Integration Tests Added

Added comprehensive integration tests for both LocalRuntime and SSHRuntime with real SSH testing using Docker.

Test Infrastructure

Docker SSH Server:

  • Alpine Linux + OpenSSH in container
  • Dynamic port allocation (-p 0:22) for concurrent test runs
  • Ephemeral SSH key generation per test run
  • Container reused across test suite (~2s startup, ~25s total)

Test Matrix Pattern:

  • All 52 tests run against both local and ssh runtimes
  • Ensures interface compliance and behavior consistency
  • Isolated temp directories per test (local or remote)

Test Coverage

Core Operations: (26 tests)

  • exec(): stdout/stderr separation, stdin, exit codes, env vars, cwd
  • readFile(): text, binary, empty files, error handling
  • writeFile(): atomic writes, overwrites, parent dir creation
  • stat(): file/dir metadata, error handling

Edge Cases: (26 tests)

  • Non-existent paths → RuntimeError
  • Directory operations (stat works, read fails)
  • Binary data (non-UTF8)
  • Large files (1MB+)
  • Concurrent operations
  • Special characters (quotes, spaces, newlines)
  • Nested/long file paths

Results

✅ All 52 tests passing (26 local + 26 SSH)
✅ Existing integration tests still pass
✅ Types checked
✅ Formatted

Test run:

TEST_INTEGRATION=1 bun x jest tests/runtime/runtime.test.ts
# ~25 seconds for all 52 tests

Implementation Notes

LocalRuntime fixes:

  • Added type annotations for ReadableStream/WritableStream handlers
  • Already creates parent dirs (via write-file-atomic)

SSHRuntime enhancements:

  • Added identityFile config option for key path
  • Added port config option for non-standard ports
  • Fixed writeFile() to create parent directories: mkdir -p $(dirname ...)
  • Suppress SSH warnings with -o LogLevel=ERROR

Key design decisions:

  • Test real SSH behavior (no mocking) for production confidence
  • Dynamic ports prevent conflicts between parallel test runs
  • Disposable workspaces ensure test isolation
  • Container reuse for speed vs isolation trade-off

Files Added

tests/runtime/
├── runtime.test.ts              # 409 lines, 52 tests
├── ssh-fixture.ts               # Docker lifecycle (278 lines)
├── test-helpers.ts              # Workspace management (176 lines)
├── ssh-server/
│   ├── Dockerfile               # Alpine + OpenSSH
│   ├── entrypoint.sh            # SSH server startup
│   └── sshd_config              # SSH config
└── README.md                    # Test documentation

Ready for review! 🚀

@ammar-agent
Copy link
Collaborator Author

Added macOS runtime integration tests to CI. The main integration suite covers Linux, but runtime code is particularly prone to system incompatibilities (process spawning, streams, file ops) so we run these tests on macOS as well. Full suite on matrix would be wasteful - this targets just the platform-sensitive code.

@ammar-agent
Copy link
Collaborator Author

Updated macOS runtime test job:

  • Now runs all of tests/runtime/ for future-proofing
  • Removed API keys (runtime tests don't use AI APIs)

@ammar-agent ammar-agent force-pushed the runtime-abstraction branch 8 times, most recently from fb9c373 to c27e082 Compare October 25, 2025 16:49
@ammar-agent ammar-agent changed the title 🤖 Add pluggable runtime abstraction layer 🤖 Add runtime abstraction + SSH runtime with file tools Oct 25, 2025
@ammario ammario changed the title 🤖 Add runtime abstraction + SSH runtime with file tools 🤖 Add basic support for SSH runtime Oct 25, 2025
@ammar-agent ammar-agent force-pushed the runtime-abstraction branch 3 times, most recently from ced92d9 to ae86f06 Compare October 25, 2025 21:01
Implements Phase 1 of pluggable runtime system with minimal Runtime interface
that allows tools to execute in different environments (local, docker, ssh).

Changes:
- Add Runtime interface with 5 core methods: exec, readFile, writeFile, stat, exists
- Implement LocalRuntime using Node.js APIs (spawn, fs/promises)
- Refactor file tools (file_read, file_edit_*) to use runtime abstraction
- Update ToolConfiguration to include runtime field
- Inject LocalRuntime in aiService and ipcMain
- Update tsconfig to ES2023 for Disposable type support
- Update all tests to inject LocalRuntime (90 tests pass)

This is a pure refactoring with zero user-facing changes. All existing
functionality remains identical. Sets foundation for Docker and SSH runtimes.

_Generated with `cmux`_
- Add SSHRuntime class implementing Runtime interface
- Add runtime configuration types (local, ssh)
- Add runtime factory to create runtime based on config
- Use native ssh2 SFTP for file operations
- Support SSH key and password authentication
- Connection pooling and automatic reconnection
- Add runtimeConfig to WorkspaceMetadata
- Update AIService to use runtime factory with workspace config
- Update all tests and ipcMain to use runtime factory
- Default to local runtime if no config specified
- Enables faster CI iteration by filtering all test types
- Add comment explaining test_filter helps keep test times low
- E2E tests use --grep for pattern matching
- Storybook tests pass filter directly to test-storybook
When running workflow_dispatch with a test filter for debugging specific
unit/integration tests, skip the expensive storybook and e2e test jobs
to keep iteration times fast.
Improves separation of concerns by delegating workspace deletion to
Runtime implementations, following the same pattern as rename.

Changes:
- Runtime.ts: Add deleteWorkspace() method to interface
- LocalRuntime.ts: Implement with git worktree remove
  - Auto-uses --force for clean worktrees with submodules
  - Respects force flag for dirty worktrees
- SSHRuntime.ts: Implement with rm -rf command
- ipcMain.ts: Refactor removeWorkspaceInternal to delegate to runtime
  - Removed direct git operations and getMainWorktreeFromWorktree
  - ~40 LoC reduction from better abstraction
- tests: Add matrix tests for both local and SSH runtimes
  - 6 new tests covering success, force-delete, and error cases
  - All 86 runtime tests passing
  - All 5 removeWorkspace IPC tests passing

Design decisions:
- Runtime computes paths from projectPath + workspaceName + srcDir
- LocalRuntime handles submodule edge case (git requires --force)
- SSHRuntime ignores force flag (rm -rf is always forceful)
- Maintains backward compatibility with existing workspaces

Net change: +150 lines (tests), -40 lines (ipcMain simplification)
- Remove unused result variables in execAsync calls
- Remove unused execAsync import from SSHRuntime
- Replace || with ?? for nullish coalescing
- Remove unused catch parameter
Restore preloadAISDKProviders() functionality to eagerly load @ai-sdk/anthropic
and @ai-sdk/openai modules before tests run concurrently. This prevents Jest's
module sandboxing from blocking dynamic imports with 'You are trying to import
a file outside of the scope of the test code' errors.

The function was previously converted to a no-op under the assumption that
'Jest concurrency has been stabilized elsewhere', but tests still failed when
run together. Preloading ensures providers are in the module cache before
concurrent tests access them via dynamic imports in createModel().

Also add note to AGENTS.md warning that tests/ipcMain takes a long time locally.
- Increase TEST_TIMEOUT_SSH_MS from 45s to 60s to account for network latency
- Add separate stream timeout constants (15s local, 25s SSH)
- Pass runtime-specific stream timeout to sendMessageAndWait in all tests
- Add preloadAISDKProviders to runtimeFileEditing tests to prevent race conditions

SSH tests perform multiple AI calls per test, and with network overhead in CI,
the previous 15s timeout was too tight. Now SSH tests get 25s per stream.
**Root cause:** config.getWorkspacePath() duplicated Runtime.getWorkspacePath(),
causing workspace paths to be computed inconsistently. This broke workspace
deletion and bash execution in tests.

**Changes:**
- **Deleted config.getWorkspacePath()** and config.getWorkspacePaths()
- Replaced all uses with Runtime.getWorkspacePath()
- Fixed IPC WORKSPACE_EXECUTE_BASH to use correct workspace path via Runtime
- Fixed bash tool to pass cwd to runtime.exec() (was ignoring it)
- Renamed deleteWorkspace.test.ts → removeWorkspace.test.ts for IPC parity
- Added waitForInitComplete() helper using init-end events (no static sleeps)

**Test improvements:**
- 7/12 workspace deletion tests passing (up from 0/12)
- All local runtime tests pass
- SSH tests have pre-existing issues unrelated to this fix

**Architecture:**
- Runtime owns all path computation logic
- Config only stores paths for legacy migration
- IPC layer computes paths via Runtime, not Config

_Generated with _
- SSHRuntime.initWorkspace now uses workspacePath for git operations
- SSHRuntime.syncProjectToRemote clones to workspace path, not srcBaseDir
- SSHRuntime.runInitHook runs hook in workspace directory
- SSHRuntime.createWorkspace creates parent of workspace path, not parent of srcBaseDir
- Enhanced waitForInitComplete to show init output on failure
- Fixed all type errors: workdir → srcBaseDir
- Updated parseRuntimeString to use correct srcBaseDir (without workspace name)
- Updated test fixtures to use correct srcBaseDir values

All 12 removeWorkspace tests passing (7 local + 5 SSH)
All 16 createWorkspace tests passing
- Update chatCommands.test.ts to expect srcBaseDir instead of workdir
- Remove workspace name from srcBaseDir in parseRuntimeString
- Add srcDir to gitService.test.ts mockConfig to fix createWorktree calls
- Check if directory exists before checking for uncommitted changes
- Return success for deleting non-existent workspace (rm -rf is no-op)
- Prevents false 'uncommitted changes' error when workspace doesn't exist
- Close exec stream stdin right after spawning
- Prevents runtime from waiting forever for input
- Fixes sendMessage.test.ts hanging after tool calls
- AIService now passes workspacePath (not srcBaseDir) to tools
- Fixes file_read/file_edit tools resolving paths from wrong directory
- Enhanced EventCollector.logEventDiagnostics() with detailed output:
  * Tool call args and results
  * Error details
  * Stream deltas and content
  * Event type counts
- assertStreamSuccess() and waitForEvent() now call logEventDiagnostics()
- Fixes sendMessage integration test that was timing out
- SSHRuntime.renameWorkspace now uses expandTildeForSSH() before mv
- expandTildeForSSH already handles quoting, so removed shescape.quote()
- Refactored renameWorkspace.test.ts to use runtime matrix pattern:
  * Tests both local and SSH runtimes
  * Extracted test helpers for code reuse
  * Added waitForInitComplete() before rename operations
- All rename tests now pass for both runtimes (4/4 tests)

Bug was: mv command received quoted paths with unexpanded tilde (~),
causing "No such file or directory" errors. Also, test was trying to
rename before init hook completed, causing directory to not exist yet.
@ammario ammario merged commit 5f200a6 into main Oct 26, 2025
15 checks passed
@ammario ammario deleted the runtime-abstraction branch October 26, 2025 17:38
ammar-agent added a commit that referenced this pull request Oct 27, 2025
After the SSH runtime commit (#178), testing infrastructure had significant
duplication across helper modules and inline functions.

Changes:
- Consolidated shared utilities into tests/ipcMain/helpers.ts
  - Added extractTextFromEvents(), sendMessageAndWait(), createWorkspaceWithInit()
  - Centralized test constants (INIT_HOOK_WAIT_MS, SSH_INIT_WAIT_MS, HAIKU_MODEL, etc.)
- Deleted tests/ipcMain/test-helpers/runtimeTestHelpers.ts (149 lines)
- Removed inline helper duplicates from test files
  - runtimeFileEditing.test.ts: -120 lines
  - removeWorkspace.test.ts: -39 lines
  - renameWorkspace.test.ts: -45 lines
  - runtimeExecuteBash.test.ts: simplified imports

Result: 240 net lines removed, improved consistency across integration tests.

All tests use consistent patterns from single source of truth.
ammar-agent added a commit that referenced this pull request Oct 27, 2025
After the SSH runtime commit (#178), testing infrastructure had significant
duplication across helper modules and inline functions.

Changes:
- Consolidated shared utilities into tests/ipcMain/helpers.ts
  - Added extractTextFromEvents(), sendMessageAndWait(), createWorkspaceWithInit()
  - Centralized test constants (INIT_HOOK_WAIT_MS, SSH_INIT_WAIT_MS, HAIKU_MODEL, etc.)
- Deleted tests/ipcMain/test-helpers/runtimeTestHelpers.ts (149 lines)
- Removed inline helper duplicates from test files
  - runtimeFileEditing.test.ts: -120 lines
  - removeWorkspace.test.ts: -39 lines
  - renameWorkspace.test.ts: -45 lines
  - runtimeExecuteBash.test.ts: simplified imports

Result: 240 net lines removed, improved consistency across integration tests.

All tests use consistent patterns from single source of truth.
ammar-agent added a commit that referenced this pull request Oct 27, 2025
After the SSH runtime commit (#178), testing infrastructure had significant
duplication across helper modules and inline functions.

Changes:
- Consolidated shared utilities into tests/ipcMain/helpers.ts
  - Added extractTextFromEvents(), sendMessageAndWait(), createWorkspaceWithInit()
  - Centralized test constants (INIT_HOOK_WAIT_MS, SSH_INIT_WAIT_MS, HAIKU_MODEL, etc.)
- Deleted tests/ipcMain/test-helpers/runtimeTestHelpers.ts (149 lines)
- Removed inline helper duplicates from test files
  - runtimeFileEditing.test.ts: -120 lines
  - removeWorkspace.test.ts: -39 lines
  - renameWorkspace.test.ts: -45 lines
  - runtimeExecuteBash.test.ts: simplified imports

Result: 240 net lines removed, improved consistency across integration tests.

All tests use consistent patterns from single source of truth.
github-merge-queue bot pushed a commit that referenced this pull request Oct 27, 2025
After the SSH runtime commit (#178, commit 5f200a6), there was
significant duplication in test infrastructure and repeated
copy-to-clipboard patterns across components.

## Test Infrastructure Consolidation

Consolidated 3 overlapping test helper modules and inline duplicates
into `tests/ipcMain/helpers.ts`:

- Added `extractTextFromEvents()`, `sendMessageAndWait()`,
`createWorkspaceWithInit()`
- Centralized test constants (INIT_HOOK_WAIT_MS, SSH_INIT_WAIT_MS,
HAIKU_MODEL, timeouts)
- Updated 4 test files to use consolidated helpers
- Deleted `tests/ipcMain/test-helpers/runtimeTestHelpers.ts` (149 lines)

**Result:** -240 lines, single source of truth for ipcMain integration
test utilities.

## Copy-to-Clipboard Deduplication

The copy feedback pattern (`copied` state + 2000ms timeout) was
duplicated across 4 components with identical implementations.

- Created `src/hooks/useCopyToClipboard.ts` for reusable copy
functionality
- Added `COPY_FEEDBACK_DURATION_MS` constant to `src/constants/ui.ts`
- Updated AssistantMessage, UserMessage, ProposePlanToolCall,
FileEditToolCall

**Result:** 54 lines of duplicated code replaced with single 33-line
hook.

_Generated with `cmux`_
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants