Commit 1b89b95

feat: add e2e test harness and accessibility polish
Introduce Playwright-driven Electron e2e runner with mock AI scenario to capture deterministic transcripts and videos for review flows. Seed test config via CMUX_TEST_ROOT, adjust Config to honor it, and teach main process to boot against the dev server during e2e runs. Stub mermaid in Vite to keep captures predictable and expand tsconfig coverage for new test sources. Harden Bash and StreamManager tests for CI timing variance. Improve keyboard and screen-reader access across chat transcript, command palette, slider, sidebar, and toast UI.

Signed-off-by: Thomas Kosiewski <tk@coder.com>

fix: align git status parser with ref order

Change-Id: I7afe82dc1d8eb76f0c78bd6c60571a7460dfcf55
Signed-off-by: Thomas Kosiewski <tk@coder.com>

fix: emit abort event and cancel existing mock streams

- Emit stream-abort event when stopping mock streams to mirror real streaming behavior
- Cancel any existing stream before starting a new one to prevent timer corruption
- Addresses Codex review comments about mock stream lifecycle management

Change-Id: I0a95493aa15dc0eda71a5c31b09b3844eeaf8187
Signed-off-by: Thomas Kosiewski <tk@coder.com>

fix: persist mock stream messages to history on completion

- Call HistoryService.updateHistory when stream-end fires in mock mode
- Mirrors real StreamManager behavior to persist completed messages
- Prevents empty assistant entries in chat.jsonl after mock runs
- Addresses Codex review comment about mock persistence

Change-Id: I4f4cd7fe27355a561cd4e2193a4045424918dd1e
Signed-off-by: Thomas Kosiewski <tk@coder.com>
1 parent 5bff20c commit 1b89b95
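
The mock stream lifecycle fixes in the commit message (cancel any existing stream before starting a new one, emit `stream-abort` on stop, persist on `stream-end`) can be illustrated with a minimal sketch. This is not the repository's implementation: `MockStreamSketch`, its event shapes, and the `persist` callback (standing in for a call such as `HistoryService.updateHistory`) are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch; names and event shapes are assumptions, not the repo's mock code.
type MockEvent =
  | { type: "stream-start"; messageId: string }
  | { type: "stream-delta"; messageId: string; delta: string }
  | { type: "stream-end"; messageId: string }
  | { type: "stream-abort"; messageId: string };

type Listener = (event: MockEvent) => void;
type Persist = (messageId: string, text: string) => void;

class MockStreamSketch {
  private timers: Array<ReturnType<typeof setTimeout>> = [];
  private activeMessageId: string | null = null;
  private readonly emit: Listener;
  private readonly persist: Persist;

  constructor(emit: Listener, persist: Persist) {
    this.emit = emit;
    this.persist = persist;
  }

  start(messageId: string, chunks: string[], stepMs = 10): void {
    // Cancel any in-flight stream first so its timers cannot fire into the new one.
    this.stop();
    this.activeMessageId = messageId;
    this.emit({ type: "stream-start", messageId });

    // Schedule one delta per chunk at increasing delays.
    chunks.forEach((delta, index) => {
      this.timers.push(
        setTimeout(() => this.emit({ type: "stream-delta", messageId, delta }), stepMs * (index + 1))
      );
    });

    // After the last delta: persist the completed message, then announce the end.
    this.timers.push(
      setTimeout(() => {
        this.persist(messageId, chunks.join(""));
        this.emit({ type: "stream-end", messageId });
        this.timers = [];
        this.activeMessageId = null;
      }, stepMs * (chunks.length + 1))
    );
  }

  stop(): void {
    if (this.activeMessageId === null) return; // nothing is streaming
    for (const timer of this.timers) clearTimeout(timer);
    this.timers = [];
    const messageId = this.activeMessageId;
    this.activeMessageId = null;
    // Mirror the real streaming path by announcing the abort.
    this.emit({ type: "stream-abort", messageId });
  }
}
```

Calling `start` twice in a row makes the first stream emit `stream-abort` before the second emits `stream-start`, so a listener never sees deltas from two streams interleaved.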

38 files changed

+2094
-416
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -79,3 +79,5 @@ profile.txt
7979
src/version.ts
8080
OPENAI_FIX_SUMMARY.md
8181
docs/vercel/
82+
artifacts/
83+
tests/e2e/tmp/

Makefile

Lines changed: 4 additions & 1 deletion
@@ -21,7 +21,7 @@
2121
.PHONY: all build dev start clean help
2222
.PHONY: build-main build-preload build-renderer
2323
.PHONY: lint lint-fix fmt fmt-check fmt-shell fmt-shell-check typecheck static-check
24-
.PHONY: test test-unit test-integration test-watch test-coverage
24+
.PHONY: test test-unit test-integration test-watch test-coverage test-e2e
2525
.PHONY: dist dist-mac dist-win dist-linux
2626
.PHONY: docs docs-build docs-watch
2727

@@ -110,6 +110,9 @@ test-watch: ## Run tests in watch mode
110110
test-coverage: ## Run tests with coverage
111111
@./scripts/test.sh --coverage
112112

113+
test-e2e: ## Run end-to-end tests
114+
@PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 bun x playwright test --project=electron
115+
113116
## Distribution
114117
dist: build ## Build distributable packages
115118
@bun x electron-builder --publish never

bun.lock

Lines changed: 112 additions & 162 deletions
Large diffs are not rendered by default.

bunfig.toml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
[test]
2+
root = "src"
3+
match = [
4+
"**/*.test.ts",
5+
"**/*.test.tsx",
6+
"**/*.spec.ts",
7+
"**/*.spec.tsx"
8+
]

docs/e2e/mock-transcript.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
# Mock Transcript: Cmux End-to-End Demo
2+
3+
This scripted flow is tailored for automated Playwright captures. It walks through the highest-impact UI affordances so reviewers can skim a single recording and understand the app’s behavior.
4+
5+
## Goals
6+
7+
- Exercise key surfaces: project sidebar, workspace modal, chat surface (streaming + reasoning + tool call), chat meta sidebar, plan/exec toggle, thinking slider, error banner, edit flow, and history truncation.
8+
- Keep every interaction deterministic so the mock AI backend can replay the same transcript on every run.
9+
- Ensure videos remain under ~90 seconds by limiting idle time while still letting animations complete.
10+
11+
## Environment Prep
12+
13+
1. Launch Cmux with `CMUX_MOCK_AI=1` so the main process swaps in the scripted responder.
14+
2. Pre-seed `~/.cmux/config.json` with a single project `demo-repo` and worktree `feature/login`. (The Playwright harness will handle this setup.)
15+
3. Start the app with the project pre-selected so the chat surface is immediately visible.
16+
4. Install Playwright’s bundled ffmpeg runtime via `bunx playwright install ffmpeg` to ensure Electron video capture works reliably.
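
The environment prep above can be sketched in TypeScript. Everything here is an assumption made for illustration: that `CMUX_TEST_ROOT` replaces the `~/.cmux` directory, the `SeedConfig` shape, and the helper names `resolveConfigRoot` and `buildE2eEnv`. The repository's actual `Config` class and harness may differ.

```typescript
// Hypothetical sketch; the config shape and helper names are assumptions.
interface SeedConfig {
  projects: Array<{ name: string; worktrees: string[] }>;
}

type Env = Record<string, string | undefined>;

// Pick the config directory: the CMUX_TEST_ROOT override wins when set and non-empty.
// A real implementation would likely join paths with node:path instead of a template string.
function resolveConfigRoot(env: Env, homeDir: string): string {
  const override = env["CMUX_TEST_ROOT"];
  return override !== undefined && override.length > 0 ? override : `${homeDir}/.cmux`;
}

// Build the environment for an e2e launch: mock AI on, config isolated under testRoot.
function buildE2eEnv(base: Env, testRoot: string): Env {
  return { ...base, CMUX_MOCK_AI: "1", CMUX_TEST_ROOT: testRoot };
}

// Seed content matching step 2: one project with a single worktree.
const seedConfig: SeedConfig = {
  projects: [{ name: "demo-repo", worktrees: ["feature/login"] }],
};
```

A harness could serialize `seedConfig` to `config.json` under the resolved root before launching the app, which keeps test runs from touching the developer's real `~/.cmux` directory.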
17+
18+
## High-Level Timeline
19+
20+
| Step | UI Action | Transcript Snippet | Feature Coverage |
21+
| ---- | ---------------------------------------------------------------------------------------------------- | -------------------------- | --------------------------------------------------------------- |
22+
| 0 | Hover "Tips" carousel briefly before interacting | n/a | Carousel animation baseline |
23+
| 1 | Open project sidebar menu → click `+` to add workspace | n/a | Project sidebar controls, modal launch |
24+
| 2 | Use `NewWorkspaceModal` to create branch `demo-review` | n/a | Modal form + validation |
25+
| 3 | Select `demo-review` workspace | n/a | Workspace selection, metadata refresh |
26+
| 4 | Adjust plan/exec toggle to `plan` and drag thinking slider to `3` | n/a | Input controls, tooltips |
27+
| 5 | Send message `Let's summarize the current branches.` | `User#1` | Chat input send, persisted state |
28+
| 6 | Mock assistant streams plan-style response with reasoning preamble and tool call to `git.branchList` | `Assistant#1` | Streaming text, reasoning block, tool message, message metadata |
29+
| 7 | Switch toggle to `exec`, thinking level `1`, send follow-up `Open the onboarding doc.` | `User#2` | Mode swap effect, second send |
30+
| 8 | Mock assistant attempts `filesystem.open` tool, emits `StreamError` (simulated ENOENT) | `Assistant#2` (error) | Error rendering, cancel streaming state |
31+
| 9 | Click edit on `User#2`, modify text to `Show the onboarding doc contents instead.` and submit | `User#2-edit` | Edit barrier, resend flow |
32+
| 10 | Assistant retries, succeeds with streamed content and closes tool call | `Assistant#3` | Stream restart after edit, reasoning end, tool output |
33+
| 11 | Invoke `/truncate 50` command from command palette to trim history | `System` message (backend) | Slash command handling, delete message event |
34+
| 12 | Chat auto-scroll hint appears and is dismissed via tooltip/button | n/a | Jump-to-bottom affordance |
35+
| 13 | Use chat meta sidebar to collapse/expand (`ChatMetaSidebar`), ensure recording captures state change | n/a | Sidebar interactions |
36+
37+
## Detailed Transcript
38+
39+
The mock backend will replay the following payloads (history sequences are strictly increasing):
40+
41+
1. **User#1** (`historySequence: 1`)
42+
- Text: "Let's summarize the current branches."
43+
- Metadata: plan mode, thinking level 3.
44+
2. **Assistant#1** (streamed)
45+
- `stream-start` for `msg-plan-1` (`historySequence: 2`).
46+
- `reasoning-delta`: "Looking at demo-repo/workspaces…" → "Found three branches." (two chunks).
47+
- `tool-call-start`: id `tool-branches`, name `git.branchList`, args `{ project: "demo-repo" }`.
48+
- `tool-call-end`: same id, result `[{ name: "main" }, { name: "feature/login" }, { name: "demo-review" }]`.
49+
- `stream-delta` chunks forming assistant text:
50+
1. "Here’s the current branch roster:"
51+
2. "• `main` – release baseline"
52+
3. "• `feature/login` – authentication refresh"
53+
4. "• `demo-review` – sandbox you just created"
54+
- `stream-end` with metadata `{ model: "mock:planner", usage: { inputTokens: 128, outputTokens: 85 } }`.
55+
3. **User#2** (`historySequence: 3`)
56+
- Text: "Open the onboarding doc."
57+
- Metadata: exec mode, thinking level 1.
58+
4. **Assistant#2 error run**
59+
- `stream-start` for `msg-exec-1` (`historySequence: 4`).
60+
- `tool-call-start`: id `tool-open`, name `filesystem.open`, args `{ path: "docs/onboarding.md" }`.
61+
- `stream-error`: `{ messageId: "msg-exec-1", error: "ENOENT: docs/onboarding.md not found", errorType: "tool_failed" }`.
62+
5. **User#2 edit** (`historySequence: 4` replacement)
63+
- Edited text: "Show the onboarding doc contents instead." (same history slot replaces prior message; backend replays truncated history before new message).
64+
6. **Assistant#3 success run**
65+
- `stream-start` for `msg-exec-2` (`historySequence: 5`).
66+
- `tool-call-start`: id `tool-open`, name `filesystem.open`, args `{ path: "docs/onboarding.md" }`.
67+
- `tool-call-end`: result `{ excerpt: "1. Clone the repo…" }`.
68+
- `stream-delta` chunks narrating successful retrieval.
69+
- `stream-end` metadata `{ model: "mock:executor", usage: { inputTokens: 96, outputTokens: 142 } }`.
70+
7. **System truncate acknowledgement**
71+
- After `/truncate 50`, backend emits `DeleteMessage` for sequences `[1, 2]` followed by informational assistant message `historySequence: 6` summarizing remaining context.
72+
73+
## Notes for Automation
74+
75+
- Every event is timestamped deterministically (e.g., add 1s per history sequence) so recordings align across runs.
76+
- Tool outputs should stay compact to avoid long scrolls; prefer bullet lists under 5 items.
77+
- When the error fires, keep stream duration short (<2s) so reviewers see the red banner without waiting.
78+
- After truncation, ensure the jump-to-bottom hint becomes visible by temporarily scrolling up before the delete event.
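
The deterministic timestamp note can be made concrete with a small sketch. The base epoch and the name `timestampFor` are assumptions chosen for illustration; only the "add 1s per history sequence" rule comes from the notes above.

```typescript
// Hypothetical sketch: the fixed base epoch is an arbitrary assumption.
const BASE_TIMESTAMP_MS = Date.UTC(2025, 0, 1);
const STEP_MS = 1_000; // one second per history sequence

// Derive a timestamp purely from the history sequence so every run produces the same value.
function timestampFor(historySequence: number): number {
  if (!Number.isInteger(historySequence) || historySequence < 1) {
    throw new RangeError(`historySequence must be a positive integer, got ${historySequence}`);
  }
  return BASE_TIMESTAMP_MS + historySequence * STEP_MS;
}
```

Because the result depends only on the sequence number and never on the wall clock, two recordings of the same transcript show identical timestamps.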
79+
80+
This transcript can be encoded as a JSON/TypeScript fixture and consumed by the mock AI service during tests.
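
As an illustration of what such a fixture might look like, the sketch below encodes the first exchange (User#1 and Assistant#1) from the detailed transcript. The type names (`ScriptedEvent`, `ScriptedTurn`) and field names such as `kind` and `toolCallId` are hypothetical; the payload values are copied from the transcript.

```typescript
// Hypothetical fixture shape; type and field names are assumptions for illustration.
type ScriptedEvent =
  | { kind: "stream-start"; messageId: string; historySequence: number }
  | { kind: "reasoning-delta"; text: string }
  | { kind: "tool-call-start"; toolCallId: string; toolName: string; args: Record<string, unknown> }
  | { kind: "tool-call-end"; toolCallId: string; result: unknown }
  | { kind: "stream-delta"; text: string }
  | { kind: "stream-end"; metadata: { model: string; usage: { inputTokens: number; outputTokens: number } } };

interface ScriptedTurn {
  user: { text: string; historySequence: number; mode: "plan" | "exec"; thinkingLevel: number };
  assistant: ScriptedEvent[];
}

const planTurn: ScriptedTurn = {
  user: { text: "Let's summarize the current branches.", historySequence: 1, mode: "plan", thinkingLevel: 3 },
  assistant: [
    { kind: "stream-start", messageId: "msg-plan-1", historySequence: 2 },
    { kind: "reasoning-delta", text: "Looking at demo-repo/workspaces…" },
    { kind: "reasoning-delta", text: "Found three branches." },
    { kind: "tool-call-start", toolCallId: "tool-branches", toolName: "git.branchList", args: { project: "demo-repo" } },
    {
      kind: "tool-call-end",
      toolCallId: "tool-branches",
      result: [{ name: "main" }, { name: "feature/login" }, { name: "demo-review" }],
    },
    { kind: "stream-delta", text: "Here’s the current branch roster:" },
    { kind: "stream-delta", text: "• `main` – release baseline" },
    { kind: "stream-delta", text: "• `feature/login` – authentication refresh" },
    { kind: "stream-delta", text: "• `demo-review` – sandbox you just created" },
    { kind: "stream-end", metadata: { model: "mock:planner", usage: { inputTokens: 128, outputTokens: 85 } } },
  ],
};

// Reassemble the visible assistant text from the stream-delta chunks, one chunk per line.
function assistantText(turn: ScriptedTurn): string {
  const parts: string[] = [];
  for (const event of turn.assistant) {
    if (event.kind === "stream-delta") parts.push(event.text);
  }
  return parts.join("\n");
}
```

A discriminated union keyed on `kind` lets the compiler check that each event carries exactly the fields its type needs, which helps catch fixture typos before a recording run.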

package.json

Lines changed: 3 additions & 0 deletions
@@ -19,6 +19,7 @@
1919
"test:watch": "make test-watch",
2020
"test:coverage": "make test-coverage",
2121
"test:integration": "make test-integration",
22+
"test:e2e": "make test-e2e",
2223
"dist": "make dist",
2324
"dist:mac": "make dist-mac",
2425
"dist:win": "make dist-win",
@@ -57,6 +58,7 @@
5758
},
5859
"devDependencies": {
5960
"@eslint/js": "^9.36.0",
61+
"@playwright/test": "^1.56.0",
6062
"@testing-library/react": "^16.3.0",
6163
"@types/bun": "^1.2.23",
6264
"@types/diff": "^8.0.0",
@@ -79,6 +81,7 @@
7981
"eslint-plugin-react": "^7.37.5",
8082
"eslint-plugin-react-hooks": "^5.2.0",
8183
"jest": "^30.1.3",
84+
"playwright": "^1.56.0",
8285
"prettier": "^3.6.2",
8386
"ts-jest": "^29.4.4",
8487
"tsc-alias": "^1.8.16",

playwright.config.ts

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
import { defineConfig } from "@playwright/test";
2+
3+
const isCI = process.env.CI === "true";
4+
5+
export default defineConfig({
6+
testDir: "./tests/e2e",
7+
timeout: 120_000,
8+
expect: {
9+
timeout: 5_000,
10+
},
11+
fullyParallel: false,
12+
forbidOnly: isCI,
13+
retries: isCI ? 1 : 0,
14+
reporter: [
15+
["list"],
16+
["html", { outputFolder: "artifacts/playwright-report", open: "never" }],
17+
],
18+
workers: 1,
19+
use: {
20+
trace: isCI ? "on-first-retry" : "retain-on-failure",
21+
screenshot: "only-on-failure",
22+
video: {
23+
mode: "on",
24+
size: { width: 1280, height: 720 },
25+
},
26+
},
27+
outputDir: "artifacts/playwright-output",
28+
projects: [
29+
{
30+
name: "electron",
31+
testDir: "./tests/e2e",
32+
},
33+
],
34+
});

src/components/AIView.tsx

Lines changed: 7 additions & 2 deletions
@@ -138,7 +138,7 @@ const EditBarrier = styled.div`
138138
text-align: center;
139139
`;
140140

141-
const JumpToBottomIndicator = styled.div`
141+
const JumpToBottomIndicator = styled.button`
142142
position: absolute;
143143
bottom: 8px;
144144
left: 50%;
@@ -360,6 +360,11 @@ const AIViewInner: React.FC<AIViewProps> = ({
360360
onWheel={markUserInteraction}
361361
onTouchMove={markUserInteraction}
362362
onScroll={handleScroll}
363+
role="log"
364+
aria-live={canInterrupt ? "polite" : "off"}
365+
aria-busy={canInterrupt}
366+
aria-label="Conversation transcript"
367+
tabIndex={0}
363368
>
364369
{messages.length === 0 ? (
365370
<EmptyState>
@@ -399,7 +404,7 @@ const AIViewInner: React.FC<AIViewProps> = ({
399404
)}
400405
</OutputContent>
401406
{!autoScroll && (
402-
<JumpToBottomIndicator onClick={jumpToBottom}>
407+
<JumpToBottomIndicator onClick={jumpToBottom} type="button">
403408
Press {formatKeybind(KEYBINDS.JUMP_TO_BOTTOM)} to jump to bottom
404409
</JumpToBottomIndicator>
405410
)}

src/components/ChatInput.tsx

Lines changed: 10 additions & 1 deletion
@@ -1,4 +1,4 @@
1-
import React, { useState, useRef, useCallback, useEffect } from "react";
1+
import React, { useState, useRef, useCallback, useEffect, useId } from "react";
22
import styled from "@emotion/styled";
33
import { CommandSuggestions, COMMAND_SUGGESTION_KEYS } from "./CommandSuggestions";
44
import type { Toast } from "./ChatInputToast";
@@ -308,6 +308,7 @@ export const ChatInput: React.FC<ChatInputProps> = ({
308308
const [mode, setMode] = useMode();
309309
const [use1M] = use1MContext(workspaceId);
310310
const { recentModels } = useModelLRU();
311+
const commandListId = useId();
311312

312313
const focusMessageInput = useCallback(() => {
313314
const element = inputRef.current;
@@ -742,6 +743,8 @@ export const ChatInput: React.FC<ChatInputProps> = ({
742743
onSelectSuggestion={handleCommandSelect}
743744
onDismiss={() => setShowCommandSuggestions(false)}
744745
isVisible={showCommandSuggestions}
746+
ariaLabel="Slash command suggestions"
747+
listId={commandListId}
745748
/>
746749
<InputControls data-component="ChatInputControls">
747750
<VimTextArea
@@ -754,6 +757,12 @@ export const ChatInput: React.FC<ChatInputProps> = ({
754757
suppressKeys={showCommandSuggestions ? COMMAND_SUGGESTION_KEYS : undefined}
755758
placeholder={placeholder}
756759
disabled={disabled || isSending || isCompacting}
760+
aria-label={editingMessage ? "Edit your last message" : "Message Claude"}
761+
aria-autocomplete="list"
762+
aria-controls={
763+
showCommandSuggestions && commandSuggestions.length > 0 ? commandListId : undefined
764+
}
765+
aria-expanded={showCommandSuggestions && commandSuggestions.length > 0}
757766
/>
758767
</InputControls>
759768
<ModeToggles data-component="ChatModeToggles">

src/components/ChatInputToast.tsx

Lines changed: 10 additions & 3 deletions
@@ -202,7 +202,7 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
202202
if (isRichError) {
203203
return (
204204
<ToastWrapper>
205-
<ErrorContainer>
205+
<ErrorContainer role="alert" aria-live="assertive">
206206
<div style={{ display: "flex", alignItems: "flex-start", gap: "6px" }}>
207207
<ToastIcon></ToastIcon>
208208
<div style={{ flex: 1 }}>
@@ -212,7 +212,9 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
212212
<ErrorDetails>{toast.message}</ErrorDetails>
213213
{toast.solution && <ErrorSolution>{toast.solution}</ErrorSolution>}
214214
</div>
215-
<CloseButton onClick={handleDismiss}>×</CloseButton>
215+
<CloseButton onClick={handleDismiss} aria-label="Dismiss">
216+
×
217+
</CloseButton>
216218
</div>
217219
</ErrorContainer>
218220
</ToastWrapper>
@@ -222,7 +224,12 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
222224
// Regular toast for simple messages and success
223225
return (
224226
<ToastWrapper>
225-
<ToastContainer type={toast.type} isLeaving={isLeaving}>
227+
<ToastContainer
228+
type={toast.type}
229+
isLeaving={isLeaving}
230+
role={toast.type === "error" ? "alert" : "status"}
231+
aria-live={toast.type === "error" ? "assertive" : "polite"}
232+
>
226233
<ToastIcon>{toast.type === "success" ? "✓" : "⚠"}</ToastIcon>
227234
<ToastContent>
228235
{toast.title && <ToastTitle>{toast.title}</ToastTitle>}
