Commit c4ce0ce

feat: add e2e test harness and accessibility polish
Introduce Playwright-driven Electron e2e runner with mock AI scenario to capture deterministic transcripts and videos for review flows. Seed test config via CMUX_TEST_ROOT, adjust Config to honor it, and teach main process to boot against the dev server during e2e runs. Stub mermaid in Vite to keep captures predictable and expand tsconfig coverage for new test sources. Harden Bash and StreamManager tests for CI timing variance. Improve keyboard and screen-reader access across chat transcript, command palette, slider, sidebar, and toast UI.
1 parent 146adcd commit c4ce0ce
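The commit message says the test config is seeded via `CMUX_TEST_ROOT` and that `Config` was adjusted to honor it. A minimal sketch of what such a lookup could look like, assuming the default root is `~/.cmux`; `resolveConfigRoot`, `configPath`, and their parameters are hypothetical names, not the actual `Config` code:

```typescript
// Hypothetical helper: picks the directory that holds config.json.
// Assumption: CMUX_TEST_ROOT, when set and non-empty, overrides the default ~/.cmux.
type Env = Record<string, string | undefined>;

function resolveConfigRoot(env: Env, homeDir: string): string {
  const testRoot = env.CMUX_TEST_ROOT?.trim();
  if (testRoot) {
    return testRoot;
  }
  return `${homeDir}/.cmux`;
}

// Full path of the config file under the resolved root.
function configPath(env: Env, homeDir: string): string {
  return `${resolveConfigRoot(env, homeDir)}/config.json`;
}
```

Passing the environment and home directory as arguments, rather than reading globals, keeps the lookup easy to unit test.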

Some content is hidden: large commits have some content hidden by default.

41 files changed: +2064 -454 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -77,3 +77,6 @@ docs/mermaid.min.js
 **.cpuprofile
 profile.txt
 src/version.ts
+
+artifacts/
+tests/e2e/tmp/

bun.lock

Lines changed: 109 additions & 135 deletions
Large diffs are not rendered by default.

bunfig.toml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+[test]
+root = "src"
+match = [
+  "**/*.test.ts",
+  "**/*.test.tsx",
+  "**/*.spec.ts",
+  "**/*.spec.tsx"
+]

docs/e2e/mock-transcript.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+# Mock Transcript: Cmux End-to-End Demo
+
+This scripted flow is tailored for automated Playwright captures. It walks through the highest-impact UI affordances so reviewers can skim a single recording and understand the app’s behavior.
+
+## Goals
+
+- Exercise key surfaces: project sidebar, workspace modal, chat surface (streaming + reasoning + tool call), chat meta sidebar, plan/exec toggle, thinking slider, error banner, edit flow, and history truncation.
+- Keep every interaction deterministic so the mock AI backend can replay the same transcript on every run.
+- Ensure videos remain under ~90 seconds by limiting idle time while still letting animations complete.
+
+## Environment Prep
+
+1. Launch Cmux with `CMUX_MOCK_AI=1` so the main process swaps in the scripted responder.
+2. Pre-seed `~/.cmux/config.json` with a single project `demo-repo` and worktree `feature/login`. (The Playwright harness will handle this setup.)
+3. Start the app with the project pre-selected so the chat surface is immediately visible.
+4. Install Playwright’s bundled ffmpeg runtime via `bunx playwright install ffmpeg` to ensure Electron video capture works reliably.
+
+## High-Level Timeline
+
+| Step | UI Action | Transcript Snippet | Feature Coverage |
+| ---- | ---------------------------------------------------------------------------------------------------- | -------------------------- | --------------------------------------------------------------- |
+| 0 | Hover "Tips" carousel briefly before interacting || Carousel animation baseline |
+| 1 | Open project sidebar menu → click `+` to add workspace || Project sidebar controls, modal launch |
+| 2 | Use `NewWorkspaceModal` to create branch `demo-review` || Modal form + validation |
+| 3 | Select `demo-review` workspace || Workspace selection, metadata refresh |
+| 4 | Adjust plan/exec toggle to `plan` and drag thinking slider to `3` || Input controls, tooltips |
+| 5 | Send message `Let's summarize the current branches.` | `User#1` | Chat input send, persisted state |
+| 6 | Mock assistant streams plan-style response with reasoning preamble and tool call to `git.branchList` | `Assistant#1` | Streaming text, reasoning block, tool message, message metadata |
+| 7 | Switch toggle to `exec`, thinking level `1`, send follow-up `Open the onboarding doc.` | `User#2` | Mode swap effect, second send |
+| 8 | Mock assistant attempts `filesystem.open` tool, emits `StreamError` (simulated ENOENT) | `Assistant#2` (error) | Error rendering, cancel streaming state |
+| 9 | Click edit on `User#2`, modify text to `Show the onboarding doc contents instead.` and submit | `User#2-edit` | Edit barrier, resend flow |
+| 10 | Assistant retries, succeeds with streamed content and closes tool call | `Assistant#3` | Stream restart after edit, reasoning end, tool output |
+| 11 | Invoke `/truncate 50` command from command palette to trim history | `System` message (backend) | Slash command handling, delete message event |
+| 12 | Chat auto-scroll hint appears and is dismissed via tooltip/button || Jump-to-bottom affordance |
+| 13 | Use chat meta sidebar to collapse/expand (`ChatMetaSidebar`), ensure recording captures state change || Sidebar interactions |
+
+## Detailed Transcript
+
+The mock backend will replay the following payloads (history sequences are strictly increasing):
+
+1. **User#1** (`historySequence: 1`)
+   - Text: "Let's summarize the current branches."
+   - Metadata: plan mode, thinking level 3.
+2. **Assistant#1** (streamed)
+   - `stream-start` for `msg-plan-1` (`historySequence: 2`).
+   - `reasoning-delta`: "Looking at demo-repo/workspaces…" → "Found three branches." (two chunks).
+   - `tool-call-start`: id `tool-branches`, name `git.branchList`, args `{ project: "demo-repo" }`.
+   - `tool-call-end`: same id, result `[{ name: "main" }, { name: "feature/login" }, { name: "demo-review" }]`.
+   - `stream-delta` chunks forming assistant text:
+      1. "Here’s the current branch roster:"
+      2. "• `main` – release baseline"
+      3. "• `feature/login` – authentication refresh"
+      4. "• `demo-review` – sandbox you just created"
+   - `stream-end` with metadata `{ model: "mock:planner", usage: { inputTokens: 128, outputTokens: 85 } }`.
+3. **User#2** (`historySequence: 3`)
+   - Text: "Open the onboarding doc."
+   - Metadata: exec mode, thinking level 1.
+4. **Assistant#2 error run**
+   - `stream-start` for `msg-exec-1` (`historySequence: 4`).
+   - `tool-call-start`: id `tool-open`, name `filesystem.open`, args `{ path: "docs/onboarding.md" }`.
+   - `stream-error`: `{ messageId: "msg-exec-1", error: "ENOENT: docs/onboarding.md not found", errorType: "tool_failed" }`.
+5. **User#2 edit** (`historySequence: 4` replacement)
+   - Edited text: "Show the onboarding doc contents instead." (same history slot replaces prior message; backend replays truncated history before new message).
+6. **Assistant#3 success run**
+   - `stream-start` for `msg-exec-2` (`historySequence: 5`).
+   - `tool-call-start`: id `tool-open`, name `filesystem.open`, args `{ path: "docs/onboarding.md" }`.
+   - `tool-call-end`: result `{ excerpt: "1. Clone the repo…" }`.
+   - `stream-delta` chunks narrating successful retrieval.
+   - `stream-end` metadata `{ model: "mock:executor", usage: { inputTokens: 96, outputTokens: 142 } }`.
+7. **System truncate acknowledgement**
+   - After `/truncate 50`, backend emits `DeleteMessage` for sequences `[1, 2]` followed by informational assistant message `historySequence: 6` summarizing remaining context.
+
+## Notes for Automation
+
+- Every event is timestamped deterministically (e.g., add 1s per history sequence) so recordings align across runs.
+- Tool outputs should stay compact to avoid long scrolls; prefer bullet lists under 5 items.
+- When the error fires, keep stream duration short (<2s) so reviewers see the red banner without waiting.
+- After truncation, ensure the jump-to-bottom hint becomes visible by temporarily scrolling up before the delete event.
+
+This transcript can be encoded as a JSON/TypeScript fixture and consumed by the mock AI service during tests.
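One way the transcript in this file could be encoded as a TypeScript fixture. The event names and payload values are taken from the document. The `MockEvent` union, the field names other than those the document quotes (`messageId`, `error`, `errorType`, `historySequence`), and the `BASE_TIME_MS` epoch are assumptions made for illustration, not the format the mock AI service actually uses:

```typescript
// Sketch of a fixture format for the mock AI backend (type and field names are assumed).
type Usage = { inputTokens: number; outputTokens: number };

type MockEvent =
  | { type: "stream-start"; messageId: string; historySequence: number }
  | { type: "reasoning-delta"; messageId: string; delta: string }
  | { type: "tool-call-start"; messageId: string; toolCallId: string; toolName: string; args: unknown }
  | { type: "tool-call-end"; messageId: string; toolCallId: string; result: unknown }
  | { type: "stream-delta"; messageId: string; delta: string }
  | { type: "stream-end"; messageId: string; metadata: { model: string; usage: Usage } }
  | { type: "stream-error"; messageId: string; error: string; errorType: string };

// The "Assistant#2 error run" from the detailed transcript.
const assistant2ErrorRun: MockEvent[] = [
  { type: "stream-start", messageId: "msg-exec-1", historySequence: 4 },
  {
    type: "tool-call-start",
    messageId: "msg-exec-1",
    toolCallId: "tool-open",
    toolName: "filesystem.open",
    args: { path: "docs/onboarding.md" },
  },
  {
    type: "stream-error",
    messageId: "msg-exec-1",
    error: "ENOENT: docs/onboarding.md not found",
    errorType: "tool_failed",
  },
];

// Deterministic timestamps: a fixed epoch plus one second per history sequence.
const BASE_TIME_MS = Date.UTC(2025, 0, 1); // arbitrary fixed epoch (assumption)

function timestampFor(historySequence: number): number {
  return BASE_TIME_MS + historySequence * 1000;
}

// Checks the "strictly increasing" property stated for history sequences.
function isStrictlyIncreasing(sequences: number[]): boolean {
  return sequences.every((seq, i) => i === 0 || seq > sequences[i - 1]);
}
```

A discriminated union on `type` lets the replayer switch over events exhaustively, and deriving timestamps from `historySequence` rather than the wall clock keeps recordings aligned across runs.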

package.json

Lines changed: 3 additions & 0 deletions
@@ -25,6 +25,7 @@
     "test:watch": "./scripts/test.sh --watch",
     "test:coverage": "./scripts/test.sh --coverage",
     "test:integration": "bun test src && TEST_INTEGRATION=1 jest tests",
+    "test:e2e": "PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 bunx playwright test --project=electron",
     "dist": "bun run build && electron-builder --publish never",
     "dist:mac": "bun run build && electron-builder --mac --publish never",
     "dist:win": "bun run build && electron-builder --win --publish never",
@@ -63,6 +64,7 @@
   },
   "devDependencies": {
     "@eslint/js": "^9.36.0",
+    "@playwright/test": "^1.56.0",
     "@testing-library/react": "^16.3.0",
     "@types/bun": "^1.2.23",
     "@types/diff": "^8.0.0",
@@ -84,6 +86,7 @@
     "eslint-plugin-react": "^7.37.5",
     "eslint-plugin-react-hooks": "^5.2.0",
     "jest": "^30.1.3",
+    "playwright": "^1.56.0",
     "prettier": "^3.6.2",
     "ts-jest": "^29.4.4",
     "tsc-alias": "^1.8.16",

playwright.config.ts

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+import { defineConfig } from "@playwright/test";
+
+const isCI = process.env.CI === "true";
+
+export default defineConfig({
+  testDir: "./tests/e2e",
+  timeout: 120_000,
+  expect: {
+    timeout: 5_000,
+  },
+  fullyParallel: false,
+  forbidOnly: isCI,
+  retries: isCI ? 1 : 0,
+  reporter: [
+    ["list"],
+    ["html", { outputFolder: "artifacts/playwright-report", open: "never" }],
+  ],
+  workers: 1,
+  use: {
+    trace: isCI ? "on-first-retry" : "retain-on-failure",
+    screenshot: "only-on-failure",
+    video: {
+      mode: "on",
+      size: { width: 1280, height: 720 },
+    },
+  },
+  outputDir: "artifacts/playwright-output",
+  projects: [
+    {
+      name: "electron",
+      testDir: "./tests/e2e",
+    },
+  ],
+});
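This config only sets up the runner; launching the app is left to the harness, which is not shown on this page. A hedged sketch of how launch options might be assembled for Playwright's Electron launcher (`_electron.launch`, which accepts `args`, `env`, and `recordVideo`). `CMUX_TEST_ROOT` and `CMUX_MOCK_AI=1` come from this commit; `buildLaunchOptions`, the `LaunchOptions` shape, and the `videos` directory name are hypothetical:

```typescript
// Hypothetical helper that assembles options for Playwright's `_electron.launch`.
interface LaunchOptions {
  args: string[];
  env: Record<string, string>;
  recordVideo: { dir: string; size: { width: number; height: number } };
}

function buildLaunchOptions(
  mainEntry: string,
  testRoot: string,
  baseEnv: Record<string, string | undefined>,
): LaunchOptions {
  // Drop undefined values so the result is a plain string map.
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(baseEnv)) {
    if (value !== undefined) {
      env[key] = value;
    }
  }
  env.CMUX_TEST_ROOT = testRoot; // isolated config root for this run
  env.CMUX_MOCK_AI = "1"; // swap in the scripted responder

  return {
    args: [mainEntry],
    env,
    // Same frame size as `video.size` in playwright.config.ts.
    recordVideo: { dir: `${testRoot}/videos`, size: { width: 1280, height: 720 } },
  };
}
```

In a harness, the result would typically be passed to `electron.launch(...)` after `import { _electron as electron } from "playwright"`, with `process.env` supplied as `baseEnv`.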

scripts/check_codex_comments.sh

Lines changed: 1 addition & 2 deletions
@@ -14,9 +14,8 @@ BOT_LOGIN_GRAPHQL="chatgpt-codex-connector" # GraphQL does not
 echo "Checking for unresolved Codex comments in PR #${PR_NUMBER}..."

 # Get all regular issue comments from the Codex bot (these can't be resolved)
-# Filter out "all clear" comments that indicate no issues found
 REGULAR_COMMENTS=$(gh api "/repos/{owner}/{repo}/issues/${PR_NUMBER}/comments" \
-  --jq "[.[] | select(.user.login == \"${BOT_LOGIN_REST}\") | select(.body | test(\"Didn't find any major issues\") | not)]")
+  --jq "[.[] | select(.user.login == \"${BOT_LOGIN_REST}\")]")

 REGULAR_COUNT=$(echo "$REGULAR_COMMENTS" | jq 'length')

src/components/AIView.tsx

Lines changed: 7 additions & 2 deletions
@@ -138,7 +138,7 @@ const EditBarrier = styled.div`
   text-align: center;
 `;

-const JumpToBottomIndicator = styled.div`
+const JumpToBottomIndicator = styled.button`
   position: absolute;
   bottom: 8px;
   left: 50%;
@@ -355,6 +355,11 @@ const AIViewInner: React.FC<AIViewProps> = ({
           onWheel={markUserInteraction}
           onTouchMove={markUserInteraction}
           onScroll={handleScroll}
+          role="log"
+          aria-live={canInterrupt ? "polite" : "off"}
+          aria-busy={canInterrupt}
+          aria-label="Conversation transcript"
+          tabIndex={0}
         >
           {messages.length === 0 ? (
             <EmptyState>
@@ -394,7 +399,7 @@ const AIViewInner: React.FC<AIViewProps> = ({
           )}
         </OutputContent>
         {!autoScroll && (
-          <JumpToBottomIndicator onClick={jumpToBottom}>
+          <JumpToBottomIndicator onClick={jumpToBottom} type="button">
             Press {formatKeybind(KEYBINDS.JUMP_TO_BOTTOM)} to jump to bottom
           </JumpToBottomIndicator>
         )}

src/components/ChatInput.tsx

Lines changed: 10 additions & 1 deletion
@@ -1,4 +1,4 @@
-import React, { useState, useRef, useCallback, useEffect } from "react";
+import React, { useState, useRef, useCallback, useEffect, useId } from "react";
 import styled from "@emotion/styled";
 import { CommandSuggestions, COMMAND_SUGGESTION_KEYS } from "./CommandSuggestions";
 import type { Toast } from "./ChatInputToast";
@@ -334,6 +334,7 @@ export const ChatInput: React.FC<ChatInputProps> = ({
   const [thinkingLevel] = useThinkingLevel();
   const [mode, setMode] = useMode();
   const { recentModels } = useModelLRU();
+  const commandListId = useId();

   const focusMessageInput = useCallback(() => {
     const element = inputRef.current;
@@ -730,6 +731,8 @@ export const ChatInput: React.FC<ChatInputProps> = ({
         onSelectSuggestion={handleCommandSelect}
         onDismiss={() => setShowCommandSuggestions(false)}
         isVisible={showCommandSuggestions}
+        ariaLabel="Slash command suggestions"
+        listId={commandListId}
       />
       <InputControls>
         <InputField
@@ -750,6 +753,12 @@ export const ChatInput: React.FC<ChatInputProps> = ({
           placeholder={placeholder}
           disabled={disabled || isSending || isCompacting}
           canInterrupt={canInterrupt}
+          aria-label={editingMessage ? "Edit your last message" : "Message Claude"}
+          aria-autocomplete="list"
+          aria-controls={
+            showCommandSuggestions && commandSuggestions.length > 0 ? commandListId : undefined
+          }
+          aria-expanded={showCommandSuggestions && commandSuggestions.length > 0}
         />
       </InputControls>
       <ModeToggles>

src/components/ChatInputToast.tsx

Lines changed: 10 additions & 3 deletions
@@ -202,7 +202,7 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
   if (isRichError) {
     return (
       <ToastWrapper>
-        <ErrorContainer>
+        <ErrorContainer role="alert" aria-live="assertive">
           <div style={{ display: "flex", alignItems: "flex-start", gap: "6px" }}>
             <ToastIcon></ToastIcon>
             <div style={{ flex: 1 }}>
@@ -212,7 +212,9 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
               <ErrorDetails>{toast.message}</ErrorDetails>
               {toast.solution && <ErrorSolution>{toast.solution}</ErrorSolution>}
             </div>
-            <CloseButton onClick={handleDismiss}>×</CloseButton>
+            <CloseButton onClick={handleDismiss} aria-label="Dismiss">
+              ×
+            </CloseButton>
           </div>
         </ErrorContainer>
       </ToastWrapper>
@@ -222,7 +224,12 @@ export const ChatInputToast: React.FC<ChatInputToastProps> = ({ toast, onDismiss
   // Regular toast for simple messages and success
   return (
     <ToastWrapper>
-      <ToastContainer type={toast.type} isLeaving={isLeaving}>
+      <ToastContainer
+        type={toast.type}
+        isLeaving={isLeaving}
+        role={toast.type === "error" ? "alert" : "status"}
+        aria-live={toast.type === "error" ? "assertive" : "polite"}
+      >
         <ToastIcon>{toast.type === "success" ? "✓" : "⚠"}</ToastIcon>
         <ToastContent>
           {toast.title && <ToastTitle>{toast.title}</ToastTitle>}
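The role selection in this hunk can be read as a small mapping from toast type to live-region semantics. A sketch under the assumption that the toast types are `"success"` and `"error"` (the diff only distinguishes those two); `toastLiveRegionProps` and the type names are hypothetical:

```typescript
type ToastType = "success" | "error"; // assumed; only these two appear in the diff

interface LiveRegionProps {
  role: "alert" | "status";
  "aria-live": "assertive" | "polite";
}

// Errors interrupt the screen reader; everything else is announced politely.
function toastLiveRegionProps(type: ToastType): LiveRegionProps {
  return type === "error"
    ? { role: "alert", "aria-live": "assertive" }
    : { role: "status", "aria-live": "polite" };
}
```

Under WAI-ARIA, `role="alert"` already implies `aria-live="assertive"` and `role="status"` implies `aria-live="polite"`, so the explicit attribute is redundant but harmless.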
