🤖 feat: add workspace scripts with discovery and execution #510
base: main
Changes from all commits
93154a5
6040004
ba7ad4c
93b27ae
89843b0
73d75c1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| #!/usr/bin/env bash | ||
| # Description: Demo script to showcase the script execution feature. Accepts no arguments. | ||
| set -euo pipefail | ||
|
|
||
| # Progress messages to stderr (shown to user, not sent to agent) | ||
| echo "Running demo script..." >&2 | ||
| echo "Current workspace: $(pwd)" >&2 | ||
| echo "Timestamp: $(date)" >&2 | ||
|
|
||
| # Structured output to stdout (sent to agent) | ||
| cat <<'EOF' | ||
| ## 🎉 Script Execution Demo | ||
|
|
||
| ✅ Script executed successfully! | ||
|
|
||
| **Output Semantics:** | ||
| - `stdout`: Sent to the agent as tool result | ||
| - `stderr`: Shown to user only (progress/debug info) | ||
|
|
||
| The demo script completed. You can create workspace-specific scripts to automate tasks. | ||
| EOF |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| #!/usr/bin/env bash | ||
| # Description: Echo arguments demo. Accepts any number of arguments (strings) which will be echoed back. | ||
| set -euo pipefail | ||
|
|
||
| # Check if arguments were provided | ||
| if [ $# -eq 0 ]; then | ||
| cat <<'EOF' | ||
| ## ⚠️ No Arguments Provided | ||
|
|
||
| Usage: `/s echo <message...>` | ||
|
|
||
| Example: `/s echo hello world` | ||
| EOF | ||
| exit 0 | ||
| fi | ||
|
|
||
| # Structured output to stdout (sent to agent) | ||
| cat <<EOF | ||
| ## 🔊 Echo Script | ||
|
|
||
| **You said:** $@ | ||
|
|
||
| **Arguments received:** | ||
| - Count: $# arguments | ||
| - First arg: ${1:-none} | ||
| - Second arg: ${2:-none} | ||
| - All args: $@ | ||
|
|
||
| **Individual arguments:** | ||
| EOF | ||
|
|
||
| # Loop through each argument | ||
| for i in $(seq 1 $#); do | ||
| echo "- Arg $i: ${!i}" | ||
| done |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| #!/usr/bin/env bash | ||
| # Description: Wait for PR checks to pass on GitHub. Use this after pushing changes to origin, to catch CI failures. Accepts no arguments. | ||
| set -euo pipefail | ||
|
|
||
| BRANCH=$(git branch --show-current) | ||
| NUMBER=$(gh pr list --head "$BRANCH" --json number | jq -cr '.[0].number') | ||
| ./scripts/wait_pr_checks.sh "$NUMBER" | ||
|
Member
Again, it would be neat if we could seamlessly link scripts and have them behave well under both execution paradigms.
Member
Author
That makes sense. I'll update the `./scripts/wait_pr_checks.sh` script to fetch the branch and PR number, as this script does, when they're not specified.
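The fallback described in this reply could look roughly like the sketch below. `resolve_pr_number` is a hypothetical helper name, not code from this PR; the lookup reuses the same `gh pr list` and `jq` pipeline as the wrapper script under review.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_pr_number is a hypothetical helper, not code from this PR.

resolve_pr_number() {
  # An explicit argument wins, so existing callers keep working.
  if [ $# -ge 1 ] && [ -n "$1" ]; then
    printf '%s\n' "$1"
    return 0
  fi
  # Otherwise look up the PR for the current branch, as the wrapper script does.
  local branch number
  branch=$(git branch --show-current)
  number=$(gh pr list --head "$branch" --json number | jq -cr '.[0].number')
  if [ -z "$number" ] || [ "$number" = "null" ]; then
    echo "No open PR found for branch '$branch'" >&2
    return 1
  fi
  printf '%s\n' "$number"
}
```

With a helper like this at the top of `wait_pr_checks.sh`, the same file could serve both as a plain repo script and as a `.mux/scripts/` entry.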
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| name: Codex Comment Watch | ||
|
|
||
| on: | ||
| issue_comment: | ||
| types: | ||
| - created | ||
| pull_request_review_comment: | ||
| types: | ||
| - created | ||
| pull_request_review: | ||
| types: | ||
| - submitted | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pull-requests: read | ||
|
|
||
| concurrency: | ||
| group: codex-comment-watch-${{ github.event.issue.number || github.event.pull_request.number || github.run_id }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| check-codex-comments: | ||
| name: Check Codex Comments | ||
| runs-on: ${{ github.repository_owner == 'coder' && 'depot-ubuntu-22.04-16' || 'ubuntu-latest' }} | ||
| if: > | ||
| contains(fromJson('["chatgpt-codex-connector","chatgpt-codex-connector[bot]"]'), github.event.sender.login) | ||
| && (github.event_name != 'issue_comment' || github.event.issue.pull_request != null) | ||
| steps: | ||
| - name: Checkout code | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| fetch-depth: 0 # Required for git describe to find tags | ||
|
|
||
| - name: Determine PR number | ||
| id: determine-pr | ||
| run: | | ||
| if [[ "${{ github.event_name }}" == "issue_comment" ]]; then | ||
| echo "value=${{ github.event.issue.number }}" >> "$GITHUB_OUTPUT" | ||
| else | ||
| echo "value=${{ github.event.pull_request.number }}" >> "$GITHUB_OUTPUT" | ||
| fi | ||
|
|
||
| - name: Check for unresolved Codex comments | ||
| env: | ||
| GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| run: ./scripts/check_codex_comments.sh ${{ steps.determine-pr.outputs.value }} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,6 +13,7 @@ | |
| - [SSH](./ssh.md) | ||
| - [Forking](./fork.md) | ||
| - [Init Hooks](./init-hooks.md) | ||
| - [Workspace Scripts](./scripts.md) | ||
|
Member
"Scripts" is sufficient for the name.
Member
Author
Agreed. Since I couldn't think of a better name, I called it 'scripts' for now. Alternatively, we could rename the page to 'Extensions' and include a section for scripts.
||
| - [VS Code Extension](./vscode-extension.md) | ||
| - [Models](./models.md) | ||
| - [Keyboard Shortcuts](./keybinds.md) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,187 @@ | ||
| # Workspace Scripts | ||
|
|
||
| Execute custom scripts from your workspace using slash commands or let the AI Agent run them as tools. | ||
|
|
||
| ## Overview | ||
|
|
||
| Scripts are stored in `.mux/scripts/` within each workspace. They serve two purposes: | ||
|
Member
Should consider using our standard paradigm of coalescing `~/.mux` with the project `.mux`. Should also consider suggesting a symlink here.
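One way to picture the coalescing suggested here is the sketch below. The helper name `list_scripts` and the rule that a project script overrides a global one of the same name are assumptions made for illustration; the PR does not specify a precedence order.

```shell
#!/usr/bin/env bash
# Sketch only: list_scripts and the "later directory wins" rule are assumptions.

# Print one "name<TAB>path" line per executable script, sorted by name.
# Later directories take precedence over earlier ones on name clashes.
list_scripts() {
  local dir file name
  declare -A found=()
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    for file in "$dir"/*; do
      # -f follows symlinks, so a symlinked script is picked up as well.
      [ -f "$file" ] && [ -x "$file" ] || continue
      name=${file##*/}
      found[$name]=$file
    done
  done
  for name in "${!found[@]}"; do
    printf '%s\t%s\n' "$name" "${found[$name]}"
  done | sort
}
```

A call such as `list_scripts "$HOME/.mux/scripts" .mux/scripts` would then list global scripts first and let the project directory shadow them. The sketch needs bash 4 or newer for associative arrays.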
||
|
|
||
| 1. **Human Use**: Executable via `/script <name>` or `/s <name>` in chat. | ||
| 2. **Agent Use**: Automatically exposed to the AI as tools (`script_<name>`), allowing the agent to run complex workflows you define. | ||
|
|
||
| Scripts run in the workspace directory with full access to project secrets and environment variables. | ||
|
|
||
| **Key Point**: Scripts are workspace-specific. Each workspace has its own custom toolkit defined in `.mux/scripts/`. | ||
|
|
||
| ## Creating Scripts | ||
|
|
||
| 1. **Create the scripts directory**: | ||
|
|
||
| ```bash | ||
| mkdir -p .mux/scripts | ||
| ``` | ||
|
|
||
| 2. **Add an executable script**: | ||
|
|
||
| ```bash | ||
| #!/usr/bin/env bash | ||
| # Description: Deploy to staging. Accepts one optional argument: 'dry-run' to simulate. | ||
|
|
||
| if [ "${1:-}" == "dry-run" ]; then | ||
| echo "Simulating deployment..." | ||
| else | ||
| echo "Deploying to staging..." | ||
| fi | ||
| ``` | ||
|
|
||
| **Crucial**: The `# Description:` line is what the AI reads to understand the tool. Be descriptive about what the script does and what arguments it accepts. | ||
|
Member
This is interesting, but it's a little bit unclear how it would work for non-bash scripts, which should definitely be supported. One idea is calling Some more ideas:
We could also implement comment parsing for common languages. Nothing clean comes to mind unfortunately.
Member
E.g. I believe we have a .ts script for updating models. On iPad rn so can't easily check.
Member
Author
I agree that we should support both binaries and non-bash scripts. For Python and TypeScript, this approach should work well, as we can read these files as text. Since our files include a shebang at the top, it's also feasible to add a description right below the shebang. So the current approach should also work for the .ts script for updating models. For binaries, we can introduce a new heuristic in the future: obtain descriptions using the
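To make the "read the file as text" idea concrete, here is a minimal sketch of header extraction that accepts both `#` and `//` comment prefixes. `extract_description` is a hypothetical name, and scanning only the first 10 lines is an assumption, not behavior confirmed by this PR.

```shell
#!/usr/bin/env bash
# Sketch only: extract_description is hypothetical; the 10-line window is an assumption.

# Print the text after "Description:" from a "#" or "//" comment near the top of a file.
# Prints nothing when no such header is found.
extract_description() {
  head -n 10 "$1" \
    | sed -n -E 's@^[[:space:]]*(#|//)[[:space:]]*Description:[[:space:]]*(.*)$@\2@p' \
    | head -n 1
}
```

Because the match is purely textual, the same function would cover bash, Python, and TypeScript files, but not compiled binaries.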
||
|
|
||
| 3. **Make it executable**: | ||
|
|
||
| ```bash | ||
| chmod +x .mux/scripts/deploy | ||
| ``` | ||
|
|
||
| ## Agent Integration (AI Tools) | ||
|
|
||
| Every executable script in `.mux/scripts/` is automatically registered as a tool for the AI Agent. | ||
|
Member
Curious for rationale of this vs. using a single Another issue with a tool for each script is a lot of duplicate token usage for tool description. For example, each script has the same input and output parameters (argv, stdout, exit_code).
||
|
|
||
| - **Tool Name**: `script_<name>` (e.g., `deploy` -> `script_deploy`, `run-tests` -> `script_run_tests`) | ||
| - **Tool Description**: Taken from the script's header comment (`# Description: ...`). | ||
| - **Arguments**: The AI can pass an array of string arguments to the script. | ||
|
|
||
| ### Optimization for AI | ||
|
|
||
| To make your scripts effective AI tools: | ||
|
|
||
| 1. **Clear Descriptions**: Explicitly state what the script does and what arguments it expects. | ||
|
|
||
| ```bash | ||
| # Description: Fetch logs. Requires one argument: the environment name (dev|prod). | ||
| ``` | ||
|
|
||
| 2. **Robustness**: Use `set -euo pipefail` to ensure the script fails loudly if something goes wrong, allowing the AI to catch the error. | ||
| 3. **Clear Output**: Write structured output to stdout so the agent can understand results and take action. | ||
|
|
||
| ## Usage | ||
|
|
||
| ### Basic Execution | ||
|
|
||
| Type `/s` or `/script` in chat to see available scripts with auto-completion: | ||
|
|
||
| ``` | ||
| /s deploy | ||
| ``` | ||
|
|
||
| ### With Arguments | ||
|
|
||
| Pass arguments to scripts: | ||
|
|
||
| ``` | ||
| /s deploy dry-run | ||
| /script test --verbose --coverage | ||
| ``` | ||
|
|
||
| Arguments are passed directly to the script as `$1`, `$2`, etc. | ||
|
|
||
| ## Execution Context | ||
|
|
||
| Scripts run with: | ||
|
|
||
| - **Working directory**: The workspace directory. | ||
| - **Environment**: Full workspace environment + project secrets + special cmux variables. | ||
| - **Timeout**: 5 minutes by default. | ||
|
Member
This would fail for wait_pr_checks generally. Suggest not having a timeout or making it longer. The user should be able to interrupt a hanging script anyway. We should also eventually loop this into our background bash system so that the agent can poll output and do useful work while waiting, and cancel an apparently hanging script.
||
| - **Streams**: stdout/stderr are captured. | ||
| - **Human**: Visible in the chat card. | ||
| - **Agent**: Returned as the tool execution result. | ||
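The timeout behavior listed above, and the reviewer's request to make it adjustable, can be pictured with the sketch below. It assumes GNU coreutils `timeout` is available; `run_with_timeout` and the `SCRIPT_TIMEOUT_SECS` variable are hypothetical names, not documented mux settings.

```shell
#!/usr/bin/env bash
# Sketch only: SCRIPT_TIMEOUT_SECS is a hypothetical variable, not a documented mux setting.

# Run a command under a time limit using GNU coreutils `timeout`.
# A value of 0 disables the limit; exit status 124 means the limit was hit.
run_with_timeout() {
  local limit=${SCRIPT_TIMEOUT_SECS:-300}  # 300 s matches the 5-minute default above
  timeout "$limit" "$@"
}
```

A long-running script such as the PR check waiter could then be started with `SCRIPT_TIMEOUT_SECS=0 run_with_timeout ./scripts/wait_pr_checks.sh` to opt out of the limit.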
|
|
||
| ### Standard Streams | ||
|
|
||
| Scripts follow Unix conventions for output: | ||
|
|
||
| - **stdout**: Sent to the agent as the tool result. Use this for structured output the agent should act on. | ||
| - **stderr**: Shown to the user in the UI but **not** sent to the agent. Use this for progress messages, logs, or debugging info that doesn't need AI attention. | ||
|
|
||
| This design means scripts work identically whether run inside mux or directly from the command line. | ||
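From the runner's side, the split between the two streams could be sketched as follows. `run_for_agent` is a hypothetical name used only for illustration; mux's actual runner may capture output differently.

```shell
#!/usr/bin/env bash
# Sketch only: run_for_agent is a hypothetical name; mux's real runner may differ.

# Run a command, store its stdout in the named file (the part the agent would receive),
# and let stderr pass through untouched (the part the user would see).
run_for_agent() {
  local result_file=$1
  shift
  "$@" > "$result_file"
}
```

Only stdout is redirected, so progress messages written with `>&2` never reach the captured tool result.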
|
|
||
| #### Example: Test Runner | ||
|
|
||
| ```bash | ||
| #!/usr/bin/env bash | ||
| # Description: Run tests and report failures for the agent to fix | ||
|
|
||
| set -euo pipefail | ||
|
|
||
| # Progress to stderr (user sees it, agent doesn't) | ||
| echo "Running test suite..." >&2 | ||
|
|
||
| if npm test > test.log 2>&1; then | ||
| # Success message to stdout (agent sees it) | ||
| echo "✅ All tests passed" | ||
| else | ||
| # Structured failure info to stdout (agent sees and can act on it) | ||
| cat << EOF | ||
| ❌ Tests failed. Here is the log: | ||
| \`\`\` | ||
| $(cat test.log) | ||
| \`\`\` | ||
| Please analyze this error and propose a fix. | ||
| EOF | ||
| exit 1 | ||
| fi | ||
| ``` | ||
|
|
||
| **Result**: | ||
|
|
||
| 1. User sees "Running test suite..." progress message. | ||
| 2. On failure, agent receives the structured error with test log and instructions. | ||
| 3. Agent can immediately analyze and propose fixes. | ||
|
|
||
| ## Example Scripts | ||
|
|
||
| ### Deployment Script | ||
|
|
||
| ```bash | ||
| #!/usr/bin/env bash | ||
| # Description: Deploy application. Accepts one arg: environment (default: staging). | ||
| set -euo pipefail | ||
|
|
||
| ENV=${1:-staging} | ||
| echo "Deploying to $ENV..." | ||
| # ... deployment logic ... | ||
| echo "Deployment complete!" | ||
| ``` | ||
|
|
||
| ### Web Fetch Utility | ||
|
|
||
| ```bash | ||
| #!/usr/bin/env bash | ||
| # Description: Fetch a URL. Accepts exactly one argument: the URL. | ||
| set -euo pipefail | ||
|
|
||
| if [ $# -ne 1 ]; then | ||
| echo "Usage: $0 <url>" | ||
| exit 1 | ||
| fi | ||
| curl -sL "$1" | ||
| ``` | ||
|
Member
Since I added a web_fetch tool directly, we should remove this example to avoid confusion. Unfortunately web_fetch is not well represented as a bash script, but it could be as a TypeScript script, since we want to filter out a lot of the noise in the HTML and extract just the textual content.
||
|
|
||
| ## Script Discovery | ||
|
|
||
| - Scripts are discovered automatically from `.mux/scripts/` in the current workspace. | ||
| - Discovery is cached for performance but refreshes intelligently. | ||
|
Member
Refresh model should be more clearly explained. E.g. should the user expect any edits to the scripts to be immediately reflected, or do they have to wait for some timer? Also, it's not totally clear whether we load scripts from the local project dir or the workspace dir. This determines whether we can easily plug into inotify (projects are always local and we can bypass the runtime interface).
||
| - **Sanitization**: Script names are sanitized for tool use (e.g., `my-script.sh` -> `script_my_script_sh`). | ||
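The name mapping in the examples (`run-tests` -> `script_run_tests`, `my-script.sh` -> `script_my_script_sh`) is consistent with replacing every character outside `[A-Za-z0-9_]` with an underscore. The sketch below implements that rule; `tool_name` is a hypothetical helper, and the real sanitizer may handle additional cases.

```shell
#!/usr/bin/env bash
# Sketch only: tool_name is hypothetical; the real sanitizer may handle more cases.

# Map a script file name to a tool name: prefix with "script_" and
# replace every character outside [A-Za-z0-9_] with an underscore.
tool_name() {
  local name=$1
  printf 'script_%s\n' "${name//[^A-Za-z0-9_]/_}"
}
```

Note that under this rule two files such as `my-script` and `my_script` would map to the same tool name, which is one reason to keep script names simple.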
|
|
||
| ## Troubleshooting | ||
|
|
||
| **Script not appearing in suggestions or tools?** | ||
|
|
||
| - Ensure file is executable: `chmod +x .mux/scripts/scriptname` | ||
| - Verify file is in `.mux/scripts/` directory. | ||
| - Check for valid description header. | ||
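The checklist above can be turned into a quick self-check. `check_script` is a hypothetical helper written for this note, and looking for the header only in the first 10 lines is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: check_script is a hypothetical helper that mirrors the checklist above.

# Print one line per problem found; return non-zero if any check fails.
check_script() {
  local file=$1 rc=0
  if [ ! -f "$file" ]; then
    echo "missing: $file does not exist"
    return 1
  fi
  if [ ! -x "$file" ]; then
    echo "not executable: run chmod +x $file"
    rc=1
  fi
  # Assumption: the description header sits within the first 10 lines.
  if ! head -n 10 "$file" | grep -q 'Description:'; then
    echo "no description: add a '# Description: ...' header"
    rc=1
  fi
  return "$rc"
}
```

Running `check_script .mux/scripts/deploy` prints nothing and returns 0 when the script passes all three checks.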
|
|
||
| **Agent using script incorrectly?** | ||
|
|
||
| - Improve the `# Description:` header. Explicitly tell the agent what arguments to pass. | ||
|
Member
We could define extraction as the first non-shebang comment. IMO it should include multi-line comments.
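A rough sketch of that rule, for `#`-style comments only, might look like this. `first_comment_block` is a hypothetical name, and block comments in other languages (such as `/* ... */`) are deliberately out of scope here.

```shell
#!/usr/bin/env bash
# Sketch only: first_comment_block is hypothetical and handles "#" comments only.

# Print the first run of "#" comment lines after an optional shebang,
# with the comment markers stripped, stopping at the first non-comment line.
first_comment_block() {
  awk '
    NR == 1 && /^#!/ { next }        # skip the shebang
    /^[ \t]*$/ && !seen { next }     # skip blank lines before the block
    /^[ \t]*#/ {
      seen = 1
      sub(/^[ \t]*#[ \t]?/, "")
      print
      next
    }
    { exit }                         # first non-comment line ends the block
  ' "$1"
}
```

Under this rule the `Description:` prefix becomes optional, since the whole leading comment block is treated as the description.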
||
Should we symlink `scripts -> .mux/scripts` in our repo? I assume that should be standard practice for any mux-heavy repo.

Yeah, I think that makes sense.