Module: terminal
Embedded terminals for working on a machine from inside the app.
Status: frontend implemented (D2). The terminal.instance panel (xterm.js)
binds to a backend PTY session over the terminal channel — input/resize up,
output down — and kills its PTY on close. runCommand opens a terminal
pre-running a command (via an initialCommand pane param). The agent's gated
terminal.exec tool is D3.
Contributions to the layout shell
- Panels: terminal.instance (one PTY per instance, default: bottom dock tab group).
- Commands: terminal.new, terminal.kill (active), terminal.clear (active), terminal.focusNext/focusPrev (cycle, via the layoutController.focusPane seam).
- Keybindings: mod+k → terminal.clear, scoped to terminal.instance: it clears the focused terminal (the iTerm/VS Code convention) while shadowing the global command-palette shortcut, which still works everywhere else. See keybinding scopes.
- Services for other modules: runCommand(cmd, opts) (exported as runTerminalCommand) lets modules (e.g. agent chat showing a suggested command) open a terminal pre-running a command: always visibly, never as hidden execution.
- Settings: terminal.fontFamily (enum of common monospace fonts; falls back to the system monospace if the chosen font isn't installed) and terminal.fontSize (px). Read live via useSetting in TerminalPane; changes apply to open terminals immediately (xterm re-fits and the PTY is resized).
Backend surface
backend/modules/terminal/ owns the PTYs (backend implemented). A
TerminalManager lives per /ws connection and routes the shared socket's
terminal channel, which the frontend (xterm.js) renders. The PTY always runs
where the backend runs — there is no client-side shell in either layout, so
browser and desktop are byte-for-byte identical and agents and humans share the
same terminals. Cross-platform PTYs unify on ptyprocess (POSIX) and pywinpty
(Windows ConPTY), which expose the same PtyProcess.spawn API behind
pty.py; the default shell is PowerShell on Windows. On POSIX it prefers
$SHELL, but that is often unset when the app is launched from a GUI launcher
rather than a terminal (and /bin/bash doesn't exist on every distro, e.g.
NixOS), so it falls back through common shells down to /bin/sh. Each PTY's
blocking read runs in a thread so it never stalls the event loop; sessions are
killed when the socket closes (resume-on-reconnect is a later enhancement). PTY
spawn/IO failures are sent to the pane as an error event and shown inline
([terminal error] …) rather than leaving a blank, dead terminal.
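The shell selection described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the function name, the injected `exists` check, the concrete fallback order, and the `powershell.exe` executable name are all assumptions (the source only says "common shells down to /bin/sh" and "PowerShell on Windows").

```python
import os
import sys
from typing import Callable, Mapping, Sequence

# Hypothetical fallback order; the source does not list the exact shells.
POSIX_FALLBACKS: Sequence[str] = ("/bin/bash", "/bin/zsh", "/bin/sh")


def is_executable(path: str) -> bool:
    """True if path is an existing file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def resolve_default_shell(
    env: Mapping[str, str] = os.environ,
    platform: str = sys.platform,
    exists: Callable[[str], bool] = is_executable,
) -> str:
    """Pick the shell to spawn in a new PTY."""
    if platform.startswith("win"):
        return "powershell.exe"  # assumed executable name
    # $SHELL is often unset when launched from a GUI launcher.
    preferred = env.get("SHELL")
    if preferred and exists(preferred):
        return preferred
    # /bin/bash is not guaranteed to exist (e.g. NixOS), so probe each one.
    for candidate in POSIX_FALLBACKS:
        if exists(candidate):
            return candidate
    return "/bin/sh"  # last resort
```

Passing `env`, `platform`, and `exists` as parameters keeps the lookup testable without touching the real filesystem.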
Channel protocol ({channel:'terminal', event, data}):
| Direction | event | data |
|---|---|---|
| client→server | start / input / resize / kill | {id, …} |
| server→client | started / output / exit / error | {id, …} |
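As a rough sketch of how messages on this channel might be built and routed, assuming JSON text frames on the shared socket: the helper names `make_envelope` and `route` are hypothetical, and because the table elides the payload fields beyond `id`, the sketch treats the rest of `data` as opaque.

```python
import json
from typing import Any, Callable, Dict

CLIENT_EVENTS = {"start", "input", "resize", "kill"}
SERVER_EVENTS = {"started", "output", "exit", "error"}

Handler = Callable[[Dict[str, Any]], None]


def make_envelope(event: str, data: Dict[str, Any]) -> str:
    """Serialize one server-to-client message for the terminal channel."""
    if event not in SERVER_EVENTS:
        raise ValueError(f"unknown server event: {event}")
    if "id" not in data:
        raise ValueError("terminal events must carry the session id")
    return json.dumps({"channel": "terminal", "event": event, "data": data})


def route(raw: str, handlers: Dict[str, Handler]) -> bool:
    """Dispatch one client message.

    Returns False when the message belongs to another channel, so the
    caller can hand it to a different module on the shared socket.
    """
    msg = json.loads(raw)
    if msg.get("channel") != "terminal":
        return False
    event = msg.get("event")
    data = msg.get("data") or {}
    if event not in CLIENT_EVENTS or "id" not in data:
        raise ValueError(f"malformed terminal message: {raw!r}")
    handlers[event](data)
    return True
```

Keying every payload on `id` is what lets one socket carry several terminals at once.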
Agent integration
Status: implemented (D3). Declared on the terminal.instance panel. Verified
live: the agent ran a command through gated terminal.exec, which prompted with
the rendered {command} shell specifier and ran it in a visible terminal on
approval. The terminal exposes agent tools & getAgentContext, and is the most
security-sensitive tool surface in the app:
- Read (ungated): terminal.list() and terminal.read(id) (recent scrollback); the active terminal's cwd via getAgentContext().
- Gated: terminal.exec(id?, command). There is no separate "authority" switch: what the agent may run is purely a function of the permission mode and rules. In plan/default the command is prefilled and prompts before running; in autonomous it runs (subject to deny/ask rules and the rm -rf / circuit breaker). Execution is always visible in a real terminal pane, never hidden.
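The mode-dependent gating for terminal.exec might look roughly like the sketch below. Several things here are assumptions rather than documented behaviour: the function names, the three outcome strings, applying deny rules and the circuit breaker in every mode (the source states this only for autonomous), letting deny take precedence over the read-only allowlist, and the very rough shape of the `rm -rf /` check.

```python
import shlex
from typing import Callable

Predicate = Callable[[str], bool]


def trips_circuit_breaker(command: str) -> bool:
    """Rough stand-in for the `rm -rf /` check (assumed shape)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        tokens = command.split()  # unbalanced quotes: fall back to naive split
    if not tokens or tokens[0] != "rm":
        return False
    # Collect short flags such as -rf, -fr, or -r -f into one string.
    flags = "".join(
        t[1:] for t in tokens[1:] if t.startswith("-") and not t.startswith("--")
    ).lower()
    targets = [t for t in tokens[1:] if not t.startswith("-")]
    return "r" in flags and "f" in flags and "/" in targets


def decide(
    mode: str,
    command: str,
    denied: Predicate,
    ask: Predicate,
    read_only: Predicate,
) -> str:
    """Return 'deny', 'prompt', or 'run' for one terminal.exec request."""
    if trips_circuit_breaker(command) or denied(command):
        return "deny"
    if read_only(command):
        return "run"  # read-only allowlist: no prompt in any mode
    if mode in ("plan", "default"):
        return "prompt"  # command is prefilled, user approves before it runs
    if mode == "autonomous":
        return "prompt" if ask(command) else "run"
    raise ValueError(f"unknown permission mode: {mode}")
```

Whatever the outcome, a 'run' here would still mean running in a visible terminal pane, as the section requires.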
Matching a shell command safely is the bulk of the permission engine's work. The
terminal's specifiers reuse Claude Code's Bash/PowerShell rule logic: * globs
with word boundaries (terminal.exec(npm run *)), compound-command splitting on
&& || ; | (a rule must match each subcommand), wrapper stripping (timeout,
nice, …), and a built-in read-only allowlist (ls, cat, pwd, …) whose commands
run without a prompt in every mode.
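A heavily simplified sketch of the three matching steps named above follows. It is not Claude Code's rule logic and not this module's engine: the splitting ignores quoting, only the two wrappers named in the text are handled (in their simplest forms), the allowlist holds just the three commands the text names, and treating a trailing `*` as "zero or more further words" is an assumption. All function names are hypothetical.

```python
import re
import shlex
from fnmatch import fnmatchcase
from typing import List

READ_ONLY = {"ls", "cat", "pwd"}  # the source's list is longer (elided)
_SEPARATORS = re.compile(r"\s*(?:&&|\|\||;|\|)\s*")


def split_compound(command: str) -> List[str]:
    """Split on && || ; | (naive: does not respect quotes)."""
    return [p for p in _SEPARATORS.split(command.strip()) if p]


def strip_wrappers(tokens: List[str]) -> List[str]:
    """Drop leading `timeout DURATION` and `nice [-n ADJ]` prefixes."""
    while tokens:
        if tokens[0] == "timeout" and len(tokens) >= 3 and not tokens[1].startswith("-"):
            tokens = tokens[2:]
        elif tokens[0] == "nice" and len(tokens) >= 4 and tokens[1] == "-n":
            tokens = tokens[3:]
        elif tokens[0] == "nice" and len(tokens) >= 2 and not tokens[1].startswith("-"):
            tokens = tokens[1:]
        else:
            break  # unknown form: leave it so no rule matches by accident
    return tokens


def rule_matches(rule: str, tokens: List[str]) -> bool:
    """Word-by-word glob match, so `npm run *` cannot match `npm runx`."""
    pattern = rule.split()
    if pattern and pattern[-1] == "*":
        pattern = pattern[:-1]
        if len(tokens) < len(pattern):
            return False
        tokens = tokens[: len(pattern)]
    elif len(pattern) != len(tokens):
        return False
    return all(fnmatchcase(t, p) for t, p in zip(tokens, pattern))


def command_allowed(command: str, rules: List[str]) -> bool:
    """True only if every subcommand is read-only or matches some rule."""
    for part in split_compound(command):
        try:
            tokens = strip_wrappers(shlex.split(part))
        except ValueError:
            return False  # e.g. a separator inside quotes broke the split
        if not tokens:
            return False
        if tokens[0] in READ_ONLY:
            continue
        if not any(rule_matches(rule, tokens) for rule in rules):
            return False
    return True
```

With the specifier `npm run *`, `timeout 30 npm run test && ls` passes because each subcommand is covered, while `npm run build && rm -rf dist` fails on its second subcommand.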
Browser vs desktop
| Concern | Browser | Desktop |
|---|---|---|
| Where the shell runs | on the backend host (local dev: your machine; remote backend: the server — make this visible in the panel title) | localhost backend = your machine |
| Shell defaults | backend host's default shell | same (PowerShell on Windows) |
| Copy/paste, scrollback, themes | identical (xterm.js) | identical |
| Keybinding capture | browser reserves some chords (e.g. Ctrl+W) — the shell keybinding service must not bind those for terminal focus | full capture available |
Security note: exposing the backend beyond localhost exposes shell execution on that host. Any future remote-access story must address auth at the backend boundary, not in this module.