# Browser Automation CLI Reference Technical reference for the `browse` CLI tool. ## Table of Contents - [Architecture](#architecture) - [Command Reference](#command-reference) - [Navigation](#navigation) - [Page State](#page-state) - [Interaction](#interaction) - [Session Management](#session-management) - [JavaScript Evaluation](#javascript-evaluation) - [Viewport](#viewport) - [Network Capture](#network-capture) - [Configuration](#configuration) - [Global Flags](#global-flags) - [Environment Variables](#environment-variables) - [Error Messages](#error-messages) ## Architecture The browse CLI is a **daemon-based** command-line tool: - **Daemon process**: A background process manages the browser instance. Auto-starts on the first command (e.g., `browse open`), persists across commands, and stops with `browse stop`. - **Local mode** (default): Launches a local Chrome/Chromium instance. - **Remote mode** (Browserbase): Connects to a Browserbase cloud browser session when `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID` are set. - **Accessibility-first**: Use `browse snapshot` to get the page's accessibility tree with element refs, then interact using those refs. ## Command Reference ### Navigation #### `open ` Navigate to a URL. Alias: `goto`. Auto-starts the daemon if not running. ```bash browse open https://example.com browse open https://example.com --wait networkidle # wait for all network requests to finish (useful for SPAs) browse open https://example.com --wait domcontentloaded ``` The `--wait` flag controls when navigation is considered complete. Values: `load` (default), `domcontentloaded`, `networkidle`. Use `networkidle` for JavaScript-heavy pages that fetch data after initial load. #### `reload` Reload the current page. ```bash browse reload ``` #### `back` / `forward` Navigate browser history. ```bash browse back browse forward ``` --- ### Page State #### `snapshot` Get the accessibility tree with interactive element refs. This is the primary way to understand page structure. ```bash browse snapshot browse snapshot --compact # tree only, no ref maps ``` Returns a text representation of the page with refs like `@0-5` that can be passed to `click`. Use `--compact` for shorter output when you only need the tree. #### `screenshot [path]` Take a visual screenshot. Slower than snapshot and uses vision tokens. ```bash browse screenshot # auto-generated path browse screenshot ./capture.png # custom path browse screenshot --full-page # capture entire scrollable page ``` #### `get [selector]` Get page properties. Available properties: `url`, `title`, `text`, `html`, `value`, `box`, `visible`, `checked`. ```bash browse get url # current URL browse get title # page title browse get text "body" # all visible text (selector required) browse get text ".product-info" # text within a CSS selector browse get html "#main" # inner HTML of an element browse get value "#email-input" # value of a form field browse get box "#header" # bounding box (centroid coordinates) browse get visible ".modal" # check if element is visible browse get checked "#agree" # check if checkbox/radio is checked ``` **Note**: `get text` requires a CSS selector argument — use `"body"` for full page text. #### `refs` Show the cached ref map from the last `browse snapshot`. Useful for looking up element refs without re-running a full snapshot. ```bash browse refs ``` --- ### Interaction #### `click ` Click an element by its ref from `browse snapshot` output. ```bash browse click @0-5 # click element with ref 0-5 ``` #### `click_xy ` Click at exact viewport coordinates. ```bash browse click_xy 500 300 ``` #### `hover ` Hover at viewport coordinates. ```bash browse hover 500 300 ``` #### `type ` Type text into the currently focused element. ```bash browse type "Hello, world!" browse type "slow typing" --delay 100 # 100ms between keystrokes browse type "human-like" --mistakes # simulate human typing with typos ``` #### `fill ` Fill an input element matching a CSS selector and press Enter. ```bash browse fill "#search" "browser automation" browse fill "input[name=email]" "user@example.com" browse fill "#search" "query" --no-press-enter # fill without pressing Enter ``` #### `select ` Select option(s) from a dropdown. ```bash browse select "#country" "United States" browse select "#tags" "javascript" "typescript" # multi-select ``` #### `press ` Press a keyboard key or key combination. ```bash browse press Enter browse press Tab browse press Escape browse press Cmd+A # select all (Mac) browse press Ctrl+C # copy (Linux/Windows) ``` #### `scroll ` Scroll at a given position by a given amount. ```bash browse scroll 500 300 0 -300 # scroll up at (500, 300) browse scroll 500 300 0 500 # scroll down ``` #### `drag ` Drag from one viewport coordinate to another. ```bash browse drag 80 80 310 100 # drag with default 10 steps browse drag 80 80 310 100 --steps 20 # more intermediate steps browse drag 80 80 310 100 --delay 50 # 50ms between steps browse drag 80 80 310 100 --button right # use right mouse button browse drag 80 80 310 100 --xpath # return source/target XPaths ``` #### `highlight ` Highlight an element on the page for visual debugging. ```bash browse highlight "#submit-btn" # highlight for 2 seconds (default) browse highlight ".nav" -d 5000 # highlight for 5 seconds ``` #### `is ` Check element state. Available checks: `visible`, `checked`. ```bash browse is visible ".modal" # returns { visible: true/false } browse is checked "#agree" # returns { checked: true/false } ``` #### `wait [arg]` Wait for a condition. ```bash browse wait load # wait for page load browse wait "selector" ".results" # wait for element to appear browse wait timeout 3000 # wait 3 seconds ``` --- ### Session Management #### `start` Start the browser daemon manually. Usually not needed — the daemon auto-starts on first command. ```bash browse start ``` #### `stop` Stop the browser daemon and close the browser. ```bash browse stop browse stop --force # force kill if daemon is unresponsive ``` #### `status` Check whether the daemon is running, its connection details, and current environment. ```bash browse status ``` #### `env [local|remote]` Show or switch the browser environment. Without arguments, prints the current environment. With an argument, stops the running daemon and restarts in the specified environment. The switch is sticky — subsequent commands stay in the chosen environment until you switch again or run `browse stop`. ```bash browse env # print current environment browse env local # switch to local Chrome browse env remote # switch to Browserbase (requires API keys) ``` #### `newpage [url]` Create a new tab, optionally navigating to a URL. ```bash browse newpage # open blank tab browse newpage https://example.com # open tab with URL ``` #### `pages` List all open tabs. ```bash browse pages ``` #### `tab_switch ` Switch to a tab by its index (from `browse pages`). ```bash browse tab_switch 1 ``` #### `tab_close [index]` Close a tab. Closes current tab if no index given. ```bash browse tab_close # close current tab browse tab_close 2 # close tab at index 2 ``` --- ### JavaScript Evaluation #### `eval ` Evaluate JavaScript in the page context. ```bash browse eval "document.title" browse eval "document.querySelectorAll('a').length" ``` --- ### Viewport #### `viewport ` Set the browser viewport size. ```bash browse viewport 1920 1080 ``` --- ### Network Capture Capture network requests to the filesystem for inspection. #### `network on` Enable network request capture. Creates a temp directory where requests and responses are saved as JSON files. ```bash browse network on ``` #### `network off` Disable network capture. ```bash browse network off ``` #### `network path` Show the capture directory path. ```bash browse network path ``` #### `network clear` Clear all captured requests. ```bash browse network clear ``` --- ## Configuration ### Global Flags #### `--json` Output as JSON for all commands. Useful for structured, parseable output. ```bash browse --json get url # returns {"url": "https://..."} browse --json snapshot # returns JSON accessibility tree ``` #### `--session ` Run commands against a named session, enabling multiple concurrent browsers. ```bash browse --session work open https://a.com browse --session personal open https://b.com ``` ### Environment Variables | Variable | Required | Description | |----------|----------|-------------| | `BROWSERBASE_API_KEY` | For remote mode | API key from https://browserbase.com/settings | | `BROWSERBASE_PROJECT_ID` | For remote mode | Project ID from Browserbase dashboard | When both are set, the CLI uses Browserbase remote sessions. Otherwise, it falls back to local Chrome. ### Setting credentials ```bash export BROWSERBASE_API_KEY="bb_live_..." export BROWSERBASE_PROJECT_ID="proj_..." ``` Get these values from https://browserbase.com/settings. --- ## Error Messages **"No active page"** - The daemon is running but has no page open. - Fix: Run `browse open `. If the issue persists, run `browse stop` and retry. For zombie daemons: `pkill -f "browse.*daemon"`. **"Chrome not found"** / **"Could not find local Chrome installation"** - Chrome/Chromium is not installed or not in a standard location. - Fix: Install Chrome, or switch to remote with `browse env remote` (no local browser needed). **"Daemon not running"** - No daemon process is active. Most commands auto-start the daemon, but `snapshot`, `click`, etc. require an active session. - Fix: Run `browse open ` to start a session. **Element ref not found (e.g., "@0-5")** - The ref from a previous snapshot is no longer valid (page changed). - Fix: Run `browse snapshot` again to get fresh refs. **Timeout errors** - The page took too long to load or an element didn't appear. - Fix: Try `browse wait load` before interacting, or increase wait time.