Initial commit with translated description
109
EXAMPLES.md
Normal file
@@ -0,0 +1,109 @@
# Browser Automation Examples

Common browser automation workflows using the `browse` CLI. Each example demonstrates a distinct pattern using real commands.

## Example 1: Extract Data from a Page

**User request**: "Get the product details from example.com/product/123"

```bash
browse open https://example.com/product/123
browse snapshot  # read page structure + element refs
browse get text "body"  # extract all visible text content
browse stop
```

Parse the text output to extract structured data (name, price, description, etc.).

For a specific section, use a CSS selector:

```bash
browse get text ".product-details"  # text from a specific container
```

**Note**: `browse get text` requires a CSS selector — use `"body"` for all page text.
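
One way to parse the text output is a small shell filter. The sketch below assumes the product text contains a US-dollar price such as `$19.99` and that the CLI prints the element text to stdout; `extract_price` is a hypothetical helper, not part of the `browse` CLI.

```bash
# Hypothetical helper (not part of browse): print the first
# dollar amount found in the text read from stdin.
extract_price() {
  grep -Eo '\$[0-9]+(\.[0-9]{2})?' | head -n 1
}

# Intended use, assuming the element text goes to stdout:
#   browse get text ".product-details" | extract_price
```

The same idea extends to other fields: one small filter per field keeps each pattern easy to adjust when the page layout changes.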

## Example 2: Fill and Submit a Form

**User request**: "Fill out the contact form on example.com with my information"

```bash
browse open https://example.com/contact
browse snapshot  # find form fields and their refs
browse click @0-3  # click the Name input (ref from snapshot)
browse type "John Doe"
browse press Tab  # move to next field
browse type "john@example.com"
browse fill "#message" "I would like to inquire about your services" --no-press-enter  # fill without pressing Enter
browse snapshot  # verify fields are filled
browse click @0-8  # click Submit button (ref from snapshot)
browse snapshot  # confirm submission result
browse stop
```

**Key pattern**: Use `browse snapshot` before interacting to discover element refs, then `browse click <ref>` and `browse type` to interact.
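
The click-then-type pair can be wrapped in a small function. This is a minimal sketch: `fill_by_ref` is a hypothetical helper, and the refs in the usage comment are placeholders that must come from a fresh `browse snapshot`.

```bash
# Hypothetical wrapper: focus a field by its snapshot ref,
# then type into the focused element.
fill_by_ref() {
  ref=$1
  text=$2
  browse click "$ref" && browse type "$text"
}

# Intended use (refs are placeholders from a fresh snapshot):
#   fill_by_ref @0-3 "John Doe"
#   fill_by_ref @0-4 "john@example.com"
```

Typing is skipped if the click fails, so a stale ref does not send text to the wrong element.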

## Example 3: Multi-Step Navigation

**User request**: "Get headlines from the first 3 pages of results on example.com/news"

```bash
browse open https://example.com/news
browse snapshot  # read page 1 content
browse get text ".headline"  # extract headlines

browse snapshot  # find "Next" button ref
browse click @0-12  # click Next (ref from snapshot)
browse wait load  # wait for page 2 to load
browse get text ".headline"  # extract page 2 headlines

browse snapshot  # find Next again (ref may change)
browse click @0-15  # click Next
browse wait load
browse get text ".headline"  # extract page 3 headlines

browse stop
```

**Key pattern**: Re-run `browse snapshot` after each navigation because element refs change when the page updates.
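
The re-snapshot pattern can be sketched as a loop that looks up the "Next" ref again on every page. The snapshot line format used by `find_ref` (a label and an `@N-M` ref on the same line) is an assumption, and both function names are hypothetical; adjust the pattern to the output you actually see.

```bash
# Hypothetical helper: print the first ref (e.g. @0-12) found on a
# snapshot line mentioning the given label. Reads snapshot text on stdin.
# ASSUMPTION: label and ref appear on the same line of the snapshot.
find_ref() {
  grep -i -- "$1" | grep -Eo '@[0-9]+-[0-9]+' | head -n 1
}

# Print headlines from up to $1 pages, re-resolving "Next" each time.
scrape_headlines() {
  pages=$1
  page=1
  while [ "$page" -le "$pages" ]; do
    browse get text ".headline"
    [ "$page" -eq "$pages" ] && break
    next=$(browse snapshot | find_ref "Next")
    [ -n "$next" ] || break  # no Next button found: stop early
    browse click "$next"
    browse wait load
    page=$((page + 1))
  done
}

# Intended use:
#   browse open https://example.com/news
#   scrape_headlines 3
#   browse stop
```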

## Example 4: Escalate to Remote Mode

**User request**: "Scrape pricing from competitor.com" (a site with Cloudflare protection)

```bash
# Attempt 1: local mode
browse open https://competitor.com/pricing
browse snapshot
# Output shows: "Checking your browser..." (Cloudflare interstitial)
# or: page content is empty / access denied
browse stop
```

The agent detects bot protection and tells the user:

> This site has Cloudflare bot detection. Browserbase remote mode can bypass this with anti-bot stealth and residential proxies. Want me to set it up?

If the user agrees:

```bash
# Set Browserbase credentials
export BROWSERBASE_API_KEY="bb_live_..."
export BROWSERBASE_PROJECT_ID="proj_..."

# Retry in remote mode
browse env remote
browse open https://competitor.com/pricing
browse snapshot  # full page content now accessible
browse get text ".pricing-table"
browse stop
```
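
The detection step could be approximated with a text check like the one below. This is a rough sketch: `looks_blocked` is a hypothetical helper, and the phrases it matches are only the signals named in this example, so real interstitials may need more patterns.

```bash
# Hypothetical check: exit 0 when the page text read from stdin is empty
# or looks like a bot-protection interstitial, non-zero otherwise.
looks_blocked() {
  text=$(cat)
  [ -z "$text" ] && return 0
  printf '%s\n' "$text" | grep -Eiq 'checking your browser|access denied'
}

# Intended use:
#   if browse get text "body" | looks_blocked; then
#     echo "Bot protection suspected; consider: browse env remote"
#   fi
```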

## Tips

- **Snapshot first**: Always run `browse snapshot` before interacting — it gives you the accessibility tree with element refs
- **Use refs to click**: `browse click @0-5` is more reliable than trying to describe elements
- **Re-snapshot after actions**: Element refs change when the page updates
- **`get text` for data extraction**: Use `browse get text <selector>` to pull text content from specific elements (the selector is required)
- **`stop` when done**: Always `browse stop` to clean up the browser session
- **Prefer snapshot over screenshot**: Snapshot is fast and structured; screenshot is slow and uses vision tokens. Only screenshot when you need visual context (layout, images, debugging)
21
LICENSE.txt
Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 Browserbase, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
432
REFERENCE.md
Normal file
@@ -0,0 +1,432 @@
# Browser Automation CLI Reference

Technical reference for the `browse` CLI tool.

## Table of Contents

- [Architecture](#architecture)
- [Command Reference](#command-reference)
  - [Navigation](#navigation)
  - [Page State](#page-state)
  - [Interaction](#interaction)
  - [Session Management](#session-management)
  - [JavaScript Evaluation](#javascript-evaluation)
  - [Viewport](#viewport)
  - [Network Capture](#network-capture)
- [Configuration](#configuration)
  - [Global Flags](#global-flags)
  - [Environment Variables](#environment-variables)
- [Error Messages](#error-messages)

## Architecture

The browse CLI is a **daemon-based** command-line tool:

- **Daemon process**: A background process manages the browser instance. Auto-starts on the first command (e.g., `browse open`), persists across commands, and stops with `browse stop`.
- **Local mode** (default): Launches a local Chrome/Chromium instance.
- **Remote mode** (Browserbase): Connects to a Browserbase cloud browser session when `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID` are set.
- **Accessibility-first**: Use `browse snapshot` to get the page's accessibility tree with element refs, then interact using those refs.

## Command Reference

### Navigation

#### `open <url>`

Navigate to a URL. Alias: `goto`. Auto-starts the daemon if not running.

```bash
browse open https://example.com
browse open https://example.com --wait networkidle  # wait for all network requests to finish (useful for SPAs)
browse open https://example.com --wait domcontentloaded
```

The `--wait` flag controls when navigation is considered complete. Values: `load` (default), `domcontentloaded`, `networkidle`. Use `networkidle` for JavaScript-heavy pages that fetch data after initial load.

#### `reload`

Reload the current page.

```bash
browse reload
```

#### `back` / `forward`

Navigate browser history.

```bash
browse back
browse forward
```

---

### Page State

#### `snapshot`

Get the accessibility tree with interactive element refs. This is the primary way to understand page structure.

```bash
browse snapshot
browse snapshot --compact  # tree only, no ref maps
```

Returns a text representation of the page with refs like `@0-5` that can be passed to `click`. Use `--compact` for shorter output when you only need the tree.

#### `screenshot [path]`

Take a visual screenshot. Slower than snapshot and uses vision tokens.

```bash
browse screenshot  # auto-generated path
browse screenshot ./capture.png  # custom path
browse screenshot --full-page  # capture entire scrollable page
```

#### `get <property> [selector]`

Get page properties. Available properties: `url`, `title`, `text`, `html`, `value`, `box`, `visible`, `checked`.

```bash
browse get url  # current URL
browse get title  # page title
browse get text "body"  # all visible text (selector required)
browse get text ".product-info"  # text within a CSS selector
browse get html "#main"  # inner HTML of an element
browse get value "#email-input"  # value of a form field
browse get box "#header"  # bounding box (centroid coordinates)
browse get visible ".modal"  # check if element is visible
browse get checked "#agree"  # check if checkbox/radio is checked
```

**Note**: `get text` requires a CSS selector argument — use `"body"` for full page text.

#### `refs`

Show the cached ref map from the last `browse snapshot`. Useful for looking up element refs without re-running a full snapshot.

```bash
browse refs
```

---

### Interaction

#### `click <ref>`

Click an element by its ref from `browse snapshot` output.

```bash
browse click @0-5  # click element with ref 0-5
```

#### `click_xy <x> <y>`

Click at exact viewport coordinates.

```bash
browse click_xy 500 300
```

#### `hover <x> <y>`

Hover at viewport coordinates.

```bash
browse hover 500 300
```

#### `type <text>`

Type text into the currently focused element.

```bash
browse type "Hello, world!"
browse type "slow typing" --delay 100  # 100ms between keystrokes
browse type "human-like" --mistakes  # simulate human typing with typos
```

#### `fill <selector> <value>`

Fill an input element matching a CSS selector and press Enter.

```bash
browse fill "#search" "browser automation"
browse fill "input[name=email]" "user@example.com"
browse fill "#search" "query" --no-press-enter  # fill without pressing Enter
```

#### `select <selector> <values...>`

Select option(s) from a dropdown.

```bash
browse select "#country" "United States"
browse select "#tags" "javascript" "typescript"  # multi-select
```

#### `press <key>`

Press a keyboard key or key combination.

```bash
browse press Enter
browse press Tab
browse press Escape
browse press Cmd+A  # select all (Mac)
browse press Ctrl+C  # copy (Linux/Windows)
```

#### `scroll <x> <y> <deltaX> <deltaY>`

Scroll at a given position by a given amount.

```bash
browse scroll 500 300 0 -300  # scroll up at (500, 300)
browse scroll 500 300 0 500  # scroll down
```

#### `drag <fromX> <fromY> <toX> <toY>`

Drag from one viewport coordinate to another.

```bash
browse drag 80 80 310 100  # drag with default 10 steps
browse drag 80 80 310 100 --steps 20  # more intermediate steps
browse drag 80 80 310 100 --delay 50  # 50ms between steps
browse drag 80 80 310 100 --button right  # use right mouse button
browse drag 80 80 310 100 --xpath  # return source/target XPaths
```

#### `highlight <selector>`

Highlight an element on the page for visual debugging.

```bash
browse highlight "#submit-btn"  # highlight for 2 seconds (default)
browse highlight ".nav" -d 5000  # highlight for 5 seconds
```

#### `is <check> <selector>`

Check element state. Available checks: `visible`, `checked`.

```bash
browse is visible ".modal"  # returns { visible: true/false }
browse is checked "#agree"  # returns { checked: true/false }
```

#### `wait <type> [arg]`

Wait for a condition.

```bash
browse wait load  # wait for page load
browse wait "selector" ".results"  # wait for element to appear
browse wait timeout 3000  # wait 3 seconds
```

---

### Session Management

#### `start`

Start the browser daemon manually. Usually not needed — the daemon auto-starts on first command.

```bash
browse start
```

#### `stop`

Stop the browser daemon and close the browser.

```bash
browse stop
browse stop --force  # force kill if daemon is unresponsive
```

#### `status`

Check whether the daemon is running, its connection details, and current environment.

```bash
browse status
```

#### `env [local|remote]`

Show or switch the browser environment. Without arguments, prints the current environment. With an argument, stops the running daemon and restarts in the specified environment. The switch is sticky — subsequent commands stay in the chosen environment until you switch again or run `browse stop`.

```bash
browse env  # print current environment
browse env local  # switch to local Chrome
browse env remote  # switch to Browserbase (requires API keys)
```

#### `newpage [url]`

Create a new tab, optionally navigating to a URL.

```bash
browse newpage  # open blank tab
browse newpage https://example.com  # open tab with URL
```

#### `pages`

List all open tabs.

```bash
browse pages
```

#### `tab_switch <index>`

Switch to a tab by its index (from `browse pages`).

```bash
browse tab_switch 1
```

#### `tab_close [index]`

Close a tab. Closes current tab if no index given.

```bash
browse tab_close  # close current tab
browse tab_close 2  # close tab at index 2
```
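
The tab commands compose naturally in a loop. A minimal sketch, using only the commands documented above; `open_tabs` is a hypothetical helper name.

```bash
# Hypothetical helper: open each URL in its own tab, then list the tabs
# so the indexes for tab_switch / tab_close can be read off.
open_tabs() {
  for url in "$@"; do
    browse newpage "$url" || return 1
  done
  browse pages
}

# Intended use:
#   open_tabs https://example.com https://example.org
#   browse tab_switch 1
```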

---

### JavaScript Evaluation

#### `eval <expression>`

Evaluate JavaScript in the page context.

```bash
browse eval "document.title"
browse eval "document.querySelectorAll('a').length"
```

---

### Viewport

#### `viewport <width> <height>`

Set the browser viewport size.

```bash
browse viewport 1920 1080
```

---

### Network Capture

Capture network requests to the filesystem for inspection.

#### `network on`

Enable network request capture. Creates a temp directory where requests and responses are saved as JSON files.

```bash
browse network on
```

#### `network off`

Disable network capture.

```bash
browse network off
```

#### `network path`

Show the capture directory path.

```bash
browse network path
```

#### `network clear`

Clear all captured requests.

```bash
browse network clear
```
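
Because captures are plain files, ordinary shell tools can inspect them. The sketch below assumes the captured files carry a `.json` extension and that `browse network path` prints only the directory; `count_captures` is a hypothetical helper.

```bash
# Hypothetical helper: count the JSON files under a capture directory.
# ASSUMPTION: captured requests/responses are stored as *.json files.
count_captures() {
  find "$1" -type f -name '*.json' | wc -l | tr -d ' '
}

# Intended use, assuming `browse network path` prints only the directory:
#   browse network on
#   browse open https://example.com
#   count_captures "$(browse network path)"
#   browse network off
```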

---

## Configuration

### Global Flags

#### `--json`

Output as JSON for all commands. Useful for structured, parseable output.

```bash
browse --json get url  # returns {"url": "https://..."}
browse --json snapshot  # returns JSON accessibility tree
```

#### `--session <name>`

Run commands against a named session, enabling multiple concurrent browsers.

```bash
browse --session work open https://a.com
browse --session personal open https://b.com
```
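
Named sessions make it possible to drive two browsers side by side. The sketch below runs two sessions as background jobs; the session names `a` and `b` are arbitrary, `titles_in_parallel` is a hypothetical helper, and it assumes `stop` honors `--session` like other commands.

```bash
# Sketch: fetch page titles from two named sessions in parallel.
# ASSUMPTION: `browse --session <name> stop` stops only that session.
titles_in_parallel() {
  ( browse --session a open "$1" && browse --session a get title ) &
  pid_a=$!
  ( browse --session b open "$2" && browse --session b get title ) &
  pid_b=$!
  wait "$pid_a"
  wait "$pid_b"
  browse --session a stop
  browse --session b stop
}

# Intended use:
#   titles_in_parallel https://a.com https://b.com
```

Output from the two jobs may interleave, so redirect each job to its own file if the order matters.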

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `BROWSERBASE_API_KEY` | For remote mode | API key from https://browserbase.com/settings |
| `BROWSERBASE_PROJECT_ID` | For remote mode | Project ID from Browserbase dashboard |

When both are set, the CLI uses Browserbase remote sessions. Otherwise, it falls back to local Chrome.

### Setting credentials

```bash
export BROWSERBASE_API_KEY="bb_live_..."
export BROWSERBASE_PROJECT_ID="proj_..."
```

Get these values from https://browserbase.com/settings.
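
The selection rule above can be mirrored in a script before calling the CLI. This is a sketch of the documented rule, not the CLI's actual implementation; `pick_env` is a hypothetical helper, and treating empty values as unset is an assumption.

```bash
# Hypothetical helper: print "remote" when both Browserbase variables
# are set and non-empty, otherwise "local".
pick_env() {
  if [ -n "${BROWSERBASE_API_KEY:-}" ] && [ -n "${BROWSERBASE_PROJECT_ID:-}" ]; then
    echo remote
  else
    echo local
  fi
}

# Intended use:
#   browse env "$(pick_env)"
```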

---

## Error Messages

**"No active page"**
- The daemon is running but has no page open.
- Fix: Run `browse open <url>`. If the issue persists, run `browse stop` and retry. For zombie daemons: `pkill -f "browse.*daemon"`.

**"Chrome not found"** / **"Could not find local Chrome installation"**
- Chrome/Chromium is not installed or not in a standard location.
- Fix: Install Chrome, or switch to remote with `browse env remote` (no local browser needed).

**"Daemon not running"**
- No daemon process is active. Most commands auto-start the daemon, but `snapshot`, `click`, etc. require an active session.
- Fix: Run `browse open <url>` to start a session.

**Element ref not found (e.g., "@0-5")**
- The ref from a previous snapshot is no longer valid (page changed).
- Fix: Run `browse snapshot` again to get fresh refs.

**Timeout errors**
- The page took too long to load or an element didn't appear.
- Fix: Try `browse wait load` before interacting, or increase wait time.
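
One way to give slow pages more time is a generic retry wrapper. The sketch below is plain shell and independent of the CLI; `retry` is a hypothetical helper, and the usage comment assumes `browse wait` exits non-zero when it times out.

```bash
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; succeeds as soon as one attempt succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Intended use: up to 5 attempts, 2 seconds apart
# (ASSUMPTION: the wait command exits non-zero on timeout):
#   retry 5 2 browse wait selector ".results"
```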
161
SKILL.md
Normal file
@@ -0,0 +1,161 @@
---
name: browser
description: "Automate web browser interactions using natural language via CLI commands."
compatibility: "Requires the browse CLI (`npm install -g @browserbasehq/browse-cli`). Optional: set BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID for remote Browserbase sessions; falls back to local Chrome otherwise."
license: MIT
allowed-tools: Bash
metadata:
  openclaw:
    requires:
      bins:
        - browse
    install:
      - kind: node
        package: "@browserbasehq/browse-cli"
        bins: [browse]
    homepage: https://github.com/browserbase/skills
---

# Browser Automation

Automate browser interactions using the browse CLI with Claude.

## Setup check

Before running any browser commands, verify the CLI is available:

```bash
which browse || npm install -g @browserbasehq/browse-cli
```

## Environment Selection (Local vs Remote)

The CLI automatically selects between local and remote browser environments based on available configuration:

### Local mode (default)
- Uses local Chrome — no API keys needed
- Best for: development, simple pages, trusted sites with no bot protection

### Remote mode (Browserbase)
- Activated when `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID` are set
- Provides: anti-bot stealth, automatic CAPTCHA solving, residential proxies, session persistence
- **Use remote mode when:** the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
- Get credentials at https://browserbase.com/settings

### When to choose which
- **Simple browsing** (docs, wikis, public APIs): local mode is fine
- **Protected sites** (login walls, CAPTCHAs, anti-scraping): use remote mode
- **If local mode fails** with bot detection or access denied: switch to remote mode

## Commands

All commands work identically in both modes. The daemon auto-starts on first command.

### Navigation
```bash
browse open <url>  # Go to URL (alias: goto)
browse reload  # Reload current page
browse back  # Go back in history
browse forward  # Go forward in history
```

### Page state (prefer snapshot over screenshot)
```bash
browse snapshot  # Get accessibility tree with element refs (fast, structured)
browse screenshot [path]  # Take visual screenshot (slow, uses vision tokens)
browse get url  # Get current URL
browse get title  # Get page title
browse get text <selector>  # Get text content (use "body" for all text)
browse get html <selector>  # Get HTML content of element
browse get value <selector>  # Get form field value
```

Use `browse snapshot` as your default for understanding page state — it returns the accessibility tree with element refs you can use to interact. Only use `browse screenshot` when you need visual context (layout, images, debugging).

### Interaction
```bash
browse click <ref>  # Click element by ref from snapshot (e.g., @0-5)
browse type <text>  # Type text into focused element
browse fill <selector> <value>  # Fill input and press Enter
browse select <selector> <values...>  # Select dropdown option(s)
browse press <key>  # Press key (Enter, Tab, Escape, Cmd+A, etc.)
browse drag <fromX> <fromY> <toX> <toY>  # Drag from one point to another
browse scroll <x> <y> <deltaX> <deltaY>  # Scroll at coordinates
browse highlight <selector>  # Highlight element on page
browse is visible <selector>  # Check if element is visible
browse is checked <selector>  # Check if element is checked
browse wait <type> [arg]  # Wait for: load, selector, timeout
```

### Session management
```bash
browse stop  # Stop the browser daemon
browse status  # Check daemon status (includes env)
browse env  # Show current environment (local or remote)
browse env local  # Switch to local Chrome
browse env remote  # Switch to Browserbase (requires API keys)
browse pages  # List all open tabs
browse tab_switch <index>  # Switch to tab by index
browse tab_close [index]  # Close tab
```

### Typical workflow
1. `browse open <url>` — navigate to the page
2. `browse snapshot` — read the accessibility tree to understand page structure and get element refs
3. `browse click <ref>` / `browse type <text>` / `browse fill <selector> <value>` — interact using refs from snapshot
4. `browse snapshot` — confirm the action worked
5. Repeat 3-4 as needed
6. `browse stop` — close the browser when done
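
The open, snapshot, stop skeleton of this workflow can be wrapped so the daemon is stopped even when a step fails. A minimal sketch; `snapshot_once` is a hypothetical helper, and the interaction steps (3 to 5) are left out.

```bash
# Sketch: open a page, print one snapshot, and always stop the daemon.
# Returns the status of the snapshot step (or 1 if the page fails to open).
snapshot_once() {
  url=$1
  if ! browse open "$url"; then
    browse stop
    return 1
  fi
  status=0
  browse snapshot || status=$?
  browse stop
  return "$status"
}

# Intended use:
#   snapshot_once https://example.com
```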

## Quick Example

```bash
browse open https://example.com
browse snapshot  # see page structure + element refs
browse click @0-5  # click element with ref 0-5
browse get title
browse stop
```

## Mode Comparison

| Feature | Local | Browserbase |
|---------|-------|-------------|
| Speed | Faster | Slightly slower |
| Setup | Chrome required | API key required |
| Stealth mode | No | Yes (custom Chromium, anti-bot fingerprinting) |
| CAPTCHA solving | No | Yes (automatic reCAPTCHA/hCaptcha) |
| Residential proxies | No | Yes (201 countries, geo-targeting) |
| Session persistence | No | Yes (cookies/auth persist across sessions) |
| Best for | Development/simple pages | Protected sites, bot detection, production scraping |

## Best Practices

1. **Always `browse open` first** before interacting
2. **Use `browse snapshot`** to check page state — it's fast and gives you element refs
3. **Only screenshot when visual context is needed** (layout checks, images, debugging)
4. **Use refs from snapshot** to click/interact — e.g., `browse click @0-5`
5. **`browse stop`** when done to clean up the browser session

## Troubleshooting

- **"No active page"**: Run `browse stop`, then check `browse status`. If it still says running, kill the zombie daemon with `pkill -f "browse.*daemon"`, then retry `browse open`
- **Chrome not found**: Install Chrome or use `browse env remote`
- **Action fails**: Run `browse snapshot` to see available elements and their refs
- **Browserbase fails**: Verify API key and project ID are set

## Switching to Remote Mode

Switch to remote when you detect: CAPTCHAs (reCAPTCHA, hCaptcha, Turnstile), bot detection pages ("Checking your browser..."), HTTP 403/429, empty pages on sites that should have content, or the user asks for it.

Don't switch for simple sites (docs, wikis, public APIs, localhost).

```bash
browse env remote  # switch to Browserbase
browse env local  # switch back to local Chrome
```

The switch is sticky until you run `browse stop` or switch again.

For detailed examples, see [EXAMPLES.md](EXAMPLES.md).
For API reference, see [REFERENCE.md](REFERENCE.md).
6
_meta.json
Normal file
@@ -0,0 +1,6 @@
{
  "ownerId": "kn7f3h94x6dsndkdjph76br4pd803szg",
  "slug": "browse",
  "version": "2.0.2",
  "publishedAt": 1772680539406
}