commit adc00017d758ccef5260ca34f5c0c5e683b60ca3 Author: zlei9 Date: Sun Mar 29 13:22:23 2026 +0800 Initial commit with translated description diff --git a/SKILL.md b/SKILL.md new file mode 100644 index 0000000..104df8f --- /dev/null +++ b/SKILL.md @@ -0,0 +1,202 @@ +--- +name: browser-use +description: "自动化浏览器交互以进行网页测试、表单填写、屏幕截图和数据提取。" +allowed-tools: Bash(browser-use:*) +--- + +# Browser Automation with browser-use CLI + +The `browser-use` command provides fast, persistent browser automation. A background daemon keeps the browser open across commands, giving ~50ms latency per call. + +## Prerequisites + +```bash +browser-use doctor # Verify installation +``` + +For setup details, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md + +## Core Workflow + +1. **Navigate**: `browser-use open ` — starts browser if needed +2. **Inspect**: `browser-use state` — returns clickable elements with indices +3. **Interact**: use indices from state (`browser-use click 5`, `browser-use input 3 "text"`) +4. **Verify**: `browser-use state` or `browser-use screenshot` to confirm +5. **Repeat**: browser stays open between commands +6. **Cleanup**: `browser-use close` when done + +## Browser Modes + +```bash +browser-use open # Default: headless Chromium +browser-use --headed open # Visible window +browser-use --profile "Default" open # Real Chrome with Default profile (existing logins/cookies) +browser-use --profile "Profile 1" open # Real Chrome with named profile +browser-use --connect open # Auto-discover running Chrome via CDP +browser-use --cdp-url ws://localhost:9222/... open # Connect via CDP URL +``` + +`--connect`, `--cdp-url`, and `--profile` are mutually exclusive. + +## Commands + +```bash +# Navigation +browser-use open # Navigate to URL +browser-use back # Go back in history +browser-use scroll down # Scroll down (--amount N for pixels) +browser-use scroll up # Scroll up +browser-use switch # Switch to tab by index +browser-use close-tab [tab] # Close tab (current if no index) + +# Page State — always run state first to get element indices +browser-use state # URL, title, clickable elements with indices +browser-use screenshot [path.png] # Screenshot (base64 if no path, --full for full page) + +# Interactions — use indices from state +browser-use click # Click element by index +browser-use click # Click at pixel coordinates +browser-use type "text" # Type into focused element +browser-use input "text" # Click element, then type +browser-use keys "Enter" # Send keyboard keys (also "Control+a", etc.) +browser-use select "option" # Select dropdown option +browser-use upload # Upload file to file input +browser-use hover # Hover over element +browser-use dblclick # Double-click element +browser-use rightclick # Right-click element + +# Data Extraction +browser-use eval "js code" # Execute JavaScript, return result +browser-use get title # Page title +browser-use get html [--selector "h1"] # Page HTML (or scoped to selector) +browser-use get text # Element text content +browser-use get value # Input/textarea value +browser-use get attributes # Element attributes +browser-use get bbox # Bounding box (x, y, width, height) + +# Wait +browser-use wait selector "css" # Wait for element (--state visible|hidden|attached|detached, --timeout ms) +browser-use wait text "text" # Wait for text to appear + +# Cookies +browser-use cookies get [--url ] # Get cookies (optionally filtered) +browser-use cookies set # Set cookie (--domain, --secure, --http-only, --same-site, --expires) +browser-use cookies clear [--url ] # Clear cookies +browser-use cookies export # Export to JSON +browser-use cookies import # Import from JSON + +# Python — persistent session with browser access +browser-use python "code" # Execute Python (variables persist across calls) +browser-use python --file script.py # Run file +browser-use python --vars # Show defined variables +browser-use python --reset # Clear namespace + +# Session +browser-use close # Close browser and stop daemon +browser-use sessions # List active sessions +browser-use close --all # Close all sessions +``` + +The Python `browser` object provides: `browser.url`, `browser.title`, `browser.html`, `browser.goto(url)`, `browser.back()`, `browser.click(index)`, `browser.type(text)`, `browser.input(index, text)`, `browser.keys(keys)`, `browser.upload(index, path)`, `browser.screenshot(path)`, `browser.scroll(direction, amount)`, `browser.wait(seconds)`. + +## Cloud API + +```bash +browser-use cloud connect # Provision cloud browser and connect +browser-use cloud connect --timeout 120 --proxy-country US # With options +browser-use cloud login # Save API key (or set BROWSER_USE_API_KEY) +browser-use cloud logout # Remove API key +browser-use cloud v2 GET /browsers # REST passthrough (v2 or v3) +browser-use cloud v2 POST /tasks '{"task":"...","url":"..."}' +browser-use cloud v2 poll # Poll task until done +browser-use cloud v2 --help # Show API endpoints +``` + +`cloud connect` provisions a cloud browser, connects via CDP, and prints a live URL. `browser-use close` disconnects AND stops the cloud browser. + +## Tunnels + +```bash +browser-use tunnel # Start Cloudflare tunnel (idempotent) +browser-use tunnel list # Show active tunnels +browser-use tunnel stop # Stop tunnel +browser-use tunnel stop --all # Stop all tunnels +``` + +## Profile Management + +```bash +browser-use profile list # List detected browsers and profiles +browser-use profile sync --all # Sync profiles to cloud +browser-use profile update # Download/update profile-use binary +``` + +## Command Chaining + +Commands can be chained with `&&`. The browser persists via the daemon, so chaining is safe and efficient. + +```bash +browser-use open https://example.com && browser-use state +browser-use input 5 "user@example.com" && browser-use input 6 "password" && browser-use click 7 +``` + +Chain when you don't need intermediate output. Run separately when you need to parse `state` to discover indices first. + +## Common Workflows + +### Authenticated Browsing + +When a task requires an authenticated site (Gmail, GitHub, internal tools), use Chrome profiles: + +```bash +browser-use profile list # Check available profiles +# Ask the user which profile to use, then: +browser-use --profile "Default" open https://github.com # Already logged in +``` + +### Connecting to Existing Chrome + +```bash +browser-use --connect open https://example.com # Auto-discovers Chrome's CDP endpoint +``` + +Requires Chrome with remote debugging enabled. Falls back to probing ports 9222/9229. + +### Exposing Local Dev Servers + +```bash +browser-use tunnel 3000 # → https://abc.trycloudflare.com +browser-use open https://abc.trycloudflare.com # Browse the tunnel +``` + +## Global Options + +| Option | Description | +|--------|-------------| +| `--headed` | Show browser window | +| `--profile [NAME]` | Use real Chrome (bare `--profile` uses "Default") | +| `--connect` | Auto-discover running Chrome via CDP | +| `--cdp-url ` | Connect via CDP URL (`http://` or `ws://`) | +| `--session NAME` | Target a named session (default: "default") | +| `--json` | Output as JSON | +| `--mcp` | Run as MCP server via stdin/stdout | + +## Tips + +1. **Always run `state` first** to see available elements and their indices +2. **Use `--headed` for debugging** to see what the browser is doing +3. **Sessions persist** — browser stays open between commands +4. **CLI aliases**: `bu`, `browser`, and `browseruse` all work + +## Troubleshooting + +- **Browser won't start?** `browser-use close` then `browser-use --headed open ` +- **Element not found?** `browser-use scroll down` then `browser-use state` +- **Run diagnostics:** `browser-use doctor` + +## Cleanup + +```bash +browser-use close # Close browser session +browser-use tunnel stop --all # Stop tunnels (if any) +``` diff --git a/_meta.json b/_meta.json new file mode 100644 index 0000000..6aff36a --- /dev/null +++ b/_meta.json @@ -0,0 +1,6 @@ +{ + "ownerId": "kn71fxj97n86164tdd84bymp3n7zypxq", + "slug": "browser-use", + "version": "2.0.0", + "publishedAt": 1774158611795 +} \ No newline at end of file