commit f8151bb976431226d99b555a12145b6edef64343
Author: zlei9
Date:   Sun Mar 29 09:37:27 2026 +0800

    Initial commit with translated description

diff --git a/SKILL.md b/SKILL.md
new file mode 100644
index 0000000..f1c29ed
--- /dev/null
+++ b/SKILL.md
@@ -0,0 +1,48 @@
+---
+name: agent-browser-core
+description: "OpenClaw skill for the agent-browser CLI (Rust-based, with a Node.js fallback), enabling AI-friendly web automation through snapshots, refs, and structured commands."
+---
+
+# Agent Browser Skill (Core)
+
+## Purpose
+Provide an advanced, production-ready playbook for using agent-browser to automate web tasks via CLI and structured commands.
+
+## Best fit
+- You need deterministic automation for AI agents.
+- You want compact snapshots with refs and JSON output.
+- You prefer a fast CLI with Node.js fallback.
+
+## Not a fit
+- You require a full SDK or custom JS integration.
+- You must stream large uploads or complex media workflows.
+
+## Quick orientation
+- Read `references/agent-browser-overview.md` for install, architecture, and core concepts.
+- Read `references/agent-browser-command-map.md` for command categories and flags.
+- Read `references/agent-browser-safety.md` for high-risk controls and safe mode rules.
+- Read `references/agent-browser-workflows.md` for recommended AI workflows.
+- Read `references/agent-browser-troubleshooting.md` for common issues and fixes.
+
+## Required inputs
+- Installed agent-browser CLI and browser runtime.
+- Target URLs and workflow steps.
+- Session or profile strategy if authentication is required.
+
+## Expected output
+- A clear command sequence and operational guardrails for automation.
+
+## Operational notes
+- Snapshot early, act via refs, then snapshot again after DOM changes.
+- Use `--json` for machine parsing and scripting.
+- Use waits and load-state checks before actions.
+- Close tabs or sessions when done to release resources.
+
+## Safe mode defaults
+- Do not use `eval`, `--allow-file-access`, custom `--executable-path`, or arbitrary `--args` without explicit approval.
+- Avoid `network route`, `set credentials`, and cookie/storage mutations unless the task requires it.
+- Allowlist domains and block localhost or private network targets.
+
+## Security notes
+- Treat tokens and credentials as secrets.
+- Avoid `--allow-file-access` unless explicitly required.
diff --git a/_meta.json b/_meta.json
new file mode 100644
index 0000000..c200d60
--- /dev/null
+++ b/_meta.json
@@ -0,0 +1,6 @@
+{
+  "ownerId": "kn7ehv4at8yekzag31spcarxm180bev0",
+  "slug": "agent-browser-core",
+  "version": "1.0.1",
+  "publishedAt": 1770369553491
+}
\ No newline at end of file
diff --git a/references/agent-browser-command-map.md b/references/agent-browser-command-map.md
new file mode 100644
index 0000000..de728c4
--- /dev/null
+++ b/references/agent-browser-command-map.md
@@ -0,0 +1,28 @@
+# Agent Browser Command Map
+
+> Note: Command availability can vary by version. Use `agent-browser help` to confirm.
+
+## Safe defaults (typical)
+- `open`, `click`, `dblclick`, `fill`, `type`, `press`, `hover`, `select`
+- `check`, `uncheck`, `scroll`, `screenshot`, `snapshot`, `close`
+- `back`, `forward`, `reload`
+- `wait`, `wait --text`, `wait --url`, `wait --load networkidle`
+- `get text`, `get html`, `get value`, `get attr`, `get title`, `get url`
+- `find role`, `find text`, `find label`, `find placeholder`
+
+## Sensitive / explicit approval
+- `eval` (arbitrary JS execution)
+- `download ` (writes to disk)
+- `set credentials`, `cookies`, `storage` (stateful secrets)
+- `network route` / `network requests` (traffic interception)
+- `set headers`, `--proxy` (traffic manipulation)
+- `--allow-file-access` (local file access)
+- `--executable-path`, `--args`, `--cdp` (custom runtime control)
+
+## Debug and state
+- `trace start/stop`, `console`, `errors`, `highlight`
+- `state save`, `state load` (treat state files as sensitive)
+
+## Tabs and frames
+- `tab`, `tab new`, `tab `, `tab close`
+- `frame `, `frame main`
diff --git a/references/agent-browser-overview.md b/references/agent-browser-overview.md
new file mode 100644
index 0000000..dab14e4
--- /dev/null
+++ b/references/agent-browser-overview.md
@@ -0,0 +1,33 @@
+# Agent Browser Overview
+
+## 1) What it is
+- A fast Rust-based headless browser automation CLI with a Node.js fallback.
+- Designed for AI agents to navigate, click, type, and snapshot pages via structured commands.
+- Uses a background daemon and Playwright for browser control.
+
+## 2) Install and setup (hardened)
+- Pin the version you trust:
+  - `npm install -g agent-browser@`
+- Prefer a dedicated environment or container for installs.
+- Avoid running with elevated OS privileges.
+- Install browser runtime:
+  - `agent-browser install`
+- Linux dependencies (if needed):
+  - `agent-browser install --with-deps`
+  - or `npx playwright install-deps chromium`
+
+## 3) Browser engines
+- Chromium is the default browser engine.
+- Firefox and WebKit are supported through Playwright.
+
+## 4) Snapshot concept
+- `snapshot` returns a structured view with stable element refs.
+- Refs are designed for compact, deterministic automation.
+
+## 5) Sessions
+- The CLI supports multiple sessions so agents can isolate work.
+
+## 6) Security posture
+- Treat the CLI as high privilege; run with strict allowlists.
+- Avoid file access and arbitrary script execution unless required.
+- Keep profiles and state files ephemeral by default.
diff --git a/references/agent-browser-safety.md b/references/agent-browser-safety.md
new file mode 100644
index 0000000..cd4ea6c
--- /dev/null
+++ b/references/agent-browser-safety.md
@@ -0,0 +1,25 @@
+# Safety and Risk Controls
+
+## High-risk capabilities
+- `eval` (arbitrary JavaScript)
+- `--allow-file-access` (local file access)
+- `--executable-path`, `--args`, `--cdp` (custom runtime control)
+- `network route` / `set headers` / `--proxy` (traffic manipulation)
+- `set credentials`, cookies, storage, and state files (secret handling)
+
+## Safe mode checklist
+1. Allowlist target domains; block localhost and private networks.
+2. Disallow `eval` unless explicitly required.
+3. Disallow local file access unless explicitly required.
+4. Avoid downloads and filesystem writes by default.
+5. Use ephemeral sessions; avoid persistent profiles when possible.
+6. Redact tokens in logs and outputs.
+
+## Escalation policy
+- Require explicit human approval before using any high-risk capability.
+- Record the reason and scope of the approval (which URLs, which action).
+
+## Supply-chain hygiene
+- Pin CLI version and review upgrades.
+- Install in a dedicated environment.
+- Avoid running with elevated OS privileges.
diff --git a/references/agent-browser-troubleshooting.md b/references/agent-browser-troubleshooting.md
new file mode 100644
index 0000000..15d1e49
--- /dev/null
+++ b/references/agent-browser-troubleshooting.md
@@ -0,0 +1,17 @@
+# Troubleshooting
+
+## CLI runs but no browser opens
+- Run `agent-browser install` to download Chromium.
+- On Linux, run `agent-browser install --with-deps` if dependencies are missing.
+
+## Native binary not available
+- The CLI falls back to the Node.js daemon automatically.
+- Ensure Node.js is installed and available.
+
+## Debugging
+- Use `--headed` to see the browser UI.
+- Use `--debug` for verbose logs.
+
+## Instability in DOM targeting
+- Resnapshot after any navigation or DOM changes.
+- Prefer refs from `snapshot` over brittle CSS selectors.
diff --git a/references/agent-browser-workflows.md b/references/agent-browser-workflows.md
new file mode 100644
index 0000000..991480c
--- /dev/null
+++ b/references/agent-browser-workflows.md
@@ -0,0 +1,26 @@
+# Agent Browser Workflows
+
+## 1) Snapshot-first loop
+1. `open `
+2. `snapshot -i` and extract refs
+3. Act using refs: `click @e12`, `fill @e14 "text"`
+4. `snapshot -i` again after DOM changes
+
+## 2) JSON mode for agents
+- Prefer `snapshot -i` and `--json` outputs for deterministic parsing.
+- Keep a local map of ref -> intent.
+
+## 3) Authentication and reuse
+- Log in once and `state save`.
+- Reuse with `state load` in later runs.
+- Treat state files as secrets and rotate when needed.
+
+## 4) Stability tips
+- Wait for load state before actions: `wait --load networkidle`.
+- Use `wait --text` or `wait --url` for dynamic flows.
+- Prefer refs from `snapshot` over brittle CSS selectors.
+
+## 5) Safe automation loop
+- Validate URL against an allowlist before `open`.
+- Avoid `eval` and file access unless explicitly approved.
+- Prefer read-only operations when possible.
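
Reviewer sketch (not part of the patch): the "Safe automation loop" asks for URLs to be validated against an allowlist before `open`, and the safety checklist asks for localhost and private networks to be blocked. One way a wrapper script might do that is sketched below in Python. The names `ALLOWED_HOSTS` and `is_url_allowed`, and the example hosts, are hypothetical and do not come from agent-browser itself. The check is meant to run before the CLI is invoked and does not call the CLI.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; replace with the hosts a task needs.
ALLOWED_HOSTS = frozenset({"example.com", "docs.example.com"})


def is_url_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Return True only for http(s) URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower().rstrip(".")
    except ValueError:
        # Malformed URL (for example, an unbalanced IPv6 bracket).
        return False

    # Only plain web schemes; this rejects file:, data:, javascript:, etc.
    if parts.scheme not in ("http", "https"):
        return False

    # Block empty hosts and localhost names.
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False

    # Block IP literals that point at loopback, private, or link-local ranges.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # Not an IP literal; treat as a hostname.
    if ip is not None and (ip.is_loopback or ip.is_private or ip.is_link_local):
        return False

    # Exact-match allowlist: subdomains must be listed explicitly.
    return host in allowed_hosts


if __name__ == "__main__":
    for candidate in ("https://example.com/login", "http://127.0.0.1:8080/"):
        verdict = "open" if is_url_allowed(candidate) else "skip"
        print(f"{verdict}: {candidate}")
```

This inspects only the URL string. A hostname on the allowlist can still resolve to a private address, and a page can redirect elsewhere, so a stricter setup would also check resolved addresses and re-validate the URL after navigation.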