Initial commit with translated description

2026-03-29 08:23:03 +08:00
commit 6f86b743a6
6 changed files with 619 additions and 0 deletions
--- a/SKILL.md
+++ b/SKILL.md
@@ -0,0 +1,128 @@
 ---
 name: agent-team-orchestration
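 description: "Orchestrate multi-agent teams with clearly defined roles, task lifecycles, handoff protocols, and review workflows. Use when: (1) setting up a team of 2+ agents with different specializations, (2) defining task routing and lifecycle (inbox → spec → build → review → done), (3) creating handoff protocols between agents, (4) establishing review and quality gates, (5) managing async communication and artifact sharing between agents."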
 ---
 # Agent Team Orchestration
 Production playbook for running multi-agent teams with clear roles, structured task flow, and quality gates.
 ## Quick Start: Minimal 2-Agent Team
 A builder and a reviewer, coordinated by you as the orchestrator. The simplest useful team.
 ### 1. Define Roles
 ```
 Orchestrator (you) — Route tasks, track state, report results
 Builder agent     — Execute work, produce artifacts
 ```
 ### 2. Spawn a Task
 ```
 1. Create task record (file, DB, or task board)
 2. Spawn builder with:
   - Task ID and description
   - Output path for artifacts
   - Handoff instructions (what to produce, where to put it)
 3. On completion: review artifacts, mark done, report
 ```
 ### 3. Add a Reviewer
 ```
 Builder produces artifact → Reviewer checks it → Orchestrator ships or returns
 ```
 That's the core loop. Everything below scales this pattern.
 ## Core Concepts
 ### Roles
 Every agent has one primary role. Overlap causes confusion.
 | Role | Purpose | Model guidance |
 |------|---------|---------------|
 | **Orchestrator** | Route work, track state, make priority calls | High-reasoning model (handles judgment) |
 | **Builder** | Produce artifacts — code, docs, configs | Can use cost-effective models for mechanical work |
 | **Reviewer** | Verify quality, push back on gaps | High-reasoning model (catches what builders miss) |
 | **Ops** | Cron jobs, standups, health checks, dispatching | Cheapest model that's reliable |
 → *Read [references/team-setup.md](references/team-setup.md) when defining a new team or adding agents.*
 ### Task States
 Every task moves through a defined lifecycle:
 ```
 Inbox → Assigned → In Progress → Review → Done | Failed
 ```
 **Rules:**
 - Orchestrator owns state transitions — don't rely on agents to update their own status
 - Every transition gets a comment (who, what, why)
 - Failed is a valid end state — capture why and move on
 → *Read [references/task-lifecycle.md](references/task-lifecycle.md) when designing task flows or debugging stuck tasks.*
 ### Handoffs
 When work passes between agents, the handoff message includes:
 1. **What was done** — summary of changes/output
 2. **Where artifacts are** — exact file paths
 3. **How to verify** — test commands or acceptance criteria
 4. **Known issues** — anything incomplete or risky
 5. **What's next** — clear next action for the receiving agent
 Bad handoff: *"Done, check the files."*
 Good handoff: *"Built auth module at `/shared/artifacts/auth/`. Run `npm test auth` to verify. Known issue: rate limiting not implemented yet. Next: reviewer checks error handling edge cases."*
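 The five-part handoff can be checked mechanically before it is accepted. The following is a minimal sketch, assuming handoffs are passed around as structured records; the `Handoff` class and its field names are hypothetical and not part of this skill or any agent runtime.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Handoff:
    """Hypothetical record mirroring the five handoff parts listed above."""
    what_was_done: str
    artifact_paths: List[str]
    how_to_verify: str
    known_issues: str  # write "none" explicitly rather than leaving it blank
    next_action: str

    def missing_parts(self) -> List[str]:
        """Return the names of fields that are empty or whitespace-only."""
        missing = []
        for f in fields(self):
            value = getattr(self, f.name)
            empty = not value.strip() if isinstance(value, str) else not value
            if empty:
                missing.append(f.name)
        return missing


# The "bad" and "good" handoffs from the text, expressed as records.
bad = Handoff("Done, check the files.", [], "", "", "")
good = Handoff(
    what_was_done="Built auth module",
    artifact_paths=["/shared/artifacts/auth/"],
    how_to_verify="Run `npm test auth`",
    known_issues="Rate limiting not implemented yet",
    next_action="Reviewer checks error handling edge cases",
)
```

 An orchestrator could reject any handoff whose `missing_parts()` list is non-empty and return the task to the sender.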
 ### Reviews
 Cross-role reviews prevent quality drift:
 - **Builders review specs** — "Is this feasible? What's missing?"
 - **Reviewers check builds** — "Does this match the spec? Edge cases?"
 - **Orchestrator reviews priorities** — "Is this the right work right now?"
 Skip the review step and quality degrades within 3-5 tasks. Every time.
 → *Read [references/communication.md](references/communication.md) when setting up agent communication channels.*
 → *Read [references/patterns.md](references/patterns.md) for proven multi-step workflows.*
 ## Reference Files
 | File | Read when... |
 |------|-------------|
 | [team-setup.md](references/team-setup.md) | Defining agents, roles, models, workspaces |
 | [task-lifecycle.md](references/task-lifecycle.md) | Designing task states, transitions, comments |
 | [communication.md](references/communication.md) | Setting up async/sync communication, artifact paths |
 | [patterns.md](references/patterns.md) | Implementing specific workflows (spec→build→test, parallel research, escalation) |
 ## Common Pitfalls
 ### Spawning without clear artifact output paths
 Agent produces great work, but you can't find it. Always specify the exact output path in the spawn prompt. Use a shared artifacts directory with predictable structure.
 ### No review step = quality drift
 "It's a small change, skip review." Do this three times and you have compounding errors. Every artifact gets at least one set of eyes that didn't produce it.
 ### Agents not commenting on task progress
 Silent agents create coordination blind spots. Require comments at: start, blocker, handoff, completion. If an agent goes silent, assume it's stuck.
 ### Not verifying agent capabilities before assigning
 Assigning browser-based testing to an agent without browser access. Assigning image work to a text-only model. Check capabilities before routing.
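 A capability check before routing can be as small as a set comparison. This sketch assumes each agent advertises its capabilities as a set of plain strings; the capability names and agent names are hypothetical.

```python
from typing import Dict, Optional, Set


def can_handle(agent_capabilities: Set[str], task_requirements: Set[str]) -> bool:
    """True when the agent has every capability the task requires."""
    return task_requirements <= agent_capabilities


def pick_agent(agents: Dict[str, Set[str]], task_requirements: Set[str]) -> Optional[str]:
    """Return the first agent (in insertion order) that covers the requirements."""
    for name, capabilities in agents.items():
        if can_handle(capabilities, task_requirements):
            return name
    return None  # nobody qualifies: escalate instead of assigning anyway


# Hypothetical team: one text-only builder, one agent with browser access.
agents = {
    "text-builder": {"code"},
    "browser-tester": {"code", "browser"},
}
```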
 ### Orchestrator doing execution work
 The orchestrator routes and tracks — it doesn't build. The moment you start "just quickly doing this one thing," you've lost oversight of the rest of the team.
 ## When NOT to Use This Skill
 - **Single-agent setups** — Just follow standard AGENTS.md conventions. Team orchestration adds overhead that solo agents don't need.
 - **One-off task delegation** — Use `sessions_spawn` directly. This skill is for sustained workflows with multiple handoffs.
 - **Simple question routing** — If you're just forwarding a question to a specialist, that's a message, not a workflow.
 This skill is for **sustained team workflows** — recurring collaboration patterns where agents depend on each other's output over multiple tasks.
--- a/_meta.json
+++ b/_meta.json
@@ -0,0 +1,6 @@
 {
  "ownerId": "kn77yy30hx6jk3x3j2dwc9tj3d808mp4",
  "slug": "agent-team-orchestration",
  "version": "1.0.0",
  "publishedAt": 1770912001303
 }
--- a/references/communication.md
+++ b/references/communication.md
@@ -0,0 +1,110 @@
 # Communication
 How agents coordinate: sync vs async, spawning vs messaging, and artifact sharing.
 ## Communication Channels
 ### Shared Files (Primary — Async)
 The default communication method. Persistent, auditable, no timing dependency.
 ```
 /shared/
 ├── specs/          — Requirements, research, analysis
 ├── artifacts/      — Build outputs, deliverables
 ├── reviews/        — Review notes and feedback
 ├── decisions/      — Architecture and product decisions
 ```
 **Use for:** Deliverables, specs, reviews, decisions — anything another agent needs to find later.
 ### Task Comments (Async)
 Attached to specific tasks. Chronological record of progress.
 **Use for:** Status updates, blockers, handoff messages, review feedback.
 ### sessions_send (Sync — Urgent)
 Direct message to a running agent session. Interrupts their current work.
 **Use for:**
 - Urgent priority changes ("Drop everything, critical bug")
 - Quick questions that block progress ("Is feature X in scope?")
 - Coordination that can't wait for task comment review
 **Don't use for:**
 - Routine updates (use task comments)
 - Delivering artifacts (use shared files)
 - Anything the agent needs to reference later (messages are ephemeral)
 ## Spawn vs Send
 ### Spawn a new sub-agent when:
 - The task is self-contained with clear inputs and outputs
 - You want isolation — the work shouldn't affect other running sessions
 - The task needs a different model or capability set
 - You're parallelizing — multiple independent tasks at once
 ### Send to an existing session when:
 - The agent is already working on related context
 - You need a quick answer, not a full task execution
 - The work is a small addition to something already in progress
 **Default to spawn.** It's cleaner. Send is for exceptions.
 ## Spawn Prompt Template
 Every spawn includes:
 ```markdown
 ## Task: [Title]
 **Task ID:** [ID]
 **Role:** [What this agent is]
 **Priority:** [High/Medium/Low]
 ### Context
 [What the agent needs to know]
 ### Deliverables
 [Exactly what to produce]
 ### Output Path
 [Exact directory/file path for artifacts]
 ### Handoff
 When complete:
 1. Write artifacts to [output path]
 2. Comment on task with handoff summary
 3. Include: what was done, how to verify, known issues
 ```
 **Critical fields:**
 - **Output Path** — Without this, you'll lose the work. Always specify.
 - **Handoff instructions** — Tell the agent exactly how to signal completion.
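 Filling the template from code makes the critical fields hard to forget. A minimal sketch, assuming prompts are built as plain strings; `render_spawn_prompt` is a hypothetical helper, and how the resulting prompt is passed to a spawn call depends on your runtime.

```python
def render_spawn_prompt(
    title: str,
    task_id: str,
    role: str,
    priority: str,
    context: str,
    deliverables: str,
    output_path: str,
) -> str:
    """Fill in the spawn template; refuse to render without an output path."""
    if not output_path.strip():
        raise ValueError("output_path is required, otherwise the work gets lost")
    lines = [
        f"## Task: {title}",
        f"**Task ID:** {task_id}",
        f"**Role:** {role}",
        f"**Priority:** {priority}",
        "### Context",
        context,
        "### Deliverables",
        deliverables,
        "### Output Path",
        output_path,
        "### Handoff",
        "When complete:",
        f"1. Write artifacts to {output_path}",
        "2. Comment on task with handoff summary",
        "3. Include: what was done, how to verify, known issues",
    ]
    return "\n".join(lines)
```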
 ## Artifact Conventions
 ### Naming
 ```
 /shared/artifacts/[task-id]-[short-name]/
 /shared/specs/[date]-[topic].md
 /shared/decisions/[date]-[title].md
 /shared/reviews/[task-id]-review.md
 ```
 ### Rules
 - All deliverables go to `/shared/` — never to personal agent workspaces
 - One directory per task for multi-file outputs
 - Include a brief README or summary at the top of the artifact directory if it contains 3+ files
 - Overwrite previous versions in place — don't create v2, v3 copies
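 Generating paths from one place keeps the naming predictable. The helpers below are a sketch of the conventions above; the function names are hypothetical, and the ISO `YYYY-MM-DD` date format is an assumption, since the convention only says `[date]`.

```python
from datetime import date
from pathlib import PurePosixPath

SHARED = PurePosixPath("/shared")


def artifact_dir(task_id: str, short_name: str) -> PurePosixPath:
    """One directory per task: /shared/artifacts/[task-id]-[short-name]"""
    return SHARED / "artifacts" / f"{task_id}-{short_name}"


def spec_path(day: date, topic: str) -> PurePosixPath:
    """/shared/specs/[date]-[topic].md, with the date assumed to be ISO 8601."""
    return SHARED / "specs" / f"{day.isoformat()}-{topic}.md"


def decision_path(day: date, title: str) -> PurePosixPath:
    """/shared/decisions/[date]-[title].md"""
    return SHARED / "decisions" / f"{day.isoformat()}-{title}.md"


def review_path(task_id: str) -> PurePosixPath:
    """/shared/reviews/[task-id]-review.md"""
    return SHARED / "reviews" / f"{task_id}-review.md"
```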
 ## Avoiding Communication Failures
 **Silent agents:** If an agent doesn't comment within its expected timeframe, assume it's stuck. Check on it or restart the task.
 **Lost artifacts:** Always verify the output path exists after a task completes. Agents sometimes write to wrong directories.
 **Context gaps:** When spawning, include all context the agent needs. Don't assume it can read other agent sessions or recent conversations. Shared files are the bridge.
 **Message timing:** `sessions_send` only works if the target session is active. If unsure, spawn a new session instead.
--- a/references/patterns.md
+++ b/references/patterns.md
@@ -0,0 +1,141 @@
 # Patterns
 Proven multi-agent workflows. Copy and adapt.
 ## Spec → Review → Build → Test
 The full quality loop. Use for any non-trivial feature.
 ```
 1. Orchestrator creates task, assigns to Spec Writer
 2. Spec Writer produces spec at /shared/specs/[task]-spec.md
 3. Orchestrator assigns spec review to Builder (feasibility check)
 4. Builder reviews: "feasible" / "change X because Y"
 5. If changes needed → back to Spec Writer → re-review
 6. Orchestrator assigns build to Builder
 7. Builder produces artifacts at /shared/artifacts/[task]/
 8. Orchestrator assigns review to Reviewer
 9. Reviewer approves or returns with feedback
 10. If returned → Builder fixes → re-review
 11. Orchestrator marks Done, reports to stakeholders
 ```
 **Key:** The agent that writes the spec doesn't review the build, and the agent that builds doesn't approve its own work. Cross-role verification is the whole point.
 ### Minimal version (2 agents):
 ```
 1. Orchestrator writes brief spec
 2. Builder implements
 3. Orchestrator reviews output
 4. Done or return for fixes
 ```
 ## Parallel Research
 Multiple agents research independently, then merge. Use for broad investigation.
 ```
 1. Orchestrator defines research question + splits into angles
 2. Spawn Agent A: "Research [angle 1], write findings to /shared/specs/research-[topic]-a.md"
 3. Spawn Agent B: "Research [angle 2], write findings to /shared/specs/research-[topic]-b.md"
 4. Wait for both to complete
 5. Orchestrator (or designated agent) merges into /shared/specs/research-[topic]-final.md
 6. Use merged research to inform next decision
 ```
 **Rules:**
 - Define non-overlapping angles to avoid duplicate work
 - Set a time/scope limit per agent — research expands to fill available time
 - The merge step is mandatory — raw research without synthesis is useless
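 The fan-out, wait, and merge steps can be sketched with a thread pool. This is an illustration under stated assumptions: `run_research` is a hypothetical stand-in for spawning a research agent and reading back its findings file, and the merge here only assembles sections, so a real merge step still needs an agent to synthesize them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def run_research(angle: str) -> str:
    """Hypothetical stand-in for one research agent; returns its findings."""
    return f"Findings for {angle}"


def parallel_research(topic: str, angles: List[str]) -> str:
    """Research each angle in parallel, then merge into one document."""
    if not angles:
        raise ValueError("at least one angle is required")
    if len(set(angles)) != len(angles):
        raise ValueError("angles must not overlap")
    # One worker per angle; map() waits for all of them and keeps input order.
    with ThreadPoolExecutor(max_workers=len(angles)) as pool:
        findings = list(pool.map(run_research, angles))
    # Merge step: one section per angle under a single title.
    sections = [f"# Research: {topic}"]
    sections += [f"## {angle}\n{text}" for angle, text in zip(angles, findings)]
    return "\n\n".join(sections)
```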
 ## Escalation
 Agent hits a blocker it can't resolve. Structured escalation prevents stalling.
 ```
 1. Agent comments on task: "Blocked: [specific problem]"
 2. Agent continues with other work if possible (don't idle)
 3. Orchestrator sees blocker, decides:
   a. Resolve directly (answer the question, provide access)
   b. Reassign to a more capable agent
   c. Escalate to human stakeholder
   d. Deprioritize/defer the task
 4. Orchestrator comments decision and unblocks or reassigns
 ```
 **Escalation triggers:**
 - Missing access or credentials
 - Ambiguous requirements that need product decisions
 - Technical blocker outside agent's expertise
 - Task exceeds estimated scope by 2x+
 **Anti-pattern:** Agent silently struggling for 30 minutes instead of escalating after 10. Set the expectation: escalate early, escalate with context.
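 Most of these triggers can be turned into an explicit check an agent runs on itself. A minimal sketch: the `WorkStatus` record, its field names, and the reason strings are hypothetical; the 10-minute limit and the 2x scope factor come from the text above. The "outside agent's expertise" trigger is a judgment call and is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkStatus:
    """Hypothetical snapshot of an agent's progress on one task."""
    minutes_blocked: float
    estimated_effort: float  # any unit, as long as both effort fields match
    actual_effort: float
    missing_access: bool = False
    ambiguous_requirements: bool = False


def escalation_reasons(
    status: WorkStatus,
    max_blocked_minutes: float = 10,
    scope_factor: float = 2.0,
) -> List[str]:
    """Return every trigger that currently applies; empty means keep working."""
    reasons = []
    if status.missing_access:
        reasons.append("missing access or credentials")
    if status.ambiguous_requirements:
        reasons.append("ambiguous requirements")
    if status.minutes_blocked >= max_blocked_minutes:
        reasons.append("blocked past time limit")
    if status.estimated_effort > 0 and (
        status.actual_effort >= scope_factor * status.estimated_effort
    ):
        reasons.append("scope exceeded estimate")
    return reasons
```

 A non-empty result maps directly onto the `Blocked: [specific problem]` task comment in step 1.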
 ## Cron-Based Ops
 Scheduled tasks for team health. Assign to the cheapest reliable agent.
 ### Daily Standup
 ```
 Schedule: Every morning
 Agent: Ops
 1. Read all open tasks
 2. Check for stale tasks (no comment in 24h+)
 3. Check for overdue tasks
 4. Produce standup summary:
   - What completed yesterday
   - What's in progress
   - What's blocked
   - What's stale
 5. Post to orchestrator or team channel
 ```
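 The stale-task check in step 2 reduces to a timestamp comparison. This sketch assumes the task board can report the time of each task's latest comment; the `stale_tasks` helper and the task IDs are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def stale_tasks(
    last_comment_at: Dict[str, datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
) -> List[str]:
    """Return IDs of open tasks whose latest comment is at least max_age old."""
    return sorted(
        task_id
        for task_id, commented in last_comment_at.items()
        if now - commented >= max_age
    )


# Hypothetical open tasks and the time of their most recent comment.
now = datetime(2026, 3, 29, 8, 0)
open_tasks = {
    "T-1": datetime(2026, 3, 28, 7, 0),  # 25 hours ago
    "T-2": datetime(2026, 3, 29, 1, 0),  # 7 hours ago
}
```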
 ### Task Dispatch
 ```
 Schedule: Every few hours (or on trigger)
 Agent: Orchestrator
 1. Check inbox for new tasks
 2. Prioritize by urgency/importance
 3. Match to available agents (check capabilities)
 4. Assign and spawn
 ```
 ### Health Check
 ```
 Schedule: Periodic
 Agent: Ops
 1. Verify shared directories exist and are writable
 2. Check for orphaned tasks (assigned but no agent session)
 3. Check for artifact path conflicts
 4. Report anomalies to orchestrator
 ```
 ## Batch Processing
 Multiple similar tasks that can run in parallel.
 ```
 1. Orchestrator creates N tasks from a list
 2. Spawn up to M agents in parallel (M ≤ concurrency limit)
 3. Each agent picks one task, completes it, writes output
 4. Orchestrator collects results as agents finish
 5. Spawn next batch if more tasks remain
 6. Final aggregation once all tasks complete
 ```
 **Sizing:** Start with 2-3 parallel agents. More isn't always faster — coordination overhead grows.
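 Splitting the task list into batches no larger than the concurrency limit is the core of this pattern. A minimal sketch, with `batches` as a hypothetical helper; spawning the agents in each batch and waiting for them is left to the orchestrator.

```python
from typing import Iterator, List


def batches(tasks: List[str], limit: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `limit` tasks."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(tasks), limit):
        yield tasks[start:start + limit]


# Five tasks with a concurrency limit of 2 give three batches.
plan = list(batches(["a", "b", "c", "d", "e"], 2))
```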
 ## Review Rotation
 Prevent review fatigue and bias by rotating reviewers.
 ```
 Task produced by Agent A → Reviewed by Agent B
 Task produced by Agent B → Reviewed by Agent C
 Task produced by Agent C → Reviewed by Agent A
 ```
 **Why:** Same reviewer for the same builder creates blind spots. Rotation catches different things.
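 The ring above can be computed rather than maintained by hand. In this sketch, `reviewer_for` is a hypothetical helper; with the default `round_number` of 0 it reproduces the A → B → C → A assignments shown, and the optional round shift is an extension beyond the table so that pairings change over time.

```python
from typing import List


def reviewer_for(producer: str, agents: List[str], round_number: int = 0) -> str:
    """Pick a reviewer from the ring; never returns the producer."""
    n = len(agents)
    if n < 2:
        raise ValueError("rotation needs at least two agents")
    # Offset is always in 1..n-1, so an agent never reviews its own work.
    offset = 1 + round_number % (n - 1)
    return agents[(agents.index(producer) + offset) % n]


team = ["A", "B", "C"]
```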
--- a/references/task-lifecycle.md
+++ b/references/task-lifecycle.md
@@ -0,0 +1,129 @@
 # Task Lifecycle
 Task states, transitions, comment conventions, and decision logging.
 ## States
 ```
 Inbox → Assigned → In Progress → Review → Done | Failed
 ```
 | State | Meaning | Owner |
 |-------|---------|-------|
 | **Inbox** | New task, unassigned | Orchestrator |
 | **Assigned** | Agent selected, not yet started | Orchestrator |
 | **In Progress** | Agent actively working | Assigned agent |
 | **Review** | Work complete, awaiting verification | Reviewer |
 | **Done** | Verified and shipped | Orchestrator |
 | **Failed** | Abandoned with documented reason | Orchestrator |
 ## Transition Rules
 **Orchestrator transitions:**
 - Inbox → Assigned (picks the agent)
 - Assigned → In Progress (spawns the agent or sends the task)
 - Review → Done (accepts the deliverable)
 - Any state → Failed (with reason)
 **Agents transition:**
 - In Progress → Review (submits deliverable with handoff comment)
 **Reviewers transition:**
 - Review → In Progress (returns with feedback — agent must address it)
 - Review → Done (approves; the orchestrator then confirms and closes the task)
 **Never skip Review silently.** The orchestrator may override for trivial tasks, but must record the override and its reason in a task comment.
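 The transition rules can be encoded as a lookup table so that invalid moves are rejected early. A minimal sketch: the actor labels (`"orchestrator"`, `"agent"`, `"reviewer"`) are hypothetical, and treating Done and Failed as final is an assumption layered on top of "Any state → Failed".

```python
ALLOWED = {
    ("Inbox", "Assigned"): {"orchestrator"},
    ("Assigned", "In Progress"): {"orchestrator"},
    ("In Progress", "Review"): {"agent"},
    ("Review", "In Progress"): {"reviewer"},
    ("Review", "Done"): {"orchestrator", "reviewer"},
}
TERMINAL = {"Done", "Failed"}  # assumption: end states are never reopened


def can_transition(current: str, target: str, actor: str) -> bool:
    """Check whether `actor` may move a task from `current` to `target`."""
    if current in TERMINAL:
        return False
    if target == "Failed":
        # Only the orchestrator may fail a task, from any non-terminal state.
        return actor == "orchestrator"
    return actor in ALLOWED.get((current, target), set())
```

 Note that there is no `("In Progress", "Done")` entry, so a task cannot reach Done without passing through Review unless the orchestrator handles the override outside this table.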
 ## Comment Conventions
 Every state change gets a comment. Format:
 ```
 [Agent] [Action]: [Details]
 ```
 ### Required comments:
 **Starting work:**
 ```
 [Builder] Starting: Picking up auth module. Questions: Should rate limiting be per-user or per-IP?
 ```
 **Blocker found:**
 ```
 [Builder] Blocked: Need API credentials for the payment gateway. Who has access?
 ```
 **Submitting for review:**
 ```
 [Builder] Handoff: Auth module complete at /shared/artifacts/auth/.
 - Added JWT validation middleware
 - Tests at /shared/artifacts/auth/tests/
 - Run `npm test -- --grep auth` to verify
 - Known issue: refresh token rotation not implemented (out of scope per spec)
 - Next: Reviewer checks error handling paths
 ```
 **Review feedback:**
 ```
 [Reviewer] Feedback: Two issues found.
 1. Missing input validation on email field — SQL injection risk
 2. Error messages expose internal paths in production mode
 Returning to builder. Fix both, then resubmit.
 ```
 **Completion:**
 ```
 [Reviewer] Approved: All issues addressed. Auth module ready to ship.
 ```
 **Failure:**
 ```
 [Orchestrator] Failed: Deprioritized — superseded by new auth provider integration. Preserving spec at /shared/specs/auth-v1.md for reference.
 ```
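 Because every comment follows the same shape, it can be produced and parsed with a few lines of code. This sketch assumes the square brackets around the agent name are literal, as in the examples above; `format_comment` and `parse_comment` are hypothetical helpers.

```python
import re
from typing import Dict, Optional

# "[Agent] Action: Details", where Details may span several lines.
COMMENT_RE = re.compile(
    r"^\[(?P<agent>[^\]]+)\] (?P<action>[^:]+): (?P<details>.+)$",
    re.DOTALL,
)


def format_comment(agent: str, action: str, details: str) -> str:
    """Build a task comment in the conventional format."""
    return f"[{agent}] {action}: {details}"


def parse_comment(text: str) -> Optional[Dict[str, str]]:
    """Split a comment into its parts, or return None if it is malformed."""
    match = COMMENT_RE.match(text)
    return match.groupdict() if match else None
```

 A `None` result is a cheap signal that an agent has drifted from the convention and its comments need attention.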
 ## Decision Logging
 Architecture or product decisions made during task execution go in a shared decisions directory.
 ```markdown
 # Decision: [Title]
 **Date:** YYYY-MM-DD
 **Author:** [Agent]
 **Status:** Proposed | Accepted | Rejected
 **Task:** [Task ID if applicable]
 ## Context
 Why this decision came up.
 ## Options Considered
 1. Option A — tradeoffs
 2. Option B — tradeoffs
 ## Decision
 What was chosen and why.
 ## Consequences
 What changes as a result.
 ```
 **When to log a decision:**
 - Choosing between two valid architectural approaches
 - Changing a spec during implementation
 - Rejecting a requirement as infeasible
 - Any choice that future agents will wonder "why did we do it this way?"
 ## Multi-Step Task Workflows
 Complex tasks split into sub-tasks. Track the parent relationship:
 ```
 Task #12: Build user dashboard
  ├── #12a: Write spec (Assigned: Spec writer)
  ├── #12b: Review spec (Assigned: Builder — feasibility check)
  ├── #12c: Build frontend (Assigned: Builder)
  ├── #12d: Build API endpoints (Assigned: Builder)
  └── #12e: Integration test (Assigned: Reviewer)
 ```
 The orchestrator tracks the parent task and only marks it Done when all sub-tasks complete.
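 The parent-completion rule is a single `all()` over the sub-task states. A small sketch, assuming sub-task states are kept in a mapping from ID to state name; `parent_is_done` is a hypothetical helper.

```python
from typing import Dict


def parent_is_done(subtask_states: Dict[str, str]) -> bool:
    """A parent is Done only when it has sub-tasks and all of them are Done."""
    return bool(subtask_states) and all(
        state == "Done" for state in subtask_states.values()
    )


# Task #12 from the example, partway through.
dashboard = {
    "12a": "Done",
    "12b": "Done",
    "12c": "Review",
    "12d": "In Progress",
    "12e": "Assigned",
}
```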
--- a/references/team-setup.md
+++ b/references/team-setup.md
@@ -0,0 +1,105 @@
 # Team Setup
 How to define agents, assign roles, select models, and isolate workspaces.
 ## Define Roles First, Then Agents
 Start with the work, not the agents. List the types of work, then create roles to cover them.
 **Minimal team (2 agents):**
 ```
 Orchestrator — routes tasks, tracks state
 Builder      — executes work
 ```
 **Standard team (3-4 agents):**
 ```
 Orchestrator — routes, prioritizes, reports to stakeholders
 Builder      — produces artifacts (code, docs, configs)
 Reviewer     — verifies quality, catches gaps
 Ops          — scheduled tasks, health checks, mechanical work
 ```
 **Rule:** One agent, one primary role. An agent can do secondary work, but its role determines what it's optimized for.
 ## Model Selection Per Role
 Match model cost to the cognitive demands of the role.
 | Role | Needs | Model tier |
 |------|-------|-----------|
 | Orchestrator | Judgment, prioritization, multi-context reasoning | Top tier (e.g., Claude Opus, GPT-4.5) |
 | Builder | Code generation, following specs, producing artifacts | Mid-to-top tier depending on complexity |
 | Reviewer | Critical analysis, catching edge cases, feasibility | Top tier — reviewers catch what builders miss |
 | Ops | Following templates, running scripts, dispatching | Cheapest reliable model (e.g., GPT-4o-mini, Haiku) |
 **Don't waste expensive models on mechanical work.** Cron-based standups, file organization, and template-following tasks don't need frontier reasoning.
 ## Workspace Isolation
 Each agent operates in its own workspace to prevent interference.
 ```
 /workspace/
 ├── agents/
 │   ├── builder/          — Builder's personal workspace
 │   │   └── SOUL.md       — Builder's identity and instructions
 │   ├── reviewer/         — Reviewer's personal workspace
 │   │   └── SOUL.md
 │   └── ops/
 │       └── SOUL.md
 └── shared/               — Shared across all agents
     ├── specs/            — Requirements and specifications
     ├── artifacts/        — Build outputs
     ├── reviews/          — Review notes and feedback
     └── decisions/        — Architecture and product decisions
 ```
 **Rules:**
 - Agents read/write their own workspace freely
 - Agents write deliverables to `/shared/` — never to personal workspaces
 - Agents can read any shared directory
 - Orchestrator can read all workspaces for oversight
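 These access rules can be expressed as two path checks. The sketch below follows the `/workspace/` layout shown above and assumes paths are already normalized (no `..` segments); `can_write` and `can_read` are hypothetical helpers, not an enforcement mechanism provided by any runtime.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/workspace")


def _inside(path: PurePosixPath, directory: PurePosixPath) -> bool:
    """True when `path` is `directory` itself or lies somewhere below it."""
    return path == directory or directory in path.parents


def can_write(agent: str, path: str) -> bool:
    """Agents write to their own workspace and to the shared tree."""
    p = PurePosixPath(path)
    return _inside(p, ROOT / "agents" / agent) or _inside(p, ROOT / "shared")


def can_read(agent: str, path: str) -> bool:
    """Agents read their own workspace and shared; the orchestrator reads all."""
    p = PurePosixPath(path)
    if agent == "orchestrator":
        return _inside(p, ROOT)
    return _inside(p, ROOT / "agents" / agent) or _inside(p, ROOT / "shared")
```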
 ## Identity Files (SOUL.md)
 Each agent gets a SOUL.md that defines:
 1. **Role and scope** — What this agent does and doesn't do
 2. **Communication style** — How it writes comments, reports, asks questions
 3. **Boundaries** — What requires escalation vs. autonomous action
 4. **Team context** — Who else is on the team and how to interact with them
 Example SOUL.md for a builder agent:
 ```markdown
 # SOUL.md — Builder
 I build what the specs say. My job is execution, not product decisions.
 ## Scope
 - Implement features per approved specs
 - Write tests for what I build
 - Document non-obvious decisions in code comments
 - Hand off with clear verification steps
 ## Boundaries
 - Spec unclear? Ask the orchestrator, don't guess
 - Architecture change needed? Propose it, don't just do it
 - Blocked for >10 minutes? Comment on the task and move on
 ## Handoff Format
 Every completed task includes:
 1. What I changed and why
 2. File paths for all artifacts
 3. How to test/verify
 4. Known limitations
 ```
 ## Adding a New Agent
 1. Create the workspace directory
 2. Write its SOUL.md
 3. Update the team protocol with its role
 4. Verify it has the capabilities it needs (browser, tools, API access)
 5. Start with a small task to validate the setup before loading it into the rotation