commit 6f86b743a63c13965308fe2b2b167f6a914b790e
Author: zlei9
Date:   Sun Mar 29 08:23:03 2026 +0800

    Initial commit with translated description

diff --git a/SKILL.md b/SKILL.md
new file mode 100644
index 0000000..d2a9ff6
--- /dev/null
+++ b/SKILL.md
@@ -0,0 +1,128 @@
+---
+name: agent-team-orchestration
+description: "Orchestrate multi-agent teams with clearly defined roles, task lifecycles, handoff protocols, and review workflows. Use when: (1) setting up a team of 2+ agents with different specializations, (2) defining task routing and lifecycle (inbox → spec → build → review → done), (3) creating handoff protocols between agents, (4) establishing review and quality gates, (5) managing async communication and artifact sharing between agents."
+---
+
+# Agent Team Orchestration
+
+Production playbook for running multi-agent teams with clear roles, structured task flow, and quality gates.
+
+## Quick Start: Minimal 2-Agent Team
+
+An orchestrator and a builder. The simplest useful team.
+
+### 1. Define Roles
+
+```
+Orchestrator (you) — Route tasks, track state, report results
+Builder agent — Execute work, produce artifacts
+```
+
+### 2. Spawn a Task
+
+```
+1. Create task record (file, DB, or task board)
+2. Spawn builder with:
+   - Task ID and description
+   - Output path for artifacts
+   - Handoff instructions (what to produce, where to put it)
+3. On completion: review artifacts, mark done, report
+```
+
+### 3. Add a Reviewer
+
+```
+Builder produces artifact → Reviewer checks it → Orchestrator ships or returns
+```
+
+That's the core loop. Everything below scales this pattern.
+
+## Core Concepts
+
+### Roles
+
+Every agent has one primary role. Overlap causes confusion.
+
+| Role | Purpose | Model guidance |
+|------|---------|---------------|
+| **Orchestrator** | Route work, track state, make priority calls | High-reasoning model (handles judgment) |
+| **Builder** | Produce artifacts — code, docs, configs | Can use cost-effective models for mechanical work |
+| **Reviewer** | Verify quality, push back on gaps | High-reasoning model (catches what builders miss) |
+| **Ops** | Cron jobs, standups, health checks, dispatching | Cheapest model that's reliable |
+
+→ *Read [references/team-setup.md](references/team-setup.md) when defining a new team or adding agents.*
+
+### Task States
+
+Every task moves through a defined lifecycle:
+
+```
+Inbox → Assigned → In Progress → Review → Done | Failed
+```
+
+**Rules:**
+- Orchestrator owns state transitions — don't rely on agents to update their own status
+- Every transition gets a comment (who, what, why)
+- Failed is a valid end state — capture why and move on
+
+→ *Read [references/task-lifecycle.md](references/task-lifecycle.md) when designing task flows or debugging stuck tasks.*
+
+### Handoffs
+
+When work passes between agents, the handoff message includes:
+
+1. **What was done** — summary of changes/output
+2. **Where artifacts are** — exact file paths
+3. **How to verify** — test commands or acceptance criteria
+4. **Known issues** — anything incomplete or risky
+5. **What's next** — clear next action for the receiving agent
+
+Bad handoff: *"Done, check the files."*
+Good handoff: *"Built auth module at `/shared/artifacts/auth/`. Run `npm test auth` to verify. Known issue: rate limiting not implemented yet. Next: reviewer checks error handling edge cases."*
+
+### Reviews
+
+Cross-role reviews prevent quality drift:
+
+- **Builders review specs** — "Is this feasible? What's missing?"
+- **Reviewers check builds** — "Does this match the spec? Edge cases?"
+- **Orchestrator reviews priorities** — "Is this the right work right now?"
+
+Skip the review step and quality degrades within 3-5 tasks. Every time.
+
+→ *Read [references/communication.md](references/communication.md) when setting up agent communication channels.*
+→ *Read [references/patterns.md](references/patterns.md) for proven multi-step workflows.*
+
+## Reference Files
+
+| File | Read when... |
+|------|-------------|
+| [team-setup.md](references/team-setup.md) | Defining agents, roles, models, workspaces |
+| [task-lifecycle.md](references/task-lifecycle.md) | Designing task states, transitions, comments |
+| [communication.md](references/communication.md) | Setting up async/sync communication, artifact paths |
+| [patterns.md](references/patterns.md) | Implementing specific workflows (spec→build→test, parallel research, escalation) |
+
+## Common Pitfalls
+
+### Spawning without clear artifact output paths
+Agent produces great work, but you can't find it. Always specify the exact output path in the spawn prompt. Use a shared artifacts directory with predictable structure.
+
+### No review step = quality drift
+"It's a small change, skip review." Do this three times and you have compounding errors. Every artifact gets at least one set of eyes that didn't produce it.
+
+### Agents not commenting on task progress
+Silent agents create coordination blind spots. Require comments at: start, blocker, handoff, completion. If an agent goes silent, assume it's stuck.
+
+### Not verifying agent capabilities before assigning
+Assigning browser-based testing to an agent without browser access. Assigning image work to a text-only model. Check capabilities before routing.
+
+### Orchestrator doing execution work
+The orchestrator routes and tracks — it doesn't build. The moment you start "just quickly doing this one thing," you've lost oversight of the rest of the team.
+
+## When NOT to Use This Skill
+
+- **Single-agent setups** — Just follow standard AGENTS.md conventions. Team orchestration adds overhead that solo agents don't need.
+- **One-off task delegation** — Use `sessions_spawn` directly. This skill is for sustained workflows with multiple handoffs.
+- **Simple question routing** — If you're just forwarding a question to a specialist, that's a message, not a workflow.
+
+This skill is for **sustained team workflows** — recurring collaboration patterns where agents depend on each other's output over multiple tasks.
diff --git a/_meta.json b/_meta.json
new file mode 100644
index 0000000..7517d20
--- /dev/null
+++ b/_meta.json
@@ -0,0 +1,6 @@
+{
+  "ownerId": "kn77yy30hx6jk3x3j2dwc9tj3d808mp4",
+  "slug": "agent-team-orchestration",
+  "version": "1.0.0",
+  "publishedAt": 1770912001303
+}
\ No newline at end of file
diff --git a/references/communication.md b/references/communication.md
new file mode 100644
index 0000000..e1b7918
--- /dev/null
+++ b/references/communication.md
@@ -0,0 +1,110 @@
+# Communication
+
+How agents coordinate: sync vs async, spawning vs messaging, and artifact sharing.
+
+## Communication Channels
+
+### Shared Files (Primary — Async)
+
+The default communication method. Persistent, auditable, no timing dependency.
+
+```
+/shared/
+├── specs/ — Requirements, research, analysis
+├── artifacts/ — Build outputs, deliverables
+├── reviews/ — Review notes and feedback
+├── decisions/ — Architecture and product decisions
+```
+
+**Use for:** Deliverables, specs, reviews, decisions — anything another agent needs to find later.
+
+### Task Comments (Async)
+
+Attached to specific tasks. Chronological record of progress.
+
+**Use for:** Status updates, blockers, handoff messages, review feedback.
+
+### sessions_send (Sync — Urgent)
+
+Direct message to a running agent session. Interrupts their current work.
+
+**Use for:**
+- Urgent priority changes ("Drop everything, critical bug")
+- Quick questions that block progress ("Is feature X in scope?")
+- Coordination that can't wait for task comment review
+
+**Don't use for:**
+- Routine updates (use task comments)
+- Delivering artifacts (use shared files)
+- Anything the agent needs to reference later (messages are ephemeral)
+
+## Spawn vs Send
+
+### Spawn a new sub-agent when:
+- The task is self-contained with clear inputs and outputs
+- You want isolation — the work shouldn't affect other running sessions
+- The task needs a different model or capability set
+- You're parallelizing — multiple independent tasks at once
+
+### Send to an existing session when:
+- The agent is already working on related context
+- You need a quick answer, not a full task execution
+- The work is a small addition to something already in progress
+
+**Default to spawn.** It's cleaner. Send is for exceptions.
+
+## Spawn Prompt Template
+
+Every spawn includes:
+
+```markdown
+## Task: [Title]
+**Task ID:** [ID]
+**Role:** [What this agent is]
+**Priority:** [High/Medium/Low]
+
+### Context
+[What the agent needs to know]
+
+### Deliverables
+[Exactly what to produce]
+
+### Output Path
+[Exact directory/file path for artifacts]
+
+### Handoff
+When complete:
+1. Write artifacts to [output path]
+2. Comment on task with handoff summary
+3. Include: what was done, how to verify, known issues
+```
+
+**Critical fields:**
+- **Output Path** — Without this, you'll lose the work. Always specify.
+- **Handoff instructions** — Tell the agent exactly how to signal completion.
+
+## Artifact Conventions
+
+### Naming
+```
+/shared/artifacts/[task-id]-[short-name]/
+/shared/specs/[date]-[topic].md
+/shared/decisions/[date]-[title].md
+/shared/reviews/[task-id]-review.md
+```
+
+### Rules
+- All deliverables go to `/shared/` — never to personal agent workspaces
+- One directory per task for multi-file outputs
+- Include a brief README or summary at the top of the artifact directory if it contains 3+ files
+- Overwrite previous versions in place — don't create v2, v3 copies
+
+## Avoiding Communication Failures
+
+**Silent agents:** If an agent doesn't comment within its expected timeframe, assume it's stuck. Check on it or restart the task.
+
+**Lost artifacts:** Always verify the output path exists after a task completes. Agents sometimes write to wrong directories.
+
+**Context gaps:** When spawning, include all context the agent needs. Don't assume it can read other agent sessions or recent conversations. Shared files are the bridge.
+
+**Message timing:** `sessions_send` only works if the target session is active. If unsure, spawn a new session instead.
diff --git a/references/patterns.md b/references/patterns.md
new file mode 100644
index 0000000..c3de252
--- /dev/null
+++ b/references/patterns.md
@@ -0,0 +1,141 @@
+# Patterns
+
+Proven multi-agent workflows. Copy and adapt.
+
+## Spec → Review → Build → Test
+
+The full quality loop. Use for any non-trivial feature.
+
+```
+1. Orchestrator creates task, assigns to Spec Writer
+2. Spec Writer produces spec at /shared/specs/[task]-spec.md
+3. Orchestrator assigns spec review to Builder (feasibility check)
+4. Builder reviews: "feasible" / "change X because Y"
+5. If changes needed → back to Spec Writer → re-review
+6. Orchestrator assigns build to Builder
+7. Builder produces artifacts at /shared/artifacts/[task]/
+8. Orchestrator assigns review to Reviewer
+9. Reviewer approves or returns with feedback
+10. If returned → Builder fixes → re-review
+11. Orchestrator marks Done, reports to stakeholders
+```
+
+**Key:** The person who writes the spec doesn't review the build. The person who builds doesn't approve their own work. Cross-role verification is the whole point.
+
+### Minimal version (2 agents):
+```
+1. Orchestrator writes brief spec
+2. Builder implements
+3. Orchestrator reviews output
+4. Done or return for fixes
+```
+
+## Parallel Research
+
+Multiple agents research independently, then merge. Use for broad investigation.
+
+```
+1. Orchestrator defines research question + splits into angles
+2. Spawn Agent A: "Research [angle 1], write findings to /shared/specs/research-[topic]-a.md"
+3. Spawn Agent B: "Research [angle 2], write findings to /shared/specs/research-[topic]-b.md"
+4. Wait for both to complete
+5. Orchestrator (or designated agent) merges into /shared/specs/research-[topic]-final.md
+6. Use merged research to inform next decision
+```
+
+**Rules:**
+- Define non-overlapping angles to avoid duplicate work
+- Set a time/scope limit per agent — research expands to fill available time
+- The merge step is mandatory — raw research without synthesis is useless
+
+## Escalation
+
+Agent hits a blocker it can't resolve. Structured escalation prevents stalling.
+
+```
+1. Agent comments on task: "Blocked: [specific problem]"
+2. Agent continues with other work if possible (don't idle)
+3. Orchestrator sees blocker, decides:
+   a. Resolve directly (answer the question, provide access)
+   b. Reassign to a more capable agent
+   c. Escalate to human stakeholder
+   d. Deprioritize/defer the task
+4. Orchestrator comments decision and unblocks or reassigns
+```
+
+**Escalation triggers:**
+- Missing access or credentials
+- Ambiguous requirements that need product decisions
+- Technical blocker outside agent's expertise
+- Task exceeds estimated scope by 2x+
+
+**Anti-pattern:** Agent silently struggling for 30 minutes instead of escalating after 10. Set the expectation: escalate early, escalate with context.
+
+## Cron-Based Ops
+
+Scheduled tasks for team health. Assign to the cheapest reliable agent.
+
+### Daily Standup
+```
+Schedule: Every morning
+Agent: Ops
+
+1. Read all open tasks
+2. Check for stale tasks (no comment in 24h+)
+3. Check for overdue tasks
+4. Produce standup summary:
+   - What completed yesterday
+   - What's in progress
+   - What's blocked
+   - What's stale
+5. Post to orchestrator or team channel
+```
+
+### Task Dispatch
+```
+Schedule: Every few hours (or on trigger)
+Agent: Orchestrator
+
+1. Check inbox for new tasks
+2. Prioritize by urgency/importance
+3. Match to available agents (check capabilities)
+4. Assign and spawn
+```
+
+### Health Check
+```
+Schedule: Periodic
+Agent: Ops
+
+1. Verify shared directories exist and are writable
+2. Check for orphaned tasks (assigned but no agent session)
+3. Check for artifact path conflicts
+4. Report anomalies to orchestrator
+```
+
+## Batch Processing
+
+Multiple similar tasks that can run in parallel.
+
+```
+1. Orchestrator creates N tasks from a list
+2. Spawn up to M agents in parallel (M ≤ concurrency limit)
+3. Each agent picks one task, completes it, writes output
+4. Orchestrator collects results as agents finish
+5. Spawn next batch if more tasks remain
+6. Final aggregation once all tasks complete
+```
+
+**Sizing:** Start with 2-3 parallel agents. More isn't always faster — coordination overhead grows.
+
+## Review Rotation
+
+Prevent review fatigue and bias by rotating reviewers.
+
+```
+Task produced by Agent A → Reviewed by Agent B
+Task produced by Agent B → Reviewed by Agent C
+Task produced by Agent C → Reviewed by Agent A
+```
+
+**Why:** Same reviewer for the same builder creates blind spots. Rotation catches different things.
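Editorial sketch, not part of the commit: the Review Rotation pattern that closes references/patterns.md can be illustrated in a few lines of Python. The function name `review_rotation` is hypothetical, and pairing each agent with the next one in list order is an assumption that reproduces the A → B, B → C, C → A example.

```python
def review_rotation(agents: list[str]) -> dict[str, str]:
    """Map each producing agent to the agent that reviews its work.

    Assumption: the reviewer is simply the next agent in list order,
    wrapping around at the end, so nobody reviews their own output.
    """
    if len(agents) < 2:
        raise ValueError("rotation needs at least two agents")
    if len(set(agents)) != len(agents):
        raise ValueError("agent names must be unique")
    return {
        producer: agents[(index + 1) % len(agents)]
        for index, producer in enumerate(agents)
    }
```

Calling `review_rotation(["A", "B", "C"])` yields the same pairing as the fenced example. Shuffling or rotating the input list between cycles would be one way to vary who reviews whom over time.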
diff --git a/references/task-lifecycle.md b/references/task-lifecycle.md
new file mode 100644
index 0000000..54e5f3a
--- /dev/null
+++ b/references/task-lifecycle.md
@@ -0,0 +1,129 @@
+# Task Lifecycle
+
+Task states, transitions, comment conventions, and decision logging.
+
+## States
+
+```
+Inbox → Assigned → In Progress → Review → Done | Failed
+```
+
+| State | Meaning | Owner |
+|-------|---------|-------|
+| **Inbox** | New task, unassigned | Orchestrator |
+| **Assigned** | Agent selected, not yet started | Orchestrator |
+| **In Progress** | Agent actively working | Assigned agent |
+| **Review** | Work complete, awaiting verification | Reviewer |
+| **Done** | Verified and shipped | Orchestrator |
+| **Failed** | Abandoned with documented reason | Orchestrator |
+
+## Transition Rules
+
+**Orchestrator transitions:**
+- Inbox → Assigned (picks the agent)
+- Assigned → In Progress (spawns the agent or sends the task)
+- Review → Done (accepts the deliverable)
+- Any state → Failed (with reason)
+
+**Agents transition:**
+- In Progress → Review (submits deliverable with handoff comment)
+
+**Reviewers transition:**
+- Review → In Progress (returns with feedback — agent must address it)
+- Review → Done (approves — orchestrator confirms)
+
+**Never skip Review.** The orchestrator may override for trivial tasks, but document it.
+
+## Comment Conventions
+
+Every state change gets a comment. Format:
+
+```
+[Agent] [Action]: [Details]
+```
+
+### Required comments:
+
+**Starting work:**
+```
+[Builder] Starting: Picking up auth module. Questions: Should rate limiting be per-user or per-IP?
+```
+
+**Blocker found:**
+```
+[Builder] Blocked: Need API credentials for the payment gateway. Who has access?
+```
+
+**Submitting for review:**
+```
+[Builder] Handoff: Auth module complete at /shared/artifacts/auth/.
+- Added JWT validation middleware
+- Tests at /shared/artifacts/auth/tests/
+- Run `npm test -- --grep auth` to verify
+- Known issue: refresh token rotation not implemented (out of scope per spec)
+- Next: Reviewer checks error handling paths
+```
+
+**Review feedback:**
+```
+[Reviewer] Feedback: Two issues found.
+1. Missing input validation on email field — SQL injection risk
+2. Error messages expose internal paths in production mode
+Returning to builder. Fix both, then resubmit.
+```
+
+**Completion:**
+```
+[Reviewer] Approved: All issues addressed. Auth module ready to ship.
+```
+
+**Failure:**
+```
+[Orchestrator] Failed: Deprioritized — superseded by new auth provider integration. Preserving spec at /shared/specs/auth-v1.md for reference.
+```
+
+## Decision Logging
+
+Architecture or product decisions made during task execution go in a shared decisions directory.
+
+```markdown
+# Decision: [Title]
+**Date:** YYYY-MM-DD
+**Author:** [Agent]
+**Status:** Proposed | Accepted | Rejected
+**Task:** [Task ID if applicable]
+
+## Context
+Why this decision came up.
+
+## Options Considered
+1. Option A — tradeoffs
+2. Option B — tradeoffs
+
+## Decision
+What was chosen and why.
+
+## Consequences
+What changes as a result.
+```
+
+**When to log a decision:**
+- Choosing between two valid architectural approaches
+- Changing a spec during implementation
+- Rejecting a requirement as infeasible
+- Any choice that future agents will wonder "why did we do it this way?"
+
+## Multi-Step Task Workflows
+
+Complex tasks split into sub-tasks. Track the parent relationship:
+
+```
+Task #12: Build user dashboard
+  ├── #12a: Write spec (Assigned: Spec writer)
+  ├── #12b: Review spec (Assigned: Builder — feasibility check)
+  ├── #12c: Build frontend (Assigned: Builder)
+  ├── #12d: Build API endpoints (Assigned: Builder)
+  └── #12e: Integration test (Assigned: Reviewer)
+```
+
+The orchestrator tracks the parent task and only marks it Done when all sub-tasks complete.
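Editorial sketch, not part of the commit: the transition rules, the comment-per-transition rule, and the parent/sub-task rule from references/task-lifecycle.md could be enforced by a small state machine along these lines. The `Task` class, its field names, and the `ALLOWED` table are hypothetical. Three readings are assumptions and are marked in the comments: Done and Failed are treated as final, a sub-task counts as complete only when it is Done, and the new state name is used as the `[Action]` part of the comment.

```python
from dataclasses import dataclass, field

# Forward transitions taken from the "Transition Rules" section.
ALLOWED = {
    "Inbox": {"Assigned"},
    "Assigned": {"In Progress"},
    "In Progress": {"Review"},
    "Review": {"In Progress", "Done"},
}
# Assumption: Done and Failed are final, so "Any state -> Failed"
# is read here as "any non-final state".
TERMINAL = {"Done", "Failed"}


@dataclass
class Task:
    task_id: str
    state: str = "Inbox"
    comments: list = field(default_factory=list)
    subtasks: list = field(default_factory=list)

    def transition(self, new_state: str, actor: str, details: str) -> None:
        """Move to new_state and record a comment for the change."""
        if not details.strip():
            raise ValueError("every transition needs a comment")
        if self.state in TERMINAL:
            raise ValueError(f"{self.state} is a final state")
        if new_state != "Failed" and new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state} -> {new_state} is not allowed")
        # Assumption: a sub-task counts as complete only when it is Done.
        if new_state == "Done" and any(s.state != "Done" for s in self.subtasks):
            raise ValueError("parent cannot be Done before its sub-tasks")
        self.state = new_state
        # Assumption: the new state name stands in for [Action].
        self.comments.append(f"[{actor}] {new_state}: {details}")
```

Keeping the check in one `transition` method fits the rule that the orchestrator owns state changes: whatever calls it has to supply an actor and a non-empty comment, so a silent transition fails loudly instead of slipping through.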
diff --git a/references/team-setup.md b/references/team-setup.md
new file mode 100644
index 0000000..c8e5feb
--- /dev/null
+++ b/references/team-setup.md
@@ -0,0 +1,105 @@
+# Team Setup
+
+How to define agents, assign roles, select models, and isolate workspaces.
+
+## Define Roles First, Then Agents
+
+Start with the work, not the agents. List the types of work, then create roles to cover them.
+
+**Minimal team (2 agents):**
+```
+Orchestrator — routes tasks, tracks state
+Builder — executes work
+```
+
+**Standard team (3-4 agents):**
+```
+Orchestrator — routes, prioritizes, reports to stakeholders
+Builder — produces artifacts (code, docs, configs)
+Reviewer — verifies quality, catches gaps
+Ops — scheduled tasks, health checks, mechanical work
+```
+
+**Rule:** One agent, one primary role. An agent can do secondary work, but its role determines what it's optimized for.
+
+## Model Selection Per Role
+
+Match model cost to the cognitive demands of the role.
+
+| Role | Needs | Model tier |
+|------|-------|-----------|
+| Orchestrator | Judgment, prioritization, multi-context reasoning | Top tier (e.g., Claude Opus, GPT-4.5) |
+| Builder | Code generation, following specs, producing artifacts | Mid-to-top tier depending on complexity |
+| Reviewer | Critical analysis, catching edge cases, feasibility | Top tier — reviewers catch what builders miss |
+| Ops | Following templates, running scripts, dispatching | Cheapest reliable model (e.g., GPT-4o-mini, Haiku) |
+
+**Don't waste expensive models on mechanical work.** Cron-based standups, file organization, and template-following tasks don't need frontier reasoning.
+
+## Workspace Isolation
+
+Each agent operates in its own workspace to prevent interference.
+
+```
+/workspace/
+├── agents/
+│   ├── builder/ — Builder's personal workspace
+│   │   └── SOUL.md — Builder's identity and instructions
+│   ├── reviewer/ — Reviewer's personal workspace
+│   │   └── SOUL.md
+│   └── ops/
+│       └── SOUL.md
+├── shared/ — Shared across all agents
+│   ├── specs/ — Requirements and specifications
+│   ├── artifacts/ — Build outputs
+│   ├── reviews/ — Review notes and feedback
+│   └── decisions/ — Architecture and product decisions
+```
+
+**Rules:**
+- Agents read/write their own workspace freely
+- Agents write deliverables to `/shared/` — never to personal workspaces
+- Agents can read any shared directory
+- Orchestrator can read all workspaces for oversight
+
+## Identity Files (SOUL.md)
+
+Each agent gets a SOUL.md that defines:
+
+1. **Role and scope** — What this agent does and doesn't do
+2. **Communication style** — How it writes comments, reports, asks questions
+3. **Boundaries** — What requires escalation vs. autonomous action
+4. **Team context** — Who else is on the team and how to interact with them
+
+Example SOUL.md for a builder agent:
+
+```markdown
+# SOUL.md — Builder
+
+I build what the specs say. My job is execution, not product decisions.
+
+## Scope
+- Implement features per approved specs
+- Write tests for what I build
+- Document non-obvious decisions in code comments
+- Hand off with clear verification steps
+
+## Boundaries
+- Spec unclear? Ask the orchestrator, don't guess
+- Architecture change needed? Propose it, don't just do it
+- Blocked for >10 minutes? Comment on the task and move on
+
+## Handoff Format
+Every completed task includes:
+1. What I changed and why
+2. File paths for all artifacts
+3. How to test/verify
+4. Known limitations
+```
+
+## Adding a New Agent
+
+1. Create the workspace directory
+2. Write its SOUL.md
+3. Update the team protocol with its role
+4. Verify it has the capabilities it needs (browser, tools, API access)
+5. Start with a small task to validate the setup before loading it into the rotation
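
Editorial sketch, not part of the commit: steps 1, 2, and 4 of "Adding a New Agent" can be scripted roughly as follows. The helper names `scaffold_agent` and `missing_capabilities`, the `agents/<name>/` layout under a caller-supplied root, and the capability labels are all assumptions made for illustration; the source does not say how capabilities are recorded or checked.

```python
from pathlib import Path


def scaffold_agent(root: Path, name: str, soul_text: str) -> Path:
    """Create agents/<name>/ under root and write its SOUL.md (steps 1-2).

    Assumption: root plays the role of /workspace/ in the layout above.
    """
    workspace = root / "agents" / name
    workspace.mkdir(parents=True, exist_ok=True)
    soul_path = workspace / "SOUL.md"
    soul_path.write_text(soul_text, encoding="utf-8")
    return soul_path


def missing_capabilities(required: set[str], available: set[str]) -> set[str]:
    """Return the capabilities the agent still lacks (step 4).

    Assumption: capabilities are plain string labels such as "browser".
    """
    return required - available
```

An empty result from `missing_capabilities` would be the signal to move on to step 5 and hand the agent a small validation task; a non-empty one names exactly what to fix before routing work to it.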