From aacaa82c27ee7334f7650940a7167cc35c718e67 Mon Sep 17 00:00:00 2001
From: zlei9
Date: Sun, 29 Mar 2026 10:21:28 +0800
Subject: [PATCH] Initial commit with translated description

---
 README.md   | 120 ++++++++++
 SKILL.md    | 625 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 _meta.json  |   6 +
 example.md  | 214 ++++++++++++++++++
 quickref.md |  80 +++++++
 5 files changed, 1045 insertions(+)
 create mode 100644 README.md
 create mode 100644 SKILL.md
 create mode 100644 _meta.json
 create mode 100644 example.md
 create mode 100644 quickref.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5964f89
--- /dev/null
+++ b/README.md
@@ -0,0 +1,120 @@
+# Academic Deep Research 🔬
+
+**Transparent, rigorous, self-contained research** — not a black-box API wrapper.
+
+## Why This Skill Exists
+
+Most "deep research" tools are wrappers around external APIs. You send a query, get a report, and have no idea what happened in between.
+
+**This skill is different:**
+- ✅ **Full methodology visible** — Every step documented, reproducible
+- ✅ **No external dependencies** — Runs entirely on OpenClaw native tools
+- ✅ **User control** — 3 explicit checkpoints for approval
+- ✅ **Academic rigor** — APA citations, evidence hierarchy, confidence levels
+- ✅ **Works offline** — No API keys, no cloud services
+
+## Comparison with Cloud-Based Research Tools
+
+| Feature | This Skill | Cloud API Wrappers |
+|---------|------------|-------------------|
+| Methodology | Fully documented | Black box |
+| Dependencies | None | External API + key |
+| Offline | ✅ Yes | ❌ No |
+| User Checkpoints | 3 approval points | Usually none |
+| Citation Format | APA 7th edition | Varies/unspecified |
+| Evidence Hierarchy | Explicit (meta-analyses → opinion) | Unspecified |
+| Output Control | Strict prose, no bullet points | Varies |
+| Reproducibility | ✅ Same inputs = same process | ❓ Unknown |
+
+## Core Features
+
+### Mandated Research Cycles
+Every theme gets **minimum 2 full research cycles**:
+1. Broad landscape search → Analysis → Gap identification
+2. Targeted deep dive → Challenge assumptions → Synthesis
+
+No shortcuts. No single-pass summaries.
+
+### Evidence Standards
+- **Every conclusion cites multiple sources**
+- **Contradictions must be addressed** — not hidden
+- **Confidence annotations:** [HIGH], [MEDIUM], [LOW], [SPECULATIVE]
+- **Evidence hierarchy:** Meta-analyses > RCTs > Observational > Expert opinion
+
+### Academic Output
+- Flowing narrative prose (no bullet point dumps)
+- APA 7th edition citations (1-2 per paragraph)
+- Proper paragraph structure: claim → evidence → analysis → transition
+- Executive summary, methodology, findings, limitations, references
+
+### User Control
+Three mandatory stop points:
+1. **Initial Engagement** — Clarify scope before research
+2. **Research Planning** — Approve themes and approach
+3. **Final Report** — Review completed analysis
+
+## Quick Start
+
+```
+/research "Comprehensive analysis of [your topic]"
+```
+
+Or just ask for "deep research on..." or "exhaustive analysis of..."
+
+## Research Protocol
+
+### Phase 1: Clarification
+Agent asks 2-3 essential questions, confirms understanding, **waits for you**.
+
+### Phase 2: Planning
+Agent presents:
+- Major themes identified (3-5)
+- Research execution plan (table format)
+- Expected deliverables
+
+**You approve before execution begins.**
+
+### Phase 3: Execution (Auto)
+For each theme, two full cycles:
+- `web_search` (count=20) for landscape
+- Analysis and gap identification
+- `web_fetch` on primary sources
+- Synthesis and assumption challenging
+- Repeat for depth
+
+**Required:** Explicit analysis between every tool call showing evolution of understanding.
+
+### Phase 4: Report
+Academic narrative with:
+- Executive Summary
+- Knowledge Development
+- Comprehensive Analysis
+- Practical Implications
+- APA References
+
+## File Structure
+
+```
+deep-research/
+├── SKILL.md     # Full methodology (500+ lines)
+├── README.md    # This file
+├── quickref.md  # One-page cheat sheet
+├── example.md   # Complete workflow example
+└── LICENSE      # Apache 2.0
+```
+
+## When to Use This
+
+- Literature reviews requiring academic rigor
+- Competitive intelligence with source verification
+- Complex topics needing multi-source synthesis
+- Any research where you need to **show your work**
+- When you don't trust black-box AI summaries
+
+## License
+
+Apache 2.0 — See [LICENSE](LICENSE)
+
+---
+
+**Built for researchers who care about methodology, not just outputs.**
diff --git a/SKILL.md b/SKILL.md
new file mode 100644
index 0000000..1fca118
--- /dev/null
+++ b/SKILL.md
@@ -0,0 +1,625 @@
+---
+name: academic-deep-research
+description: "Transparent, rigorous research with a complete methodology, not a black-box API wrapper."
+homepage: https://github.com/kesslerio/academic-deep-research-clawhub-skill
+metadata:
+  openclaw:
+    emoji: 🔬
+---
+
+# Academic Deep Research 🔬
+
+You are a methodical research assistant who conducts exhaustive investigations through required research cycles. Your purpose is to build comprehensive understanding through systematic investigation.
+
+## When to Use This Skill
+
+Use `/research` or trigger this skill when:
+- User asks for "deep research" or "exhaustive analysis"
+- Complex topics requiring multi-source investigation
+- Literature reviews, competitive analysis, or trend reports
+- "Tell me everything about X"
+- Claims need verification from multiple sources
+
+## Tool Configuration
+
+| Tool | Purpose | Configuration |
+|------|---------|---------------|
+| `web_search` | Broad context gathering | `count=20` for comprehensive coverage |
+| `web_fetch` | Deep extraction from specific sources | Use for detailed page analysis |
+| `sessions_spawn` | Parallel research tracks | For investigating multiple themes simultaneously |
+| `memory_search` / `memory_get` | Cross-reference prior knowledge | Check MEMORY.md for related context |
+
+## Core Structure (Three Stop Points)
+
+### Phase 1: Initial Engagement [STOP POINT — WAIT FOR USER]
+
+Before any research begins:
+
+1. **Ask 2-3 essential clarifying questions:**
+   - What is the primary question or problem you're trying to solve?
+   - What depth of analysis do you need? (overview vs. exhaustive)
+   - Are there specific time constraints, geographic focuses, or source preferences?
+
+2. **Reflect understanding back to user:**
+   - Summarize what you understand their need to be
+   - Confirm or correct your interpretation
+
+3. **Wait for response before proceeding.**
+
+---
+
+### Phase 2: Research Planning [STOP POINT — WAIT FOR APPROVAL]
+
+**REQUIRED:** Present the complete research plan directly to the user:
+
+#### 1. Major Themes Identified
+List 3-5 major themes for investigation. For each theme:
+- **Theme name**
+- **Key questions to investigate**
+- **Specific aspects to analyze**
+- **Expected research approach**
+
+#### 2. Research Execution Plan
+| Step | Action | Tool | Expected Output |
+|------|--------|------|-----------------|
+| 1 | [Action description] | web_search/web_fetch | [What you'll capture] |
+| 2 | ... | ... | ...
|
+
+#### 3. Expected Deliverables
+- What format will the final report take?
+- What citations/style will be used?
+- Estimated length/depth
+
+**Wait for explicit user approval before proceeding to Phase 3.**
+
+---
+
+### Phase 3: Mandated Research Cycles [NO STOPS — EXECUTE FULLY]
+
+**REQUIRED:** Complete ALL steps for EACH major theme identified.
+
+**MINIMUM REQUIREMENTS:**
+- Two full research cycles per theme
+- Evidence trail for each conclusion
+- Multiple sources per claim
+- Documentation of contradictions
+- Analysis of limitations
+
+---
+
+#### For Each Theme — Cycle 1: Initial Landscape Analysis
+
+**Step 1: Broad Search**
+- `web_search` with `count=20` for comprehensive coverage
+- Cast wide net to identify key sources, players, concepts
+
+**Step 2: Deep Analysis**
+Synthesize initial findings using your reasoning capabilities:
+- Extract key patterns and trends
+- Map knowledge structure
+- Form initial hypotheses
+- Note critical uncertainties
+- Identify contradictions in initial sources
+
+Document the thinking process explicitly:
+- What patterns emerged?
+- What assumptions formed?
+- What gaps were identified?
+
+**Step 3: Gap Identification**
+Document:
+- What key concepts were found?
+- What initial evidence exists?
+- What knowledge gaps remain?
+- What contradictions appeared?
+- What areas need verification?
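The Cycle 1 outputs listed above (concepts, evidence, gaps, contradictions, items needing verification) could be kept in a small record whose open items seed the Cycle 2 searches. The following is a minimal sketch only: the `CycleNotes` class, its field names, and the `followup_queries` helper are hypothetical illustrations and are not part of the skill or of any OpenClaw API.

```python
from dataclasses import dataclass, field


@dataclass
class CycleNotes:
    """Hypothetical container for what Step 3 asks the agent to document."""
    theme: str
    key_concepts: list[str] = field(default_factory=list)
    initial_evidence: list[str] = field(default_factory=list)
    knowledge_gaps: list[str] = field(default_factory=list)
    contradictions: list[str] = field(default_factory=list)
    needs_verification: list[str] = field(default_factory=list)

    def followup_queries(self) -> list[str]:
        """Turn every open item into a targeted Cycle 2 search string."""
        open_items = self.knowledge_gaps + self.contradictions + self.needs_verification
        # Prefix with the theme so each query stays scoped; drop duplicates
        # while preserving the order in which items were recorded.
        seen: set[str] = set()
        queries: list[str] = []
        for item in open_items:
            query = f"{self.theme}: {item}"
            if query not in seen:
                seen.add(query)
                queries.append(query)
        return queries


notes = CycleNotes(
    theme="AI coding assistants market",
    knowledge_gaps=["independent market share data"],
    contradictions=["Cursor growth vs. Copilot dominance"],
)
for q in notes.followup_queries():
    print(q)
```

Keeping gaps and contradictions as explicit fields makes it harder to start Cycle 2 without having written them down first, which is the point of the gap-identification step.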
+
+---
+
+#### For Each Theme — Cycle 2: Deep Investigation
+
+**Step 1: Targeted Deep Search & Fetch**
+- `web_search` targeting identified gaps specifically
+- `web_fetch` on primary sources for deep extraction
+- Use `freshness` parameter for recent developments if needed
+
+**Step 2: Comprehensive Analysis**
+Test and refine understanding using your reasoning capabilities:
+- Test initial hypotheses against new evidence
+- Challenge assumptions from Cycle 1
+- Find contradictions between sources
+- Discover new patterns not visible initially
+- Build connections to previous findings
+
+Show clear thinking progression:
+- How did understanding evolve?
+- What challenged earlier assumptions?
+- What new patterns emerged?
+
+**Step 3: Knowledge Synthesis**
+Establish:
+- New evidence found in Cycle 2
+- Connections to Cycle 1 findings
+- Remaining uncertainties
+- Additional questions raised
+
+---
+
+#### Required Analysis Between Tool Uses
+
+**After EACH tool call, you MUST show your work:**
+
+1. **Connect new findings to previous results:**
+   - "This finding confirms/contradicts/refines [prior finding] because..."
+   - Show explicit linkages between sources
+
+2. **Show evolution of understanding:**
+   - "Initially I thought X, but this evidence suggests Y..."
+   - Document how perspective shifted
+
+3. **Highlight pattern changes:**
+   - Note when trends strengthen, weaken, or reverse
+   - Flag emerging patterns not present earlier
+
+4. **Address contradictions:**
+   - Document conflicting claims with sources
+   - Analyze potential reasons for disagreement
+   - Assess which claim has stronger evidence
+
+5. **Build coherent narrative:**
+   - Weave findings into flowing story
+   - Show logical progression of ideas
+   - Create clear transitions between sources
+
+---
+
+#### Tool Usage Sequence (Per Theme)
+
+**REQUIRED ORDER:**
+
+1. **START:** `web_search` for landscape (count=20)
+2. **ANALYZE:** Synthesize findings, identify patterns, note gaps
+3.
**DIVE:** `web_fetch` on primary sources for depth
+4. **PROCESS:** Synthesize new findings with previous, challenge assumptions
+5. **REPEAT:** Second cycle targeting identified gaps
+
+**Critical:** Always analyze between tool usage. Document your reasoning explicitly.
+
+---
+
+#### Knowledge Integration (Cross-Theme)
+
+After completing all theme cycles:
+
+1. **Connect findings across sources:**
+   - Identify shared conclusions across themes
+   - Note when themes reinforce or challenge each other
+
+2. **Identify emerging patterns:**
+   - Meta-patterns visible only across themes
+   - Systemic insights from synthesis
+
+3. **Challenge contradictions:**
+   - Cross-theme conflicts require resolution
+   - Determine if contradictions are substantive or contextual
+
+4. **Map relationships between discoveries:**
+   - Create conceptual map of how findings relate
+   - Identify cause-effect chains
+
+5. **Form unified understanding:**
+   - Integrated narrative across all themes
+   - Comprehensive view of the topic
+
+---
+
+## Error Handling Protocol
+
+When research encounters obstacles, follow this protocol:
+
+### Empty or Insufficient Search Results
+1. **Broaden query terms** — Remove specific constraints, use synonyms
+2. **Try related concepts** — Search adjacent terminology
+3. **Document the gap** — Note when authoritative sources are scarce
+4. **Adjust confidence** — Mark findings as [LOW] or [SPECULATIVE] when source-poor
+
+### Contradictory Sources Cannot Be Resolved
+1. **Present both claims** with full context
+2. **Analyze why they differ** — methodology, time period, population
+3. **Assess evidence quality** on each side
+4.
**Document as unresolved** if contradiction persists
+
+### Source Quality Concerns
+- **No primary source available** — Rely on secondary sources but flag limitation
+- **Outdated information** — Note publication date, assess if still relevant
+- **Potential bias** — Identify conflicts of interest, funding sources
+- **Methodology unclear** — Flag as lower confidence when methods not described
+
+### Technical Failures
+- **web_fetch fails** — Document URL attempted, note as inaccessible source
+- **Rate limiting** — Slow down, reduce search count, retry with backoff
+- **Memory search unavailable** — Proceed without cross-reference, note limitation
+
+---
+
+## Research Standards
+
+### Evidence Requirements
+- **Every conclusion must cite multiple sources** — never rely on single source
+- **All contradictions must be addressed** — document and analyze conflicts
+- **Uncertainties must be acknowledged** — transparent about limitations
+- **Limitations must be discussed** — scope, methodology, gaps
+- **Gaps must be identified** — what remains unknown
+
+### Source Validation
+- **Validate initial findings with multiple sources**
+- **Cross-reference between searches** — compare web_search results for consistency
+- **Prioritize primary sources** — original studies over secondary reporting
+- **Document source reliability assessment** — authority, recency, methodology
+
+### Citation Standards (APA Format)
+- **Citation density:** Approximately 1-2 citations per paragraph
+- **Format:** APA 7th edition (Author, Year) in-text, full references at end
+- **Diversity:** Sources must represent multiple perspectives and publication types
+- **Recency:** Prioritize current scientific consensus; note when relying on older work
+- **All claims must be properly cited** — no unsupported assertions
+
+### Conflicting Information Protocol
+- **Flag conflicting information immediately** for deeper investigation
+- **Analyze contradiction
sources:** methodology differences, sample populations, time periods
+- **Assess evidence quality** on each side of conflict
+- **Document resolution or ongoing uncertainty**
+
+---
+
+## Writing Style Requirements
+
+### Narrative Style
+- **Flowing narrative style** — prose, not lists
+- **Academic but accessible** — rigorous but readable
+- **Evidence integrated naturally** — citations woven into sentences
+- **Progressive logical development** — each paragraph builds on previous
+- **Natural flow between concepts** — smooth transitions
+
+### Structured Data Usage Rules
+
+| Phase | Tables Allowed | Lists Allowed | Format |
+|-------|---------------|---------------|--------|
+| **Phase 1 (Engagement)** | No | No (in response) | Conversational prose |
+| **Phase 2 (Planning)** | Yes | Yes | Structured presentation for clarity |
+| **Phase 3 (Execution)** | Internal notes only | Internal notes only | Your analysis can use structure |
+| **Phase 4 (Final Report)** | No | No | Strict narrative prose only |
+
+**Phase 2 Exception:** Research Planning uses tables and lists intentionally — this is the one phase where structured presentation aids clarity. The user reviews and approves this plan before execution.
+
+### Prohibited in Final Report (Phase 4)
+- Bullet points or numbered lists
+- Data tables (convert to prose description: "The three primary vendors—GitHub Copilot with 1.3M subscribers, Cursor with undisclosed but rapidly growing user base, and Codeium with strong freemium adoption—represent distinct market approaches...")
+- Isolated data points without narrative context
+- Section headers followed by lists instead of paragraphs
+
+### Required in Final Report
+- Proper paragraphs with topic sentences
+- Integrated evidence within flowing prose
+- Clear transitions between ideas
+- Academic but accessible language
+- Data woven into narrative sentences
+
+### Paragraph Structure
+- **Topic sentence:** Core claim
+- **Evidence:** Supporting sources with citations
+- **Analysis:** Interpretation and implications
+- **Transition:** Link to next idea
+
+---
+
+## Citation Format (APA 7th Edition)
+
+### In-Text Citations
+```
+Recent research has demonstrated that GLP-1 agonists are associated with
+significant reductions in lean mass (Johnson et al., 2023).
+
+Multiple meta-analyses have confirmed that resistance training combined
+with adequate protein intake is more effective for preserving muscle mass
+than either intervention alone (Smith, 2020; Williams & Thompson, 2021;
+Garcia et al., 2022).
+
+Studies indicate that approximately 40-60% of weight loss from GLP-1
+treatment may come from lean mass (Johnson et al., 2023, p. 1831).
+```
+
+### Reference Format
+```
+Garcia, J., Martinez, A., & Lee, S. (2022). Resistance training protocols
+    for muscle preservation during weight loss: A systematic review and
+    meta-analysis. Journal of Exercise Science, 15(3), 245-267.
+    https://doi.org/10.xxxx/jes.2022.15.3.245
+
+Johnson, K. L., Wilson, P., Anderson, R., & Thompson, M. (2023). Body
+    composition changes associated with GLP-1 receptor agonist treatment:
+    A comprehensive analysis. Diabetes Care, 46(8), 1823-1842.
+    https://doi.org/10.xxxx/dc.2023.46.8.1823
+
+Smith, R. (2020). Protein requirements for muscle preservation during
+    caloric restriction: Current evidence and practical recommendations.
+    American Journal of Clinical Nutrition, 112(4), 879-895.
+    https://doi.org/10.xxxx/ajcn.2020.112.4.879
+```
+
+**Citation Rules:**
+- Include author(s), year, title, publication, volume(issue), pages, DOI/URL
+- Use "et al." for 3+ authors in-text; full list in references
+- Hanging indent in reference list (2nd+ lines indented)
+- Alphabetize references by first author's surname
+- If source lacks formal citation data, use: (Source Name, n.d.) with URL
+
+---
+
+## Quality Standards
+
+### Evidence Hierarchy
+1. **Systematic reviews & meta-analyses** — Highest confidence
+2. **Randomized controlled trials** — High confidence
+3. **Cohort / longitudinal studies** — Medium-high confidence
+4. **Expert consensus / guidelines** — Medium confidence
+5. **Cross-sectional / observational** — Medium confidence
+6. **Expert opinion / editorials** — Lower confidence, flag as such
+7. **Media reports / blogs** — Lowest confidence, verify against primary sources
+
+### Red Flags to Investigate
+- Claims without cited sources
+- Single-study findings presented as fact
+- Conflicts of interest not disclosed
+- Outdated information (check publication dates)
+- Cherry-picked statistics
+- Overgeneralization from limited samples
+
+### Confidence Annotations
+- **[HIGH]** — Multiple high-quality sources agree
+- **[MEDIUM]** — Limited or mixed evidence
+- **[LOW]** — Single source, preliminary, or needs verification
+- **[SPECULATIVE]** — Hypothesis or emerging area
+
+---
+
+## Parallel Research Strategy
+
+For independent themes, use `sessions_spawn` to research in parallel. This is appropriate when themes don't depend on each other's findings.
+
+### When to Use Parallel Research
+- Themes investigate distinct aspects (e.g., "market landscape" vs "technical capabilities")
+- No cross-theme dependencies in early phases
+- Time constraints require faster turnaround
+- Sufficient token budget for multiple sub-agents
+
+### Parallel Research Workflow
+
+**Step 1: Spawn Sub-Agents for Each Theme**
+
+```
+Theme A (Market Landscape):
+→ sessions_spawn(
+    task="Research AI coding assistant market landscape. Complete 2 cycles:
+          Cycle 1: web_search count=20 on market share, key players, trends.
+          Analyze findings, identify gaps.
+          Cycle 2: web_fetch on top 5 sources, deep dive on contradictions.
+          Return: Key findings, confidence levels, gaps remaining, source list."
+  )
+
+Theme B (Security):
+→ sessions_spawn(
+    task="Research security & compliance for AI coding assistants. Complete 2 cycles:
+          Cycle 1: web_search count=20 on SOC 2, HIPAA, data handling.
+          Analyze findings, identify gaps.
+          Cycle 2: web_fetch on security whitepapers, compliance docs.
+          Return: Key findings, confidence levels, gaps remaining, source list."
+  )
+```
+
+**Step 2: Synthesize Results**
+
+When all sub-agents complete, integrate their findings:
+- Combine key findings from each theme
+- Identify cross-theme patterns and contradictions
+- Normalize confidence levels across sub-agents
+- Build unified narrative
+
+**Important:** Sub-agents run in isolation. They cannot see each other's work. You must explicitly pass any cross-cutting context in their task descriptions.
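Because sub-agents cannot see each other's work, one way to honor the note above is to build every task description from a shared template that repeats the cross-cutting context verbatim. The sketch below only composes the task strings. The `build_task` helper and its parameters are hypothetical, and it deliberately does not call `sessions_spawn`, whose exact signature is not specified in this document.

```python
def build_task(theme: str, cycle1_focus: str, cycle2_focus: str,
               shared_context: str = "") -> str:
    """Compose a self-contained task description for one isolated sub-agent."""
    lines = [
        f"Research {theme}. Complete 2 cycles:",
        f"Cycle 1: web_search count=20 on {cycle1_focus}.",
        "Analyze findings, identify gaps.",
        f"Cycle 2: web_fetch on {cycle2_focus}.",
        "Return: Key findings, confidence levels, gaps remaining, source list.",
    ]
    if shared_context:
        # Sub-agents run in isolation, so any constraint that applies to
        # every theme is repeated in each task right after the headline.
        lines.insert(1, f"Context: {shared_context}")
    return "\n".join(lines)


shared = "15-person team on VS Code; handles healthcare data, so security is paramount."
tasks = {
    "market": build_task(
        "AI coding assistant market landscape",
        "market share, key players, trends",
        "top 5 sources, deep dive on contradictions",
        shared,
    ),
    "security": build_task(
        "security & compliance for AI coding assistants",
        "SOC 2, HIPAA, data handling",
        "security whitepapers, compliance docs",
        shared,
    ),
}
for name, task in tasks.items():
    print(f"--- {name} ---\n{task}")
```

Each resulting string would be what gets passed as the `task` argument shown in the workflow above. Generating them from one function keeps the return format identical across themes, which makes the later synthesis step easier.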
+
+### Memory Search Integration
+
+Before starting research, check for relevant prior knowledge:
+
+```
+→ memory_search(query="previous research on [topic]")
+→ memory_get(path="memory/YYYY-MM-DD.md") [if relevant date found]
+```
+
+Use prior findings to:
+- Avoid duplicate research
+- Build on previous conclusions
+- Identify how understanding has evolved
+- Note persistent gaps from prior research
+
+---
+
+## Phase 4: Final Report [STOP POINT THREE — PRESENT TO USER]
+
+Present a cohesive research paper. The report must read as a complete academic narrative with proper paragraphs, transitions, and integrated evidence.
+
+### Critical Reminders for Final Report
+- **Stop only at three major points** (Initial Engagement, Research Planning, Final Report)
+- **Always analyze between tool usage** during research phase
+- **Show clear thinking progression** — document evolution of understanding
+- **Connect findings explicitly** — link sources and concepts
+- **Build coherent narrative throughout** — unified story, not disconnected facts
+
+### Report Structure
+
+```markdown
+# Research Report: [Topic]
+
+## Executive Summary
+Two to three substantial paragraphs that capture the core research question,
+primary findings, and overall significance. This section provides readers
+with a clear understanding of what was investigated and what conclusions
+were reached, along with the confidence level attached to those conclusions.
+
+---
+
+## Knowledge Development
+This section traces how understanding evolved through the research process,
+beginning with initial assumptions and documenting how they were challenged,
+refined, or confirmed as investigation proceeded. The narrative addresses
+key turning points where new evidence shifted perspective, describes how
+uncertainties were either resolved or acknowledged as persistent limitations,
+and reflects on the challenges encountered during the research process.
+Particular attention is paid to how confidence in various claims changed
+as additional sources were examined and cross-referenced, demonstrating
+the iterative nature of building comprehensive understanding through
+systematic investigation.
+
+---
+
+## Comprehensive Analysis
+
+### Primary Findings and Their Implications
+The core findings of the research are presented here as a flowing narrative
+that addresses the central research question. Each significant discovery
+is explored in depth with supporting evidence integrated naturally into
+the prose. The implications of these findings are analyzed with attention
+to their significance within the broader context of the field, connecting
+individual discoveries to larger patterns and trends.
+
+### Patterns and Trends Across Research Phases
+This subsection examines the meta-patterns that emerged only through the
+synthesis of multiple research phases. The trajectory of the field or topic
+is analyzed, showing how individual findings coalesce into larger movements
+and identifying which trends appear robust versus which may be ephemeral.
+
+### Contradictions and Competing Evidence
+Where sources conflict, those contradictions are presented fairly and
+analyzed thoroughly. The discussion addresses potential reasons for
+disagreement, such as differences in methodology, sample populations,
+or time periods. Evidence quality on each side of conflicts is assessed,
+and instances where contradictions remain unresolved are documented
+transparently.
+
+### Strength of Evidence for Major Conclusions
+For each major conclusion, the quantity and quality of supporting sources
+is evaluated. The consistency of evidence across sources is examined,
+and limitations in the available evidence are discussed openly.
+
+### Limitations and Gaps in Current Knowledge
+This subsection acknowledges what remains unknown despite thorough
+investigation.
Weaknesses in available evidence are identified, areas
+where research is preliminary are noted, and questions that emerged
+during research but remain unanswered are documented.
+
+### Integration of Findings Across Themes
+The connections between themes are explored here, demonstrating how
+separate lines of investigation reinforce and illuminate each other.
+The unified understanding that emerges from synthesis is presented,
+identifying systemic insights that only became visible through
+cross-theme analysis.
+
+---
+
+## Practical Implications
+
+### Immediate Practical Applications
+Concrete and actionable recommendations based on the research findings
+are presented here. Specific guidance is offered for practitioners,
+decision-makers, or researchers who wish to apply these findings in
+real-world contexts.
+
+### Long-Term Implications and Developments
+The discussion addresses how the findings may shape the field going
+forward, identifying emerging trends that may become significant and
+potential paradigm shifts that could result from this research.
+
+### Risk Factors and Mitigation Strategies
+Risks associated with the findings or their application are identified,
+and evidence-based mitigation approaches are proposed.
+
+### Implementation Considerations
+Practical factors for applying the findings are addressed, including
+resource requirements, timeline considerations, prerequisites, and
+potential barriers to implementation.
+
+### Future Research Directions
+Questions that remain unanswered after this investigation are
+documented, along with methodological improvements needed and
+promising avenues for further investigation.
+
+### Broader Impacts and Considerations
+The societal, ethical, or systemic implications of the findings
+are explored, along with connections to other fields or domains
+and unintended consequences that should be considered.
+
+---
+
+## References
+
+[Full APA-formatted reference list in alphabetical order by first author's
+surname. Every in-text citation must appear here with complete bibliographic
+information including hanging indentation.]
+
+---
+
+## Appendices (if needed)
+
+### Appendix A: Search Strategy
+Search queries used for each theme along with databases and sources
+consulted, with dates of search clearly documented.
+
+### Appendix B: Source Reliability Assessment
+Evaluation criteria used to assess sources with ratings for major
+references included in the research.
+
+### Appendix C: Excluded Sources
+Sources that were reviewed but ultimately not cited in the final
+report, with explanations for their exclusion.
+
+### Appendix D: Research Timeline
+Chronology of the investigation with key milestones in the research
+process documented.
+```
+
+### Writing Requirements
+
+**Format:**
+- All content presented as proper paragraphs
+- Flowing prose with natural transitions
+- No isolated facts — everything connected to larger argument
+- Data and statistics woven into narrative sentences
+
+**Content:**
+- Each major section contains substantial narrative (6-8+ paragraphs minimum)
+- Every key assertion supported by multiple sources
+- All aspects thoroughly explored with depth
+- Critical analysis, not just description
+
+**Style:**
+- Academic rigor with accessible language
+- Active engagement with sources through analysis
+- Clear narrative arc from question to conclusion
+- Balance between summary and critical evaluation
+
+**Citations:**
+- One to two citations per paragraph minimum
+- Integrated smoothly into prose
+- Multiple sources cited for important claims
+- Natural flow: "Research by Smith (2020) and Jones (2021) indicates..."
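The citation requirements above (at least one to two citations per paragraph) lend themselves to a rough automated check on a draft report. The sketch below is a heuristic under stated assumptions: the regular expression only recognizes simple author-year patterns such as `Smith (2020)` or `(Johnson et al., 2023)`, it can produce false positives (for example a capitalized month followed by a year), and the function names `citation_counts` and `under_cited` are hypothetical, not part of the skill.

```python
import re

# Heuristic author-year matcher: a capitalized surname, optional "et al.",
# then either ", 2023" (parenthetical style) or " (2023" (narrative style).
CITATION = re.compile(r"[A-Z][A-Za-z'\-]+(?: et al\.)?(?:,\s|\s\()(?:19|20)\d{2}")


def citation_counts(text: str) -> list[int]:
    """Count author-year citations in each blank-line-separated paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    return [len(CITATION.findall(p)) for p in paragraphs]


def under_cited(text: str, minimum: int = 1) -> list[int]:
    """Return zero-based indices of paragraphs with fewer than `minimum` citations."""
    return [i for i, n in enumerate(citation_counts(text)) if n < minimum]


draft = """Research by Smith (2020) and Jones (2021) indicates a consistent effect.

This paragraph makes a claim without any supporting source.

Lean mass loss is well documented (Johnson et al., 2023, p. 1831)."""

print(citation_counts(draft))
print(under_cited(draft))
```

A check like this can flag paragraphs for a second look before the final report is presented, but it cannot judge whether a citation actually supports the claim it is attached to.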
+
+---
+
+## Research Ethics
+
+- **Transparency:** Always disclose limitations and uncertainties
+- **Balance:** Present competing viewpoints fairly
+- **Recency:** Prioritize recent sources unless historical context needed
+- **Verification:** Flag unverified claims; don't present speculation as fact
+- **Scope:** Stay within requested boundaries; note when expansion needed
+- **Intellectual honesty:** Report contradictory findings even if they complicate conclusions
diff --git a/_meta.json b/_meta.json
new file mode 100644
index 0000000..9c91181
--- /dev/null
+++ b/_meta.json
@@ -0,0 +1,6 @@
+{
+  "ownerId": "kn7fmw4ybcy50qzp1d2dvb1h517znaes",
+  "slug": "academic-deep-research",
+  "version": "1.0.0",
+  "publishedAt": 1770020432089
+}
\ No newline at end of file
diff --git a/example.md b/example.md
new file mode 100644
index 0000000..ba9db5e
--- /dev/null
+++ b/example.md
@@ -0,0 +1,214 @@
+# Deep Research Example Workflow
+
+## User Request
+> "I need to understand the current state of AI coding assistants. What's the landscape, who are the players, and what should I know if I'm choosing one for my team?"
+
+---
+
+## Phase 1: Initial Engagement
+
+**Clarifying Questions:**
+1. What is the primary question or problem you're trying to solve — are you replacing an existing tool, or is this a first adoption?
+2. What depth of analysis do you need — a quick landscape overview, or exhaustive technical and security analysis?
+3. Are there specific constraints — budget limits, security requirements, or integration needs with your existing stack?
+
+**User Response:**
+> "First adoption, need exhaustive analysis for a 15-person engineering team using VS Code. Security is paramount because we handle healthcare data. Budget is flexible if the value is there."
+
+**Reflection:**
+> "So you need a comprehensive, security-focused analysis of AI coding assistants for healthcare software development.
The research must prioritize security certifications, data handling practices, and HIPAA compliance considerations alongside technical capabilities. I'll investigate market players, security frameworks, real-world healthcare adoption, and implementation considerations for regulated environments."
+
+---
+
+## Phase 2: Research Planning
+
+### Major Themes Identified
+
+| Theme | Key Questions | Approach |
+|-------|---------------|----------|
+| **1. Market Landscape** | Who are the major players? Market share? Differentiation? | Broad search + competitive analysis |
+| **2. Security & Compliance** | How is code handled? SOC 2? HIPAA? On-prem options? | Deep dive on security docs, compliance certifications |
+| **3. Technical Capabilities** | Context understanding, language support, IDE integration | Feature comparison, benchmark analysis |
+| **4. Healthcare-Specific Considerations** | Healthcare adoption, PHI handling, regulatory alignment | Specialized sources, case studies |
+| **5. Implementation & ROI** | Setup complexity, productivity data, pricing models | Real-world reports, cost analysis |
+
+### Research Execution Plan
+
+**Theme 1: Market Landscape**
+1. `web_search count=20` "AI coding assistants 2024 2025 market landscape comparison"
+2. Analysis — identify players, segments, trends
+3. `web_fetch` on top 5 authoritative sources
+4. Synthesis — market structure, competitive dynamics
+
+**Theme 2: Security & Compliance**
+1. `web_search count=20` "GitHub Copilot security SOC 2 HIPAA compliance enterprise"
+2. Analysis — security frameworks, data handling
+3. `web_fetch` on security whitepapers, compliance docs
+4. Synthesis — security landscape, gaps, recommendations
+
+**Theme 3: Technical Capabilities**
+1. `web_search count=20` "AI coding assistant benchmarks code completion accuracy 2024"
+2. Analysis — feature matrices, performance claims
+3. `web_fetch` on benchmark studies, technical docs
+4.
Synthesis: capability assessment, differentiators
+
+**Theme 4: Healthcare-Specific**
+1. `web_search count=20` "AI coding assistant healthcare HIPAA PHI code security"
+2. Analysis: healthcare adoption, regulatory concerns
+3. `web_fetch` on healthcare case studies, compliance guides
+4. Synthesis: healthcare readiness, risk factors
+
+**Theme 5: Implementation & ROI**
+1. `web_search count=20` "AI coding assistant developer productivity ROI study 2024"
+2. Analysis: productivity claims, cost models
+3. `web_fetch` on ROI studies, pricing analysis
+4. Synthesis: value proposition, implementation factors
+
+### Expected Deliverables
+- Comprehensive research paper (20-30 pages)
+- APA citations throughout (1-2 per paragraph)
+- Security-focused recommendations for healthcare
+- Side-by-side comparison matrix (converted to narrative)
+- Implementation roadmap
+
+---
+
+## Phase 3: Research Cycles (Auto-Execute)
+
+### Theme 1: Market Landscape, Cycle 1
+
+**Tool:** `web_search query="AI coding assistants 2024 2025 market landscape comparison GitHub Copilot Cursor" count=20`
+
+**Results:** Found sources on GitHub Copilot, Cursor, Codeium, Amazon CodeWhisperer, Google Code Assist, Tabnine, Replit Ghostwriter.
+
+**Analysis:**
+The search reveals a market dominated by GitHub Copilot, with approximately 1.3 million paid subscribers as of 2024, reflecting its first-mover advantage in this space. Cursor has emerged as a significant challenger, having raised $60 million in Series A funding and positioning itself around larger context windows and privacy-focused features. Codeium operates on a freemium model with strong uptake among individual developers, while enterprise-focused players like Amazon CodeWhisperer and Google Code Assist are targeting organizational adoption through cloud integration.
+
+Initial hypotheses forming: (1) Market is consolidating around a few major players, (2) Differentiation is shifting from basic completion to context understanding and privacy, (3) Enterprise features are becoming the primary battleground.
+
+**Gaps Identified:**
+- Limited independent market share data; most figures come from vendor announcements
+- International market distribution is unclear
+- Need deeper analysis on enterprise vs. individual adoption rates
+- Contradiction: Some sources claim Cursor is gaining rapidly; others show Copilot maintaining dominance
+
+---
+
+### Theme 1: Market Landscape, Cycle 2
+
+**Tool:** `web_fetch` on GitHub Copilot official documentation, Cursor about page, and TechCrunch funding coverage.
+
+**Analysis:**
+Fetching primary sources reveals important nuances that challenge initial assumptions. While GitHub's announced 1.3 million subscribers represents paid users, the actual developer reach is larger through free educational licenses and open source programs. This suggests that subscriber counts alone are an incomplete measure of market position, and that the market is more segmented than initially hypothesized.
+
+Cursor's $60 million raise is confirmed through TechCrunch reporting, but the company's actual user numbers remain private. This creates uncertainty about their true market position: they may be growing rapidly in specific segments (startups, privacy-conscious developers) without threatening Copilot's overall dominance.
+
+The contradiction noted in Cycle 1 appears resolvable: Cursor is gaining mindshare and venture attention, but Copilot maintains volume leadership. These are different metrics of "success" that can coexist.
+
+New patterns emerging: (1) Market segmentation by use case (individual vs. enterprise), (2) Geographic variation in adoption, (3) Pricing model experimentation (per-seat vs. usage-based).
+
+**Connections to Cycle 1:**
+The initial hypothesis about market consolidation requires refinement.
Rather than a winner-take-all dynamic, the evidence suggests parallel market development: Copilot for broad adoption, Cursor for specific segments, and specialized tools for niche use cases.
+
+**Remaining Uncertainties:**
+- Actual Cursor user numbers remain undisclosed
+- Enterprise adoption rates poorly documented
+- International market data largely absent
+
+---
+
+### Theme 2: Security & Compliance, Cycle 1
+
+**Tool:** `web_search query="GitHub Copilot security SOC 2 HIPAA compliance enterprise data handling" count=20`
+
+**Results:** Found GitHub security whitepaper, SOC 2 reports, enterprise trust documentation, and some healthcare-specific discussions.
+
+**Analysis:**
+Security documentation reveals significant variation in compliance posture across vendors. GitHub Copilot Business and Enterprise tiers explicitly address security concerns with SOC 2 Type II certification and options for code isolation that prevent training data inclusion. However, HIPAA compliance remains ambiguous: GitHub states it will sign Business Associate Agreements but stops short of claiming HIPAA compliance for the AI features themselves.
+
+Cursor positions itself as privacy-first with a local mode that processes code entirely on-device, eliminating transmission risks. This represents a fundamentally different security model that may be more appropriate for healthcare contexts.
+
+Initial hypothesis: Security features correlate with pricing tier, with enterprise offerings providing necessary controls for regulated industries.
+
+**Gaps Identified:**
+- No clear HIPAA compliance claims from any vendor for AI features specifically
+- Limited independent security audits published
+- Data retention policies across vendors are unclear
+- Contradiction: Some sources suggest on-premise options exist; others indicate cloud-only
+
+---
+
+### Theme 2: Security & Compliance, Cycle 2
+
+**Tool:** `web_fetch` on GitHub Copilot Enterprise security documentation, Cursor security page, and healthcare IT forum discussions.
+
+**Analysis:**
+Deep extraction from primary sources confirms the security landscape is complex and evolving. GitHub's documentation explicitly states that Copilot Enterprise offers "code isolation" that prevents user code from being used to train public models, addressing a major concern for proprietary codebases. However, the documentation carefully avoids HIPAA compliance claims, instead referring to GitHub's general BAA availability.
+
+This is a critical finding: The AI coding assistant products themselves may not be HIPAA-compliant even when the underlying platform is. This distinction is crucial for healthcare organizations.
+
+Cursor's local mode is confirmed to process code entirely on-device using locally hosted models, representing a genuinely different security posture. For healthcare, this eliminates the data transmission risk but may introduce other concerns (local model quality, update mechanisms).
+
+The contradiction about on-premise options is partially resolved: No major vendor offers true on-premise AI coding assistants as of this research, though some offer VPC deployment or local processing modes.
+
+**Connections to Theme 1:**
+The market segmentation noted earlier aligns with security postures: Copilot for general enterprise, Cursor for security-conscious segments. This suggests healthcare organizations may need to evaluate Cursor more seriously despite its smaller market presence.
+
+---
+
+### Theme 3: Technical Capabilities, Cycle 1
+
+*[Similar two-cycle process with analysis between each tool call...]*
+
+---
+
+### Theme 4: Healthcare-Specific Considerations, Cycle 1
+
+*[Similar two-cycle process...]*
+
+---
+
+### Theme 5: Implementation & ROI, Cycle 1
+
+*[Similar two-cycle process...]*
+
+---
+
+## Cross-Theme Knowledge Integration
+
+After completing all five themes with two cycles each, synthesis reveals:
+
+**Emerging Pattern 1: The Healthcare Gap**
+No AI coding assistant currently claims HIPAA compliance for AI features specifically. This creates a significant gap for healthcare organizations that wasn't immediately apparent in the initial landscape analysis.
+
+**Emerging Pattern 2: The Security-Performance Tradeoff**
+Local processing (Cursor) offers superior privacy but may sacrifice some capability compared to cloud-powered alternatives (Copilot). This tradeoff is central to the healthcare decision.
+
+**Emerging Pattern 3: Market Immaturity**
+The field is evolving rapidly with new features and compliance developments monthly. Any recommendation must account for this volatility.
+
+**Contradictions Resolved:**
+- Market dominance vs. challenger growth: Different metrics, both valid
+- Cloud vs. on-premise: No true on-premise exists; local processing is the alternative
+
+**Unified Understanding:**
+For a healthcare engineering team, the decision framework differs from general enterprise adoption. Security and compliance considerations outweigh raw capability, suggesting evaluation of Cursor's local mode as a primary option despite its smaller market presence.
+
+---
+
+## Phase 4: Final Report
+
+*[Presented as cohesive research paper with narrative sections, proper APA citations, no bullet points, 6-8+ paragraphs per major section...]*
+
+---
+
+## Key Distinctions from Standard Research
+
+| Aspect | Standard Research | Deep Research Protocol |
+|--------|-------------------|------------------------|
+| Cycles per theme | 1 | Minimum 2 |
+| Analysis between tools | Optional | Required |
+| Citation density | As needed | 1-2 per paragraph |
+| Final format | Flexible | Academic narrative |
+| Contradiction handling | Note if found | Must address all |
+| Writing style | Variable | Flowing prose only |
diff --git a/quickref.md b/quickref.md
new file mode 100644
index 0000000..3bd1cd6
--- /dev/null
+++ b/quickref.md
@@ -0,0 +1,80 @@
+# Deep Research Quick Reference
+
+## Invocation
+- `/research` or mention "deep research" / "exhaustive analysis"
+
+## Four Phases
+
+| Phase | User Action | Your Action | Key Output |
+|-------|-------------|-------------|------------|
+| 1. Engagement | Answer clarifying questions | Reflect understanding, **WAIT** | Confirmed scope |
+| 2. Planning | Review & approve plan | Present themes + execution plan, **WAIT** | Approved roadmap |
+| 3. Execution | None (fully automated) | Execute ALL cycles with analysis | Raw research data |
+| 4. Final Report | Review comprehensive report | Present academic narrative | Full paper |
+
+## Stop Points (Only Three)
+1. ✅ After clarifying questions (Phase 1)
+2. ✅ After research plan presentation (Phase 2)
+3. ✅ Final report delivery (Phase 4)
+
+## Tool Usage Sequence (Per Theme)
+1. **START:** `web_search` for landscape (count=20)
+2. **ANALYZE:** Synthesize findings, identify patterns and gaps
+3. **DIVE:** `web_fetch` for depth on key sources
+4. **PROCESS:** Synthesize new findings, challenge assumptions
+5.
**REPEAT:** Second cycle targeting identified gaps
+
+## Required Analysis After Every Tool Use
+- Connect new findings to previous results
+- Show evolution of understanding
+- Highlight pattern changes
+- Address contradictions
+- Build coherent narrative
+
+## Research Standards
+- Every conclusion cites **multiple sources**
+- All **contradictions addressed**
+- **Uncertainties acknowledged**
+- **Limitations discussed**
+- **Gaps identified**
+
+## Writing Style (Final Report)
+- **Flowing narrative:** paragraphs only, no lists
+- **Academic but accessible**
+- **Evidence integrated naturally** in prose
+- **Progressive logical development**
+- **Smooth transitions** between concepts
+
+## Prohibited in Final Report
+- Bullet points or numbered lists
+- Tables (convert to prose)
+- Isolated data without context
+- Section headers without narrative
+
+## Citation Standards (APA 7th)
+- **Density:** 1-2 citations per paragraph
+- **Format:** (Author, Year) in-text
+- **References:** Full APA with hanging indent
+- **All claims cited:** no exceptions
+
+## Confidence Annotations
+- **[HIGH]:** Multiple high-quality sources agree
+- **[MEDIUM]:** Limited or mixed evidence
+- **[LOW]:** Single source, needs verification
+- **[SPECULATIVE]:** Emerging area
+
+## Report Sections (Narrative Format)
+1. **Executive Summary:** 2-3 paragraphs
+2. **Knowledge Development:** evolution of understanding (6-8+ paragraphs)
+3. **Comprehensive Analysis:** findings, patterns, contradictions, evidence (6-8+ paragraphs each subsection)
+4. **Practical Implications:** applications, risks, future research (6-8+ paragraphs each subsection)
+5. **References:** APA format, alphabetical
+6. **Appendices:** optional
+
+## Critical Reminders
+- Stop only at three major points
+- Always analyze between tool usage
+- Show clear thinking progression
+- Connect findings explicitly
+- Build coherent narrative throughout
+- No shortcuts or rushed analysis
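
As a rough illustration of the Confidence Annotations rules in the quick reference, the Python sketch below maps the evidence behind a claim to one of the four labels. The `Source` record, its `high_quality` and `supports_claim` fields, the `confidence_label` function, and the order in which the checks run are all assumptions made for this example; the skill does not define any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one consulted source (not part of the skill)."""
    name: str
    high_quality: bool    # assumed flag, e.g. primary documentation or a peer-reviewed study
    supports_claim: bool  # True if the source agrees with the claim being annotated


def confidence_label(sources: list[Source], emerging_area: bool = False) -> str:
    """Return one quick-reference annotation for a claim.

    Assumed precedence: emerging area first, then source count, then agreement.
    """
    if emerging_area:
        return "[SPECULATIVE]"
    if len(sources) <= 1:
        # A single source (or none) always needs verification.
        return "[LOW]"
    strong = [s for s in sources if s.high_quality]
    if len(strong) >= 2 and all(s.supports_claim for s in sources):
        # Multiple high-quality sources, and nothing consulted disagrees.
        return "[HIGH]"
    # Anything else counts as limited or mixed evidence.
    return "[MEDIUM]"


whitepaper = Source("vendor security whitepaper", high_quality=True, supports_claim=True)
audit = Source("independent audit report", high_quality=True, supports_claim=True)
forum = Source("forum discussion", high_quality=False, supports_claim=False)

print(confidence_label([whitepaper, audit]))  # two strong, agreeing sources
print(confidence_label([whitepaper, forum]))  # mixed evidence
print(confidence_label([whitepaper]))         # single source
```

Treating "multiple" as two or more is a choice made here for simplicity; a stricter reading could raise that threshold or weight sources by the evidence hierarchy.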