Initial commit with translated description
This commit is contained in:
357
SKILL.md
Normal file
357
SKILL.md
Normal file
@@ -0,0 +1,357 @@
|
||||
---
|
||||
name: web-search-exa
|
||||
description: "神经网页搜索、内容提取、公司和人员研究、代码搜索和通过Exa MCP服务器的深度研究。"
|
||||
---
|
||||
|
||||
# Exa — Neural Web Search & Research
|
||||
|
||||
Exa is a neural search engine. Unlike keyword-based search, it understands meaning — you describe the page you're looking for and it finds it. Returns clean, LLM-ready content with no scraping needed.
|
||||
|
||||
**MCP server:** `https://mcp.exa.ai/mcp`
|
||||
**Free tier:** generous rate limits, no key needed for basic tools
|
||||
**API key:** [dashboard.exa.ai/api-keys](https://dashboard.exa.ai/api-keys) — unlocks higher limits + all tools
|
||||
**Docs:** [exa.ai/docs](https://exa.ai/docs)
|
||||
**GitHub:** [github.com/exa-labs/exa-mcp-server](https://github.com/exa-labs/exa-mcp-server)
|
||||
|
||||
## Setup
|
||||
|
||||
Add the MCP server to your agent config:
|
||||
|
||||
```bash
|
||||
# OpenClaw
|
||||
openclaw mcp add exa --url "https://mcp.exa.ai/mcp"
|
||||
```
|
||||
|
||||
Or in any MCP config JSON:
|
||||
```json
|
||||
{
|
||||
"mcpServers": {
|
||||
"exa": {
|
||||
"url": "https://mcp.exa.ai/mcp"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
To unlock all tools and remove rate limits, append your API key:
|
||||
```
|
||||
https://mcp.exa.ai/mcp?exaApiKey=YOUR_EXA_KEY
|
||||
```
|
||||
|
||||
To enable specific optional tools:
|
||||
```
|
||||
https://mcp.exa.ai/mcp?exaApiKey=YOUR_KEY&tools=web_search_exa,web_search_advanced_exa,people_search_exa,crawling_exa,company_research_exa,get_code_context_exa,deep_researcher_start,deep_researcher_check,deep_search_exa
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Tool Reference
|
||||
|
||||
### Default tools (available without API key)
|
||||
|
||||
| Tool | What it does |
|
||||
|------|-------------|
|
||||
| `web_search_exa` | General-purpose web search — clean content, fast |
|
||||
| `get_code_context_exa` | Code examples + docs from GitHub, Stack Overflow, official docs |
|
||||
| `company_research_exa` | Company overview, news, funding, competitors |
|
||||
|
||||
### Optional tools (enable via `tools` param, need API key for some)
|
||||
|
||||
| Tool | What it does |
|
||||
|------|-------------|
|
||||
| `web_search_advanced_exa` | Full-control search: domain filters, date ranges, categories, content modes |
|
||||
| `crawling_exa` | Extract full page content from a known URL — handles JS, PDFs, complex layouts |
|
||||
| `people_search_exa` | Find LinkedIn profiles, professional backgrounds, experts |
|
||||
| `deep_researcher_start` | Kick off an async multi-step research agent → detailed report |
|
||||
| `deep_researcher_check` | Poll status / retrieve results from deep research |
|
||||
| `deep_search_exa` | Single-call deep search with synthesized answer + citations (needs API key) |
|
||||
|
||||
---
|
||||
|
||||
## web_search_exa
|
||||
|
||||
Fast general search. Describe what you're looking for in natural language.
|
||||
|
||||
**Parameters:**
|
||||
- `query` (string, required) — describe the page you want to find
|
||||
- `numResults` (int) — number of results, default 10
|
||||
- `type` — `auto` (best quality), `fast` (lower latency), `deep` (multi-step reasoning)
|
||||
- `livecrawl` — `fallback` (default) or `preferred` (always fetch fresh)
|
||||
- `contextMaxCharacters` (int) — cap the returned content size
|
||||
|
||||
```
|
||||
web_search_exa {
|
||||
"query": "blog posts about using vector databases for recommendation systems",
|
||||
"numResults": 8
|
||||
}
|
||||
```
|
||||
|
||||
```
|
||||
web_search_exa {
|
||||
"query": "latest OpenAI announcements March 2026",
|
||||
"numResults": 5,
|
||||
"type": "fast"
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## web_search_advanced_exa
|
||||
|
||||
The power-user tool. Everything `web_search_exa` does, plus domain filters, date filters, category targeting, and content extraction modes.
|
||||
|
||||
**Extra parameters beyond basic search:**
|
||||
|
||||
| Parameter | Type | What it does |
|
||||
|-----------|------|-------------|
|
||||
| `includeDomains` | string[] | Only return results from these domains (max 1200) |
|
||||
| `excludeDomains` | string[] | Block results from these domains |
|
||||
| `category` | string | Target content type — see table below |
|
||||
| `startPublishedDate` | string | ISO date, results published after this |
|
||||
| `endPublishedDate` | string | ISO date, results published before this |
|
||||
| `maxAgeHours` | int | Content freshness — `0` = always livecrawl, `-1` = cache only, `24` = cache if <24h |
|
||||
| `contents.highlights` | object | Extractive snippets relevant to query. Set `maxCharacters` to control size |
|
||||
| `contents.text` | object | Full page as clean markdown. Set `maxCharacters` to cap |
|
||||
| `contents.summary` | object | LLM-generated summary. Supports `query` and JSON `schema` for structured extraction |
|
||||
|
||||
**Categories:**
|
||||
|
||||
| Category | Best for |
|
||||
|----------|---------|
|
||||
| `company` | Company pages, LinkedIn company profiles |
|
||||
| `people` | LinkedIn profiles, professional bios, personal sites |
|
||||
| `research paper` | arXiv, academic papers, peer-reviewed research |
|
||||
| `news` | Current events, journalism |
|
||||
| `tweet` | Posts from X/Twitter |
|
||||
| `personal site` | Blogs, personal pages |
|
||||
| `financial report` | SEC filings, earnings reports |
|
||||
|
||||
### Examples
|
||||
|
||||
**Research papers:**
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "transformer architecture improvements for long-context windows",
|
||||
"category": "research paper",
|
||||
"numResults": 15,
|
||||
"contents": { "highlights": { "maxCharacters": 3000 } }
|
||||
}
|
||||
```
|
||||
|
||||
**Company list building with structured extraction:**
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "Series A B2B SaaS companies in climate tech founded after 2022",
|
||||
"category": "company",
|
||||
"numResults": 25,
|
||||
"contents": {
|
||||
"summary": {
|
||||
"query": "company name, what they do, funding stage, location",
|
||||
"schema": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"name": { "type": "string" },
|
||||
"description": { "type": "string" },
|
||||
"funding": { "type": "string" },
|
||||
"location": { "type": "string" }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**People search — find candidates with specific profiles:**
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "machine learning engineers at fintech startups in NYC with experience in fraud detection",
|
||||
"category": "people",
|
||||
"numResults": 20,
|
||||
"contents": { "highlights": { "maxCharacters": 2000 } }
|
||||
}
|
||||
```
|
||||
|
||||
**Finding pages similar to a known URL:**
|
||||
Use the URL itself as the query — Exa will find semantically similar pages:
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "https://linkedin.com/in/some-candidate-profile",
|
||||
"numResults": 15,
|
||||
"contents": { "highlights": { "maxCharacters": 2000 } }
|
||||
}
|
||||
```
|
||||
|
||||
**Recent news with freshness control:**
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "AI regulation policy updates",
|
||||
"category": "news",
|
||||
"maxAgeHours": 72,
|
||||
"numResults": 10,
|
||||
"contents": { "highlights": { "maxCharacters": 4000 } }
|
||||
}
|
||||
```
|
||||
|
||||
**Scoped domain search:**
|
||||
```
|
||||
web_search_advanced_exa {
|
||||
"query": "authentication best practices",
|
||||
"includeDomains": ["owasp.org", "auth0.com", "docs.github.com"],
|
||||
"numResults": 10,
|
||||
"contents": { "text": { "maxCharacters": 5000 } }
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## company_research_exa
|
||||
|
||||
One-call company research. Returns business overview, recent news, funding, and competitive landscape.
|
||||
|
||||
```
|
||||
company_research_exa { "query": "Stripe payments company overview and recent news" }
|
||||
```
|
||||
|
||||
```
|
||||
company_research_exa { "query": "what does Anduril Industries do and who are their competitors" }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## people_search_exa
|
||||
|
||||
Find professionals by role, company, location, expertise. Returns LinkedIn profiles and bios.
|
||||
|
||||
```
|
||||
people_search_exa { "query": "VP of Engineering at healthcare startups in San Francisco" }
|
||||
```
|
||||
|
||||
```
|
||||
people_search_exa { "query": "AI researchers specializing in multimodal models" }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## get_code_context_exa
|
||||
|
||||
Search GitHub repos, Stack Overflow, and documentation for code examples and API usage patterns.
|
||||
|
||||
```
|
||||
get_code_context_exa { "query": "how to implement rate limiting in Express.js with Redis" }
|
||||
```
|
||||
|
||||
```
|
||||
get_code_context_exa { "query": "Python asyncio connection pooling example with aiohttp" }
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## crawling_exa
|
||||
|
||||
Extract clean content from a specific URL. Handles JavaScript-rendered pages, PDFs, and complex layouts. Returns markdown.
|
||||
|
||||
```
|
||||
crawling_exa { "url": "https://arxiv.org/abs/2301.07041" }
|
||||
```
|
||||
|
||||
Good for when you already have the URL and want to read the page.
|
||||
|
||||
---
|
||||
|
||||
## deep_researcher_start + deep_researcher_check
|
||||
|
||||
Long-running async research. Exa's research agent searches, reads, and compiles a detailed report.
|
||||
|
||||
**Start a research task:**
|
||||
```
|
||||
deep_researcher_start {
|
||||
"query": "competitive landscape of AI code generation tools in 2026 — key players, pricing, technical approaches, market share"
|
||||
}
|
||||
```
|
||||
|
||||
**Check status (use the researchId from the start response):**
|
||||
```
|
||||
deep_researcher_check { "researchId": "abc123..." }
|
||||
```
|
||||
|
||||
Poll `deep_researcher_check` until status is `completed`. The final response includes the full report.
|
||||
|
||||
---
|
||||
|
||||
## deep_search_exa
|
||||
|
||||
Single-call deep search: expands your query across multiple angles, searches, reads results, and returns a synthesized answer with grounded citations. Requires API key.
|
||||
|
||||
```
|
||||
deep_search_exa { "query": "what are the leading approaches to multimodal RAG in production systems" }
|
||||
```
|
||||
|
||||
Supports structured output via `outputSchema`:
|
||||
```
|
||||
deep_search_exa {
|
||||
"query": "top 10 aerospace companies by revenue",
|
||||
"type": "deep",
|
||||
"outputSchema": {
|
||||
"type": "object",
|
||||
"required": ["companies"],
|
||||
"properties": {
|
||||
"companies": {
|
||||
"type": "array",
|
||||
"items": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"name": { "type": "string" },
|
||||
"revenue": { "type": "string" },
|
||||
"hq": { "type": "string" }
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Query Craft
|
||||
|
||||
Exa is neural — it matches on meaning, not keywords. Write queries like you'd describe the ideal page to a colleague.
|
||||
|
||||
**Do:** "blog post about using embeddings for product recommendations at scale"
|
||||
**Don't:** "embeddings product recommendations"
|
||||
|
||||
**Do:** "Stripe payments company San Francisco fintech"
|
||||
**Don't:** "Stripe" (too ambiguous)
|
||||
|
||||
- Use `category` when you know the content type — it makes a big difference.
|
||||
- For broader coverage, run 2-3 query variations in parallel and deduplicate results.
|
||||
- For agentic workflows, use `highlights` instead of full `text` — it's 10x more token-efficient while keeping the relevant parts.
|
||||
|
||||
## Token Efficiency
|
||||
|
||||
| Content mode | When to use |
|
||||
|-------------|------------|
|
||||
| `highlights` | Agent workflows, factual lookups, multi-step pipelines — most token-efficient |
|
||||
| `text` | Deep analysis, when you need full page context |
|
||||
| `summary` | Quick overviews, structured extraction with JSON schema |
|
||||
|
||||
Set `maxCharacters` on any content mode to control output size.
|
||||
|
||||
## When to Reach for Which Tool
|
||||
|
||||
| I need to... | Use |
|
||||
|-------------|-----|
|
||||
| Quick web lookup | `web_search_exa` |
|
||||
| Research papers, academic search | `web_search_advanced_exa` + `category: "research paper"` |
|
||||
| Company intel, competitive analysis | `company_research_exa` or advanced + `category: "company"` |
|
||||
| Find people, candidates, experts | `people_search_exa` or advanced + `category: "people"` |
|
||||
| Code examples, API docs | `get_code_context_exa` |
|
||||
| Read a specific URL | `crawling_exa` |
|
||||
| Find pages similar to a URL | `web_search_advanced_exa` with URL as query |
|
||||
| Recent news / tweets | Advanced + `category: "news"` or `"tweet"` + `maxAgeHours` |
|
||||
| Detailed research report | `deep_researcher_start` → `deep_researcher_check` |
|
||||
| Quick answer with citations | `deep_search_exa` |
|
||||
|
||||
---
|
||||
|
||||
**Docs:** [exa.ai/docs](https://exa.ai/docs) — **Dashboard:** [dashboard.exa.ai](https://dashboard.exa.ai) — **Support:** support@exa.ai
|
||||
Reference in New Issue
Block a user