# Web Search Plus > Unified multi-provider web search with **Intelligent Auto-Routing** β€” uses multi-signal analysis to automatically select between **Serper**, **Tavily**, **Querit**, **Exa**, **Perplexity (Sonar Pro)**, **You.com**, and **SearXNG** with confidence scoring. [![ClawHub](https://img.shields.io/badge/ClawHub-web--search--plus-blue)](https://clawhub.ai) [![Version](https://img.shields.io/badge/version-2.9.0-green)](https://clawhub.ai) [![GitHub](https://img.shields.io/badge/GitHub-web--search--plus-blue)](https://github.com/robbyczgw-cla/web-search-plus) --- ## 🧠 Features (v2.9.0) **Intelligent Multi-Signal Routing** β€” The skill uses sophisticated query analysis: - **Intent Classification**: Shopping vs Research vs Discovery vs RAG/Real-time vs Privacy - **Linguistic Patterns**: "how much" (price) vs "how does" (research) vs "privately" (privacy) - **Entity Detection**: Product+brand combos, URLs, domains - **Complexity Analysis**: Long queries favor research providers - **Confidence Scoring**: Know how reliable the routing decision is ```bash python3 scripts/search.py -q "how much does iPhone 16 cost" # β†’ Serper (68% confidence) python3 scripts/search.py -q "how does quantum entanglement work" # β†’ Tavily (86% HIGH) python3 scripts/search.py -q "startups similar to Notion" # β†’ Exa (76% HIGH) python3 scripts/search.py -q "companies like stripe.com" # β†’ Exa (100% HIGH - URL detected) python3 scripts/search.py -q "summarize key points on AI" # β†’ You.com (68% MEDIUM - RAG intent) python3 scripts/search.py -q "search privately without tracking" # β†’ SearXNG (74% HIGH - privacy intent) ``` --- ## πŸ” When to Use Which Provider ### Built-in Brave Search (OpenClaw default) - βœ… General web searches - βœ… Privacy-focused - βœ… Quick lookups - βœ… Default fallback ### Serper (Google Results) - πŸ› **Product specs, prices, shopping** - πŸ“ **Local businesses, places** - 🎯 **"Google it" - explicit Google results** - πŸ“° **Shopping/images needed** - πŸ† **Knowledge Graph data** ### Tavily (AI-Optimized Research) - πŸ“š **Research questions, deep dives** - πŸ”¬ **Complex multi-part queries** - πŸ“„ **Need full page content** (not just snippets) - πŸŽ“ **Academic/technical research** - πŸ”’ **Domain filtering** (trusted sources) ### Querit (Multilingual AI Search) - 🌏 **Multilingual AI search** across 10+ languages - ⚑ **Fast real-time answers** with ~400ms latency - πŸ—ΊοΈ **International / cross-language queries** - πŸ“° **Recency-aware results** for current information - πŸ€– **Good fit for AI workflows** with clean metadata ### Exa (Neural Semantic Search) - πŸ”— **Find similar pages** - 🏒 **Company/startup discovery** - πŸ“ **Research papers** - πŸ’» **GitHub projects** - πŸ“… **Date-specific content** ### Perplexity (Sonar Pro via Kilo Gateway) - ⚑ **Direct answers** (great for β€œwho/what/define”) - 🧾 **Cited, answer-first output** - πŸ•’ **Current events / β€œas of” questions** - πŸ”‘ Auth via `KILOCODE_API_KEY` (routes to `https://api.kilo.ai`) ### You.com (RAG/Real-time) - πŸ€– **RAG applications** (LLM-ready snippets) - πŸ“° **Combined web + news** (single API call) - ⚑ **Real-time information** (current events) - πŸ“‹ **Summarization context** ("What's the latest...") - πŸ”„ **Live crawling** (full page content on demand) ### SearXNG (Privacy-First/Self-Hosted) - πŸ”’ **Privacy-preserving search** (no tracking) - 🌐 **Multi-source aggregation** (70+ engines) - πŸ’° **$0 API cost** (self-hosted) - 🎯 **Diverse perspectives** (results from multiple engines) - 🏠 **Self-hosted environments** (full control) --- ## Table of Contents - [Quick Start](#quick-start) - [Smart Auto-Routing](#smart-auto-routing) - [Configuration Guide](#configuration-guide) - [Provider Deep Dives](#provider-deep-dives) - [Usage Examples](#usage-examples) - [Workflow Examples](#workflow-examples) - [Optimization Tips](#optimization-tips) - [FAQ & Troubleshooting](#faq--troubleshooting) - [API Reference](#api-reference) --- ## Quick Start ### Option A: Interactive Setup (Recommended) ```bash # Run the setup wizard - it guides you through everything python3 scripts/setup.py ``` The wizard explains each provider, collects your API keys, and creates `config.json` automatically. ### Option B: Manual Setup ```bash # 1. Set up at least one API key (or SearXNG instance) export SERPER_API_KEY="your-key" # https://serper.dev export TAVILY_API_KEY="your-key" # https://tavily.com export QUERIT_API_KEY="your-key" # https://querit.ai export EXA_API_KEY="your-key" # https://exa.ai export KILOCODE_API_KEY="your-key" # enables Perplexity Sonar Pro via https://api.kilo.ai export YOU_API_KEY="your-key" # https://api.you.com export SEARXNG_INSTANCE_URL="https://your-instance.example.com" # Self-hosted # 2. Run a search (auto-routed!) python3 scripts/search.py -q "best laptop 2024" ``` ### Run a Search ```bash # Auto-routed to best provider python3 scripts/search.py -q "best laptop 2024" # Or specify a provider explicitly python3 scripts/search.py -p serper -q "iPhone 16 specs" python3 scripts/search.py -p tavily -q "quantum computing explained" --depth advanced python3 scripts/search.py -p querit -q "latest AI policy updates in Germany" python3 scripts/search.py -p exa -q "AI startups 2024" --category company python3 scripts/search.py -p perplexity -q "Who is the president of Austria?" ``` --- ## Smart Auto-Routing ### How It Works When you don't specify a provider, the skill analyzes your query and routes it to the best provider: | Query Contains | Routes To | Example | |---------------|-----------|---------| | "price", "buy", "shop", "cost" | **Serper** | "iPhone 16 price" | | "near me", "restaurant", "hotel" | **Serper** | "pizza near me" | | "weather", "news", "latest" | **Serper** | "weather Berlin" | | "how does", "explain", "what is" | **Tavily** | "how does TCP work" | | "research", "study", "analyze" | **Tavily** | "climate research" | | "tutorial", "guide", "learn" | **Tavily** | "python tutorial" | | multilingual, current status, latest updates | **Querit** | "latest AI policy updates in Germany" | | "similar to", "companies like" | **Exa** | "companies like Stripe" | | "startup", "Series A" | **Exa** | "AI startups Series A" | | "github", "research paper" | **Exa** | "LLM papers arxiv" | | "private", "anonymous", "no tracking" | **SearXNG** | "search privately" | | "multiple sources", "aggregate" | **SearXNG** | "results from all engines" | ### Examples ```bash # These are all auto-routed to the optimal provider: python3 scripts/search.py -q "MacBook Pro M3 price" # β†’ Serper python3 scripts/search.py -q "how does HTTPS work" # β†’ Tavily python3 scripts/search.py -q "latest AI policy updates in Germany" # β†’ Querit python3 scripts/search.py -q "startups like Notion" # β†’ Exa python3 scripts/search.py -q "best sushi restaurant near me" # β†’ Serper python3 scripts/search.py -q "explain attention mechanism" # β†’ Tavily python3 scripts/search.py -q "alternatives to Figma" # β†’ Exa python3 scripts/search.py -q "search privately without tracking" # β†’ SearXNG ``` ### Result Caching (introduced in v2.7.x) Search results are **automatically cached** for 1 hour to save API costs: ```bash # First request: fetches from API ($) python3 scripts/search.py -q "AI startups 2024" # Second request: uses cache (FREE!) python3 scripts/search.py -q "AI startups 2024" # Output includes: "cached": true # Bypass cache (force fresh results) python3 scripts/search.py -q "AI startups 2024" --no-cache # View cache stats python3 scripts/search.py --cache-stats # Clear all cached results python3 scripts/search.py --clear-cache # Custom TTL (in seconds, default: 3600 = 1 hour) python3 scripts/search.py -q "query" --cache-ttl 7200 ``` **Cache location:** `.cache/` in skill directory (override with `WSP_CACHE_DIR` environment variable) ### Debug Auto-Routing See exactly why a provider was selected: ```bash python3 scripts/search.py --explain-routing -q "best laptop to buy" ``` Output: ```json { "query": "best laptop to buy", "selected_provider": "serper", "reason": "matched_keywords (score=2)", "matched_keywords": ["buy", "best"], "available_providers": ["serper", "tavily", "exa"] } ``` ### Routing Info in Results Every search result includes routing information: ```json { "provider": "serper", "query": "iPhone 16 price", "results": [...], "routing": { "auto_routed": true, "selected_provider": "serper", "reason": "matched_keywords (score=1)", "matched_keywords": ["price"] } } ``` --- ## Configuration Guide ### Environment Variables Create a `.env` file or set these in your shell: ```bash # Required: Set at least one export SERPER_API_KEY="your-serper-key" export TAVILY_API_KEY="your-tavily-key" export EXA_API_KEY="your-exa-key" ``` ### Config File (config.json) The `config.json` file lets you customize auto-routing and provider defaults: ```json { "defaults": { "provider": "serper", "max_results": 5 }, "auto_routing": { "enabled": true, "fallback_provider": "serper", "provider_priority": ["serper", "tavily", "exa"], "disabled_providers": [], "keyword_mappings": { "serper": ["price", "buy", "shop", "cost", "deal", "near me", "weather"], "tavily": ["how does", "explain", "research", "what is", "tutorial"], "exa": ["similar to", "companies like", "alternatives", "startup", "github"] } }, "serper": { "country": "us", "language": "en" }, "tavily": { "depth": "basic", "topic": "general" }, "exa": { "type": "neural" } } ``` ### Configuration Examples #### Example 1: Disable Exa (Only Use Serper + Tavily) ```json { "auto_routing": { "disabled_providers": ["exa"] } } ``` #### Example 2: Make Tavily the Default ```json { "auto_routing": { "fallback_provider": "tavily" } } ``` #### Example 3: Add Custom Keywords ```json { "auto_routing": { "keyword_mappings": { "serper": [ "price", "buy", "shop", "amazon", "ebay", "walmart", "deal", "discount", "coupon", "sale", "cheap" ], "tavily": [ "how does", "explain", "research", "what is", "coursera", "udemy", "learn", "course", "certification" ], "exa": [ "similar to", "companies like", "competitors", "YC company", "funded startup", "Series A", "Series B" ] } } } ``` #### Example 4: German Locale for Serper ```json { "serper": { "country": "de", "language": "de" } } ``` #### Example 5: Disable Auto-Routing ```json { "auto_routing": { "enabled": false }, "defaults": { "provider": "serper" } } ``` #### Example 6: Research-Heavy Config ```json { "auto_routing": { "fallback_provider": "tavily", "provider_priority": ["tavily", "serper", "exa"] }, "tavily": { "depth": "advanced", "include_raw_content": true } } ``` --- ## Provider Deep Dives ### Serper (Google Search API) **What it is:** Direct access to Google Search results via API β€” the same results you'd see on google.com. #### Strengths | Strength | Description | |----------|-------------| | 🎯 **Accuracy** | Google's search quality, knowledge graph, featured snippets | | πŸ›’ **Shopping** | Product prices, reviews, shopping results | | πŸ“ **Local** | Business listings, maps, places | | πŸ“° **News** | Real-time news with Google News integration | | πŸ–Ό **Images** | Google Images search | | ⚑ **Speed** | Fastest response times (~200-400ms) | #### Best Use Cases - βœ… Product specifications and comparisons - βœ… Shopping and price lookups - βœ… Local business searches ("restaurants near me") - βœ… Quick factual queries (weather, conversions, definitions) - βœ… News headlines and current events - βœ… Image searches - βœ… When you need "what Google shows" #### Getting Your API Key 1. Go to [serper.dev](https://serper.dev) 2. Sign up with email or Google 3. Copy your API key from the dashboard 4. Set `SERPER_API_KEY` environment variable --- ### Tavily (Research Search) **What it is:** AI-optimized search engine built for research and RAG applications β€” returns synthesized answers plus full content. #### Strengths | Strength | Description | |----------|-------------| | πŸ“š **Research Quality** | Optimized for comprehensive, accurate research | | πŸ’¬ **AI Answers** | Returns synthesized answers, not just links | | πŸ“„ **Full Content** | Can return complete page content (raw_content) | | 🎯 **Domain Filtering** | Include/exclude specific domains | | πŸ”¬ **Deep Mode** | Advanced search for thorough research | | πŸ“° **Topic Modes** | Specialized for general vs news content | #### Best Use Cases - βœ… Research questions requiring synthesized answers - βœ… Academic or technical deep dives - βœ… When you need actual page content (not just snippets) - βœ… Multi-source information comparison - βœ… Domain-specific research (filter to authoritative sources) - βœ… News research with context - βœ… RAG/LLM applications #### Getting Your API Key 1. Go to [tavily.com](https://tavily.com) 2. Sign up and verify email 3. Navigate to API Keys section 4. Generate and copy your key 5. Set `TAVILY_API_KEY` environment variable --- ### Exa (Neural Search) **What it is:** Neural/semantic search engine that understands meaning, not just keywords β€” finds conceptually similar content. #### Strengths | Strength | Description | |----------|-------------| | 🧠 **Semantic Understanding** | Finds results by meaning, not keywords | | πŸ”— **Similar Pages** | Find pages similar to a reference URL | | 🏒 **Company Discovery** | Excellent for finding startups, companies | | πŸ“‘ **Category Filters** | Filter by type (company, paper, tweet, etc.) | | πŸ“… **Date Filtering** | Precise date range searches | | πŸŽ“ **Academic** | Great for research papers and technical content | #### Best Use Cases - βœ… Conceptual queries ("companies building X") - βœ… Finding similar companies or pages - βœ… Startup and company discovery - βœ… Research paper discovery - βœ… Finding GitHub projects - βœ… Date-filtered searches for recent content - βœ… When keyword matching fails #### Getting Your API Key 1. Go to [exa.ai](https://exa.ai) 2. Sign up with email or Google 3. Navigate to API section in dashboard 4. Copy your API key 5. Set `EXA_API_KEY` environment variable --- ### SearXNG (Privacy-First Meta-Search) **What it is:** Open-source, self-hosted meta-search engine that aggregates results from 70+ search engines without tracking. #### Strengths | Strength | Description | |----------|-------------| | πŸ”’ **Privacy-First** | No tracking, no profiling, no data collection | | 🌐 **Multi-Engine** | Aggregates Google, Bing, DuckDuckGo, and 70+ more | | πŸ’° **Free** | $0 API cost (self-hosted, unlimited queries) | | 🎯 **Diverse Results** | Get perspectives from multiple search engines | | βš™ **Customizable** | Choose which engines to use, SafeSearch, language | | 🏠 **Self-Hosted** | Full control over your search infrastructure | #### Best Use Cases - βœ… Privacy-sensitive searches (no tracking) - βœ… When you want diverse results from multiple engines - βœ… Budget-conscious (no API fees) - βœ… Self-hosted/air-gapped environments - βœ… Fallback when paid APIs are rate-limited - βœ… When "aggregate everything" is the goal #### Setting Up Your Instance ```bash # Docker (recommended, 5 minutes) docker run -d -p 8080:8080 searxng/searxng # Enable JSON API in settings.yml: # search: # formats: [html, json] ``` 1. See [docs.searxng.org](https://docs.searxng.org/admin/installation.html) 2. Deploy via Docker, pip, or your preferred method 3. Enable JSON format in `settings.yml` 4. Set `SEARXNG_INSTANCE_URL` environment variable --- ## Usage Examples ### Auto-Routed Searches (Recommended) ```bash # Just search β€” the skill picks the best provider python3 scripts/search.py -q "Tesla Model 3 price" python3 scripts/search.py -q "how do neural networks learn" python3 scripts/search.py -q "YC startups like Stripe" python3 scripts/search.py -q "search privately without tracking" ``` ### Serper Options ```bash # Different search types python3 scripts/search.py -p serper -q "gaming monitor" --type shopping python3 scripts/search.py -p serper -q "coffee shop" --type places python3 scripts/search.py -p serper -q "AI news" --type news # With time filter python3 scripts/search.py -p serper -q "OpenAI news" --time-range day # Include images python3 scripts/search.py -p serper -q "iPhone 16 Pro" --images # Different locale python3 scripts/search.py -p serper -q "Wetter Wien" --country at --language de ``` ### Tavily Options ```bash # Deep research mode python3 scripts/search.py -p tavily -q "quantum computing applications" --depth advanced # With full page content python3 scripts/search.py -p tavily -q "transformer architecture" --raw-content # Domain filtering python3 scripts/search.py -p tavily -q "AI research" --include-domains arxiv.org nature.com ``` ### Exa Options ```bash # Category filtering python3 scripts/search.py -p exa -q "AI startups Series A" --category company python3 scripts/search.py -p exa -q "attention mechanism" --category "research paper" # Date filtering python3 scripts/search.py -p exa -q "YC companies" --start-date 2024-01-01 # Find similar pages python3 scripts/search.py -p exa --similar-url "https://stripe.com" --category company ``` ### SearXNG Options ```bash # Basic search python3 scripts/search.py -p searxng -q "linux distros" # Specific engines only python3 scripts/search.py -p searxng -q "AI news" --engines "google,bing,duckduckgo" # SafeSearch (0=off, 1=moderate, 2=strict) python3 scripts/search.py -p searxng -q "privacy tools" --searxng-safesearch 2 # With time filter python3 scripts/search.py -p searxng -q "open source projects" --time-range week # Custom instance URL python3 scripts/search.py -p searxng -q "test" --searxng-url "http://localhost:8080" ``` --- ## Workflow Examples ### πŸ›’ Product Research Workflow ```bash # Step 1: Get product specs (auto-routed to Serper) python3 scripts/search.py -q "MacBook Pro M3 Max specs" # Step 2: Check prices (auto-routed to Serper) python3 scripts/search.py -q "MacBook Pro M3 Max price comparison" # Step 3: In-depth reviews (auto-routed to Tavily) python3 scripts/search.py -q "detailed MacBook Pro M3 Max review" ``` ### πŸ“š Academic Research Workflow ```bash # Step 1: Understand the topic (auto-routed to Tavily) python3 scripts/search.py -q "explain transformer architecture in deep learning" # Step 2: Find recent papers (Exa) python3 scripts/search.py -p exa -q "transformer improvements" --category "research paper" --start-date 2024-01-01 # Step 3: Find implementations (Exa) python3 scripts/search.py -p exa -q "transformer implementation" --category github ``` ### 🏒 Competitive Analysis Workflow ```bash # Step 1: Find competitors (auto-routed to Exa) python3 scripts/search.py -q "companies like Notion" # Step 2: Find similar products (Exa) python3 scripts/search.py -p exa --similar-url "https://notion.so" --category company # Step 3: Deep dive comparison (Tavily) python3 scripts/search.py -p tavily -q "Notion vs Coda comparison" --depth advanced ``` --- ## Optimization Tips ### Cost Optimization | Tip | Savings | |-----|---------| | Use SearXNG for routine queries | **$0 API cost** | | Use auto-routing (defaults to Serper, cheapest paid) | Best value | | Use Tavily `basic` before `advanced` | ~50% cost reduction | | Set appropriate `max_results` | Linear cost savings | | Use Exa only for semantic queries | Avoid waste | ### Performance Optimization | Tip | Impact | |-----|--------| | Serper is fastest (~200ms) | Use for time-sensitive queries | | Tavily `basic` faster than `advanced` | ~2x faster | | Lower `max_results` = faster response | Linear improvement | --- ## FAQ & Troubleshooting ### General Questions **Q: Do I need API keys for all three providers?** > No. You only need keys for providers you want to use. Auto-routing skips providers without keys. **Q: Which provider should I start with?** > Serper β€” it's the fastest, cheapest, and has the largest free tier (2,500 queries). **Q: Can I use multiple providers in one workflow?** > Yes! That's the recommended approach. See [Workflow Examples](#workflow-examples). **Q: How do I reduce API costs?** > Use auto-routing (defaults to cheapest), start with lower `max_results`, use Tavily `basic` before `advanced`. ### Auto-Routing Questions **Q: Why did my query go to the wrong provider?** > Use `--explain-routing` to debug. Add custom keywords to config.json if needed. **Q: Can I add my own keywords?** > Yes! Edit `config.json` β†’ `auto_routing.keyword_mappings`. **Q: How does keyword scoring work?** > Multi-word phrases get higher weights. "companies like" (2 words) scores higher than "like" (1 word). **Q: What if no keywords match?** > Uses the fallback provider (default: Serper). **Q: Can I force a specific provider?** > Yes, use `-p serper`, `-p tavily`, or `-p exa`. ### Troubleshooting **Error: "Missing API key"** ```bash # Check if key is set echo $SERPER_API_KEY # Set it export SERPER_API_KEY="your-key" ``` **Error: "API Error (401)"** > Your API key is invalid or expired. Generate a new one. **Error: "API Error (429)"** > Rate limited. Wait and retry, or upgrade your plan. **Empty results?** > Try a different provider, broaden your query, or remove restrictive filters. **Slow responses?** > Reduce `max_results`, use Tavily `basic`, or use Serper (fastest). --- ## API Reference ### Output Format All providers return unified JSON: ```json { "provider": "serper|tavily|exa", "query": "original search query", "results": [ { "title": "Page Title", "url": "https://example.com/page", "snippet": "Content excerpt...", "score": 0.95, "date": "2024-01-15", "raw_content": "Full page content (Tavily only)" } ], "images": ["url1", "url2"], "answer": "Synthesized answer", "knowledge_graph": { }, "routing": { "auto_routed": true, "selected_provider": "serper", "reason": "matched_keywords (score=1)", "matched_keywords": ["price"] } } ``` ### CLI Options Reference | Option | Providers | Description | |--------|-----------|-------------| | `-q, --query` | All | Search query | | `-p, --provider` | All | Provider: auto, serper, tavily, querit, exa, perplexity, you, searxng | | `-n, --max-results` | All | Max results (default: 5) | | `--auto` | All | Force auto-routing | | `--explain-routing` | All | Debug auto-routing | | `--images` | Serper, Tavily | Include images | | `--country` | Serper, You | Country code (default: us) | | `--language` | Serper, SearXNG | Language code (default: en) | | `--type` | Serper | search/news/images/videos/places/shopping | | `--time-range` | Serper, SearXNG | hour/day/week/month/year | | `--depth` | Tavily | basic/advanced | | `--topic` | Tavily | general/news | | `--raw-content` | Tavily | Include full page content | | `--querit-base-url` | Querit | Override Querit API base URL | | `--querit-base-path` | Querit | Override Querit API path | | `--exa-type` | Exa | neural/keyword | | `--category` | Exa | company/research paper/news/pdf/github/tweet | | `--start-date` | Exa | Start date (YYYY-MM-DD) | | `--end-date` | Exa | End date (YYYY-MM-DD) | | `--similar-url` | Exa | Find similar pages | | `--searxng-url` | SearXNG | Instance URL | | `--searxng-safesearch` | SearXNG | 0=off, 1=moderate, 2=strict | | `--engines` | SearXNG | Specific engines (google,bing,duckduckgo) | | `--categories` | SearXNG | Search categories (general,images,news) | | `--include-domains` | Tavily, Exa | Only these domains | | `--exclude-domains` | Tavily, Exa | Exclude these domains | | `--compact` | All | Compact JSON output | --- ## License MIT --- ## Links - [Serper](https://serper.dev) β€” Google Search API - [Tavily](https://tavily.com) β€” AI Research Search - [Exa](https://exa.ai) β€” Neural Search - [ClawHub](https://clawhub.ai) β€” OpenClaw Skills