skills/robbyczgw-cla_web-search-plus

Fork 0

Files

zlei9 60e6461707 Initial commit with translated description

2026-03-29 13:18:55 +08:00

9.1 KiB

Raw Blame History

Frequently Asked Questions

Caching (NEW in v2.7.0!)

How does caching work?

Search results are automatically cached locally for 1 hour (3600 seconds). When you make the same query again, you get instant results at $0 API cost. The cache key is based on: query text + provider + max_results.

Where are cached results stored?

In .cache/ directory inside the skill folder by default. Override with WSP_CACHE_DIR environment variable:

export WSP_CACHE_DIR="/path/to/custom/cache"

How do I see cache stats?

python3 scripts/search.py --cache-stats

This shows total entries, size, oldest/newest entries, and breakdown by provider.

How do I clear the cache?

python3 scripts/search.py --clear-cache

Can I change the cache TTL?

Yes! Default is 3600 seconds (1 hour). Set a custom TTL per request:

python3 scripts/search.py -q "query" --cache-ttl 7200  # 2 hours

How do I skip the cache?

Use --no-cache to always fetch fresh results:

python3 scripts/search.py -q "query" --no-cache

How do I know if a result was cached?

The response includes:

"cached": true/false — whether result came from cache
"cache_age_seconds": 1234 — how old the cached result is (when cached)

General

How does auto-routing decide which provider to use?

Multi-signal analysis scores each provider based on: price patterns, explanation phrases, similarity keywords, URLs, product+brand combos, and query complexity. Highest score wins. Use --explain-routing to see the decision breakdown.

What if it picks the wrong provider?

Override with -p serper/tavily/exa. Check --explain-routing to understand why it chose differently.

What does "low confidence" mean?

Query is ambiguous (e.g., "Tesla" could be cars, stock, or company). Falls back to Serper. Results may vary.

Can I disable a provider?

Yes! In config.json: "disabled_providers": ["exa"]

API Keys

Which API keys do I need?

At minimum ONE key (or SearXNG instance). You can use just Serper, just Tavily, just Exa, just You.com, or just SearXNG. Missing keys = that provider is skipped.

Where do I get API keys?

Serper: https://serper.dev (2,500 free queries, no credit card)
Tavily: https://tavily.com (1,000 free searches/month)
Exa: https://exa.ai (1,000 free searches/month)
You.com: https://api.you.com (Limited free tier for testing)
SearXNG: Self-hosted, no key needed! https://docs.searxng.org/admin/installation.html

How do I set API keys?

Two options (both auto-load):

Option A: .env file

export SERPER_API_KEY="your-key"

Option B: config.json (v2.2.1+)

{ "serper": { "api_key": "your-key" } }

Routing Details

How do I know which provider handled my search?

Check routing.provider in JSON output, or [🔍 Searched with: Provider] in chat responses.

Why does it sometimes choose Serper for research questions?

If the query has brand/product signals (e.g., "how does Tesla FSD work"), shopping intent may outweigh research intent. Override with -p tavily.

What's the confidence threshold?

Default: 0.3 (30%). Below this = low confidence, uses fallback. Adjustable in config.json.

You.com Specific

When should I use You.com over other providers?

You.com excels at:

RAG applications: Pre-extracted snippets ready for LLM consumption
Real-time information: Current events, breaking news, status updates
Combined sources: Web + news results in a single API call
Summarization tasks: "What's the latest on...", "Key points about..."

What's the livecrawl feature?

You.com can fetch full page content on-demand. Use --livecrawl web for web results, --livecrawl news for news articles, or --livecrawl all for both. Content is returned in Markdown format.

Does You.com include news automatically?

Yes! You.com's intelligent classification automatically includes relevant news results when your query has news intent. You can also use --include-news to explicitly enable it.

SearXNG Specific

Do I need my own SearXNG instance?

Yes! SearXNG is self-hosted. Most public instances disable the JSON API to prevent bot abuse. You need to run your own instance with JSON format enabled. See: https://docs.searxng.org/admin/installation.html

How do I set up SearXNG?

Docker is the easiest way:

docker run -d -p 8080:8080 searxng/searxng

Then enable JSON in settings.yml:

search:
  formats:
    - html
    - json

Why am I getting "403 Forbidden"?

The JSON API is disabled on your instance. Enable it in settings.yml under search.formats.

What's the API cost for SearXNG?

$0! SearXNG is free and open-source. You only pay for hosting (~$5/month VPS). Unlimited queries.

When should I use SearXNG?

Privacy-sensitive queries: No tracking, no profiling
Budget-conscious: $0 API cost
Diverse results: Aggregates 70+ search engines
Self-hosted requirements: Full control over your search infrastructure
Fallback provider: When paid APIs are rate-limited

Can I limit which search engines SearXNG uses?

Yes! Use --engines google,bing,duckduckgo to specify engines, or configure defaults in config.json.

Provider Selection

Which provider should I use?

Query Type	Best Provider	Why
Shopping ("buy laptop", "cheap shoes")	Serper	Google Shopping, price comparisons, local stores
Research ("how does X work?", "explain Y")	Tavily	Deep research, academic quality, full-page content
Startups/Papers ("companies like X", "arxiv papers")	Exa	Semantic/neural search, startup discovery
RAG/Real-time ("summarize latest", "current events")	You.com	LLM-ready snippets, combined web+news
Privacy ("search without tracking")	SearXNG	No tracking, multi-source, self-hosted

Tip: Enable auto-routing and let the skill choose automatically! 🎯

Do I need all 5 providers?

No! All providers are optional. You can use:

1 provider (e.g., just Serper for everything)
2-3 providers (e.g., Serper + You.com for most needs)
All 5 (maximum flexibility + fallback options)

How much do the APIs cost?

Provider	Free Tier	Paid Plan
Serper	2,500 queries/mo	$50/mo (5,000 queries)
Tavily	1,000 queries/mo	$150/mo (10,000 queries)
Exa	1,000 queries/mo	$1,000/mo (100,000 queries)
You.com	Limited free	~$10/mo (varies by usage)
SearXNG	FREE ✅	Only VPS cost (~$5/mo if self-hosting)

Budget tip: Use SearXNG as primary + others as fallback for specialized queries!

How private is SearXNG really?

Setup	Privacy Level
Self-hosted (your VPS)	⭐⭐⭐⭐⭐ You control everything
Self-hosted (Docker local)	⭐⭐⭐⭐⭐ Fully private
Public instance	⭐⭐⭐ Depends on operator's logging policy

Best practice: Self-host if privacy is critical.

Which provider has the best results?

Metric	Winner
Most accurate for facts	Serper (Google)
Best for research depth	Tavily
Best for semantic queries	Exa
Best for RAG/AI context	You.com
Most diverse sources	SearXNG (70+ engines)
Most private	SearXNG (self-hosted)

Recommendation: Enable multiple providers + auto-routing for best overall experience.

How does auto-routing work?

The skill analyzes your query for keywords and patterns:

"buy cheap laptop"     → Serper (shopping signals)
"how does AI work?"    → Tavily (research/explanation)
"companies like X"     → Exa (semantic/similar)
"summarize latest news" → You.com (RAG/real-time)
"search privately"     → SearXNG (privacy signals)

Confidence threshold: Only routes if confidence > 30%. Otherwise uses default provider.

Override: Use -p provider to force a specific provider.

Production Use

Can I use this in production?

Yes! Web-search-plus is production-ready:

✅ Error handling with automatic fallback
✅ Rate limit protection
✅ Timeout handling (30s per provider)
✅ API key security (.env + config.json gitignored)
✅ 5 providers for redundancy

Tip: Monitor API usage to avoid exceeding free tiers!

What if I run out of API credits?

Fallback chain: Other enabled providers automatically take over
Use SearXNG: Switch to self-hosted (unlimited queries)
Upgrade plan: Paid tiers have higher limits
Rate limit: Use disabled_providers to skip exhausted APIs temporarily

Updates

How do I update to the latest version?

Via ClawHub (recommended):

clawhub update web-search-plus --registry "https://www.clawhub.ai" --no-input

Manually:

cd /path/to/workspace/skills/web-search-plus/
git pull origin main
python3 scripts/setup.py  # Re-run to configure new features

Where can I report bugs or request features?

GitHub Issues: https://github.com/robbyczgw-cla/web-search-plus/issues
ClawHub: https://www.clawhub.ai/skills/web-search-plus

9.1 KiB Raw Blame History