# Changelog - Web Search Plus ## [2.9.2] - 2026-03-27 ### Fixed - Replaced hardcoded temporary cache path examples with portable `$TMP_DIR` placeholders in `TROUBLESHOOTING.md` ## [2.9.0] - 2026-03-12 ### โœจ New Provider: Querit (Multilingual AI Search) [Querit.ai](https://querit.ai) is a Singapore-based multilingual AI search API purpose-built for LLMs and RAG pipelines. 300 billion page index, 20+ countries, 10+ languages. - Added **Querit** as the 7th search provider via `https://api.querit.ai/v1/search` - Configure via `QUERIT_API_KEY` โ€” optional, gracefully skipped if not set - Routing score: `research * 0.65 + rag * 0.35 + recency * 0.45` โ€” favored for multilingual and real-time queries - Handles Querit's quirky `error_code=200` responses as success (not an error) - Handles `IncompleteRead` as transient/retryable failure - Live-tested with 10 benchmark queries โœ… ### ๐Ÿ”ง Fixed: Fallback chain dies on unconfigured provider - `sys.exit(1)` in `validate_api_key()` raised `SystemExit` (inherits from `BaseException`), which bypassed the `except Exception` fallback loop and killed the entire process instead of trying the next provider - Replaced with catchable `ProviderConfigError` โ€” fallback chain now continues correctly through all configured providers ### ๐Ÿ”ง Fixed: Perplexity citations are generic placeholders - Previously extracted citation URLs via regex from the answer text, resulting in generic "Source 1" / "Source 2" labels - Now uses the structured `data["citations"]` array from the Perplexity API response directly โ€” results have readable titles - Regex extraction kept as fallback when API doesn't return a `citations` field ### โœจ Improved: German locale routing patterns - Added German-language signal patterns for local and news queries - Improves auto-routing for queries like `"aktuelle Nachrichten"`, `"beste Restaurants Graz"`, `"KI Regulierung Europa"` ### ๐Ÿ“ Documentation - Added Querit to README provider tables, routing examples, and API key setup section - Added `querit_api_key` to `config.example.json` - Updated `SKILL.md` provider mentions and env metadata - Bumped package version to `2.9.0` ## [2.8.6] - 2026-03-03 ### Changed - Documented Perplexity Sonar Pro usage and refreshed release docs. ## [2.8.5] - 2026-02-20 ### โœจ Feature: Perplexity freshness filter - Added `freshness` parameter to Perplexity provider (`day`, `week`, `month`, `year`) - Maps to Perplexity's native `search_recency_filter` parameter - Example: `python3 scripts/search.py -p perplexity -q "latest AI news" --freshness day` - Consistent with freshness support in Serper and Brave providers ## [2.8.4] - 2026-02-20 ### ๐Ÿ”’ Security Fix: SSRF protection in setup wizard - **Fixed:** `setup.py` SearXNG connection test had no SSRF protection (unlike `search.py`) - **Before:** Operator could be tricked into probing internal networks during setup - **After:** Same IP validation as `search.py` โ€” blocks private IPs, cloud metadata, loopback - **Credit:** ClawHub security scanner ## [2.8.3] - 2026-02-20 ### ๐Ÿ› Critical Fix: Perplexity results empty - **Fixed:** Perplexity provider returned 0 results because the AI-synthesized answer wasn't mapped into the results array - **Before:** Only extracted URLs from the answer text were returned as results (often 0) - **After:** The full answer is now the primary result (title, snippet with cleaned text), extracted source URLs follow as additional results - **Impact:** Perplexity queries now always return at least 1 result with the synthesized answer ## [2.8.0] - 2026-02-20 ### ๐Ÿ†• New Provider: Perplexity (AI-Synthesized Answers) Added Perplexity as the 6th search provider via Kilo Gateway โ€” the first provider that returns **direct answers with citations** instead of just links: #### Features - **AI-Synthesized Answers**: Get a complete answer, not a list of links - **Inline Citations**: Every claim backed by `[1][2][3]` source references - **Real-Time Web Search**: Perplexity searches the web live, reads pages, and summarizes - **Zero Extra Config**: Works through Kilo Gateway with your existing `KILOCODE_API_KEY` - **Model**: `perplexity/sonar-pro` (best quality, supports complex queries) #### Auto-Routing Signals New direct-answer intent detection routes to Perplexity for: - Status queries: "status of", "current state of", "what is the status" - Local info: "events in [city]", "things to do in", "what's happening in" - Direct questions: "what is", "who is", "when did", "how many" - Current affairs: "this week", "this weekend", "right now", "today" #### Usage Examples ```bash # Auto-routed python3 scripts/search.py -q "events in Graz Austria this weekend" # โ†’ Perplexity python3 scripts/search.py -q "what is the current status of Ethereum" # โ†’ Perplexity # Explicit python3 scripts/search.py -p perplexity -q "latest AI regulation news" ``` #### Configuration Requires `KILOCODE_API_KEY` environment variable (Kilo Gateway account). No additional API key needed โ€” Perplexity is accessed through Kilo's unified API. ```bash export KILOCODE_API_KEY="your-kilo-key" ``` ### ๐Ÿ”ง Routing Rebalance Major overhaul of the auto-routing confidence scoring to fix Serper dominance: #### Problem Serper (Google) was winning ~90% of queries due to: - High recency multiplier boosting Serper on any query with dates/years - Default provider priority placing Serper first in ties - Research and discovery signals not strong enough to override #### Changes - **Lowered Serper recency multiplier** โ€” date mentions no longer auto-route to Google - **Strengthened research signals** for Tavily: - Added: "status of", "what happened with", "how does X compare" - Boosted weights for comparison patterns (4.0 โ†’ 5.0) - **Strengthened discovery signals** for Exa: - Added: "events in", "things to do in", "startups similar to" - Boosted weights for local discovery patterns - **Updated provider priority order**: `tavily โ†’ exa โ†’ perplexity โ†’ serper โ†’ you โ†’ searxng` - Serper moved from 1st to 4th in tie-breaking - Research/discovery providers now win on ambiguous queries #### Routing Test Results | Query | Before | After | โœ“ | |-------|--------|-------|---| | "latest OpenClaw version Feb 2026" | Serper | Serper | โœ… | | "Ethereum Pectra upgrade status" | Serper | **Tavily** | โœ… | | "events in Graz this weekend" | Serper | **Perplexity** | โœ… | | "compare SearXNG vs Brave for AI agents" | Serper | **Tavily** | โœ… | | "Sam Altman OpenAI news this week" | Serper | Serper | โœ… | | "find startups similar to Kilo Code" | Serper | **Exa** | โœ… | ### ๐Ÿ“Š Updated Provider Comparison | Feature | Serper | Tavily | Exa | Perplexity | You.com | SearXNG | |---------|:------:|:------:|:---:|:----------:|:-------:|:-------:| | Speed | โšกโšกโšก | โšกโšก | โšกโšก | โšกโšก | โšกโšกโšก | โšก | | Direct Answers | โœ— | โœ— | โœ— | โœ“โœ“ | โœ— | โœ— | | Citations | โœ— | โœ— | โœ— | โœ“ | โœ— | โœ— | | Local Events | โœ“ | โœ— | โœ“ | โœ“โœ“ | โœ— | โœ“ | | Research | โœ— | โœ“โœ“ | โœ“ | โœ“ | โœ“ | โœ— | | Discovery | โœ— | โœ— | โœ“โœ“ | โœ— | โœ— | โœ— | | Self-Hosted | โœ— | โœ— | โœ— | โœ— | โœ— | โœ“ | ## [2.7.0] - 2026-02-14 ### โœจ Added - Provider cooldown tracking in `.cache/provider_health.json` - Exponential cooldown on provider failures: **1m โ†’ 5m โ†’ 25m โ†’ 1h (cap)** - Retry strategy for transient failures (timeout, 429, 503): up to 2 retries with backoff **1s โ†’ 3s โ†’ 9s** - Smarter cache keys hashed from full request context (query/provider/max_results + locale, freshness, time_range, topic, search_engines, include_news, and related params) - Cross-provider result deduplication by normalized URL during fallback merge ### ๐Ÿ”ง Changed - Cooldown providers are skipped in routing while their cooldown is active - Provider health is reset automatically after successful requests - Fallback output now includes dedup metadata: - `deduplicated: true|false` - `metadata.dedup_count` ## [2.6.5] - 2026-02-11 ### ๐Ÿ†• File-Based Result Caching Added local caching to save API costs on repeated searches: #### Features - **Automatic Caching**: Search results cached locally by default - **1-Hour TTL**: Results expire after 3600 seconds (configurable) - **Cache Indicators**: Response includes `cached: true/false` and `cache_age_seconds` - **Zero-Cost Repeats**: Cached requests don't hit APIs #### New CLI Options - `--cache-ttl SECONDS` โ€” Custom cache TTL (default: 3600) - `--no-cache` โ€” Bypass cache, always fetch fresh - `--clear-cache` โ€” Delete all cached results - `--cache-stats` โ€” Show cache statistics (entries, size, age) #### Configuration - **Cache directory**: `.cache/` in skill directory - **Environment variable**: `WSP_CACHE_DIR` to override location - **Cache key**: Based on query + provider + max_results (SHA256) #### Usage Examples ```bash # First request costs API credits python3 scripts/search.py -q "AI startups" # Second request is FREE (uses cache) python3 scripts/search.py -q "AI startups" # Force fresh results python3 scripts/search.py -q "AI startups" --no-cache # View stats python3 scripts/search.py --cache-stats # Clear everything python3 scripts/search.py --clear-cache ``` #### Technical Details - Cache files: JSON with metadata (_cache_timestamp, _cache_key, etc.) - Automatic cleanup of expired entries on access - Graceful handling of corrupted cache files ## [2.6.1] - 2026-02-04 - Privacy cleanup: removed hardcoded paths and personal info from docs ## [2.5.0] - 2026-02-03 ### ๐Ÿ†• New Provider: SearXNG (Privacy-First Meta-Search) Added SearXNG as the 5th search provider, focused on privacy and self-hosted search: #### Features - **Privacy-Preserving**: No tracking, no profiling โ€” your searches stay private - **Multi-Source Aggregation**: Queries 70+ upstream engines (Google, Bing, DuckDuckGo, etc.) - **$0 API Cost**: Self-hosted = unlimited queries with no API fees - **Diverse Results**: Get perspectives from multiple search engines in one query - **Customizable**: Choose which engines to use, set SafeSearch levels, language preferences #### Auto-Routing Signals New privacy/multi-source intent detection routes to SearXNG for: - Privacy queries: "private", "anonymous", "without tracking", "no tracking" - Multi-source: "aggregate results", "multiple sources", "diverse perspectives" - Budget/free: "free search", "no api cost", "self-hosted search" - German: "privat", "anonym", "ohne tracking", "verschiedene quellen" #### Usage Examples ```bash # Auto-routed python3 scripts/search.py -q "search privately without tracking" # โ†’ SearXNG # Explicit python3 scripts/search.py -p searxng -q "linux distros" python3 scripts/search.py -p searxng -q "AI news" --engines "google,bing,duckduckgo" python3 scripts/search.py -p searxng -q "privacy tools" --searxng-safesearch 2 ``` #### Configuration ```json { "searxng": { "instance_url": "https://your-instance.example.com", "safesearch": 0, "engines": null, "language": "en" } } ``` #### Setup SearXNG requires a self-hosted instance with JSON format enabled: ```bash # Docker setup (5 minutes) docker run -d -p 8080:8080 searxng/searxng # Enable JSON in settings.yml: # search: # formats: [html, json] # Set instance URL export SEARXNG_INSTANCE_URL="http://localhost:8080" ``` See: https://docs.searxng.org/admin/installation.html ### ๐Ÿ“Š Updated Provider Comparison | Feature | Serper | Tavily | Exa | You.com | SearXNG | |---------|:------:|:------:|:---:|:-------:|:-------:| | Privacy-First | โœ— | โœ— | โœ— | โœ— | โœ“โœ“ | | Self-Hosted | โœ— | โœ— | โœ— | โœ— | โœ“ | | API Cost | $$ | $$ | $$ | $ | **FREE** | | Multi-Engine | โœ— | โœ— | โœ— | โœ— | โœ“ (70+) | ### ๐Ÿ”ง Technical Changes - Added `search_searxng()` function with full error handling - Added `PRIVACY_SIGNALS` to QueryAnalyzer for auto-routing - Updated setup wizard with SearXNG option (instance URL validation) - Updated config.example.json with searxng section - New CLI args: `--searxng-url`, `--searxng-safesearch`, `--engines`, `--categories` --- ## [2.4.4] - 2026-02-03 ### ๐Ÿ“ Documentation: Provider Count Fix - **Fixed:** "You can use 1, 2, or all 3" โ†’ "1, 2, 3, or all 4" (we have 4 providers now!) - **Impact:** Accurate documentation for setup wizard ## [2.4.3] - 2026-02-03 ### ๐Ÿ“ Documentation: Updated README - **Added:** "NEW in v2.4.2" badge for You.com in SKILL.md - **Impact:** ClawHub README now properly highlights You.com as new feature ## [2.4.2] - 2026-02-03 ### ๐Ÿ› Critical Fix: You.com API Configuration - **Fixed:** Incorrect hostname (`api.ydc-index.io` โ†’ `ydc-index.io`) - **Fixed:** Incorrect header name (`X-API-Key` โ†’ `X-API-KEY` uppercase) - **Impact:** You.com now works correctly - was giving 403 Forbidden before - **Status:** โœ… Fully tested and working ## [2.4.1] - 2026-02-03 ### ๐Ÿ› Bugfix: You.com URL Encoding - **Fixed:** URL encoding for You.com queries - spaces and special characters now properly encoded - **Impact:** Queries with spaces (e.g., "OpenClaw AI framework") work correctly now - **Technical:** Added `urllib.parse.quote` for parameter encoding ## [2.4.0] - 2026-02-03 ### ๐Ÿ†• New Provider: You.com Added You.com as the 4th search provider, optimized for RAG applications and real-time information: #### Features - **LLM-Ready Snippets**: Pre-extracted, query-aware text excerpts perfect for feeding into AI models - **Unified Web + News**: Get both web pages and news articles in a single API call - **Live Crawling**: Fetch full page content on-demand in Markdown format (`--livecrawl`) - **Automatic News Classification**: Intelligently includes news results based on query intent - **Freshness Controls**: Filter by recency (day, week, month, year, or date range) - **SafeSearch Support**: Content filtering (off, moderate, strict) #### Auto-Routing Signals New RAG/Real-time intent detection routes to You.com for: - RAG context queries: "summarize", "key points", "tldr", "context for" - Real-time info: "latest news", "current status", "right now", "what's happening" - Information synthesis: "updates on", "situation", "main takeaways" #### Usage Examples ```bash # Auto-routed python3 scripts/search.py -q "summarize key points about AI regulation" # โ†’ You.com # Explicit python3 scripts/search.py -p you -q "climate change" --livecrawl all python3 scripts/search.py -p you -q "tech news" --freshness week ``` #### Configuration ```json { "you": { "country": "US", "language": "en", "safesearch": "moderate", "include_news": true } } ``` #### API Key Setup ```bash export YOU_API_KEY="your-key" # Get from https://api.you.com ``` ### ๐Ÿ“Š Updated Provider Comparison | Feature | Serper | Tavily | Exa | You.com | |---------|:------:|:------:|:---:|:-------:| | Speed | โšกโšกโšก | โšกโšก | โšกโšก | โšกโšกโšก | | News Integration | โœ“ | โœ— | โœ— | โœ“ | | RAG-Optimized | โœ— | โœ“ | โœ— | โœ“โœ“ | | Full Page Content | โœ— | โœ“ | โœ“ | โœ“ | --- ## [2.1.5] - 2026-01-27 ### ๐Ÿ“ Documentation - Added warning about NOT using Tavily/Serper/Exa in core OpenClaw config - Core OpenClaw only supports `brave` as the built-in provider - This skill's providers must be used via environment variables and scripts, not `openclaw.json` ## [2.1.0] - 2026-01-23 ### ๐Ÿง  Intelligent Multi-Signal Routing Completely overhauled auto-routing with sophisticated query analysis: #### Intent Classification - **Shopping Intent**: Detects price patterns ("how much", "cost of"), purchase signals ("buy", "order"), deal keywords, and product+brand combinations - **Research Intent**: Identifies explanation patterns ("how does", "why does"), analysis signals ("pros and cons", "compare"), learning keywords, and complex multi-clause queries - **Discovery Intent**: Recognizes similarity patterns ("similar to", "alternatives"), company discovery signals, URL/domain detection, and academic patterns #### Linguistic Pattern Detection - "How much" / "price of" โ†’ Shopping (Serper) - "How does" / "Why does" / "Explain" โ†’ Research (Tavily) - "Companies like" / "Similar to" / "Alternatives" โ†’ Discovery (Exa) - Product + Brand name combos โ†’ Shopping (Serper) - URLs and domains in query โ†’ Similar search (Exa) #### Query Analysis Features - **Complexity scoring**: Long, multi-clause queries get routed to research providers - **URL detection**: Automatic detection of URLs/domains triggers Exa similar search - **Brand recognition**: Tech brands (Apple, Samsung, Sony, etc.) with product terms โ†’ shopping - **Recency signals**: "latest", "2026", "breaking" boost news mode #### Confidence Scoring - **HIGH (70-100%)**: Strong signal match, very reliable routing - **MEDIUM (40-69%)**: Good match, should work well - **LOW (0-39%)**: Ambiguous query, using fallback provider - Confidence based on absolute signal strength + relative margin over alternatives #### Enhanced Debug Mode ```bash python3 scripts/search.py --explain-routing -q "your query" ``` Now shows: - Routing decision with confidence level - All provider scores - Top matched signals with weights - Query analysis (complexity, URL detection, recency focus) - All matched patterns per provider ### ๐Ÿ”ง Technical Changes #### QueryAnalyzer Class New `QueryAnalyzer` class with: - `SHOPPING_SIGNALS`: 25+ weighted patterns for shopping intent - `RESEARCH_SIGNALS`: 30+ weighted patterns for research intent - `DISCOVERY_SIGNALS`: 20+ weighted patterns for discovery intent - `LOCAL_NEWS_SIGNALS`: 25+ patterns for local/news queries - `BRAND_PATTERNS`: Tech brand detection regex #### Signal Weighting - Multi-word phrases get higher weights (e.g., "how much" = 4.0 vs "price" = 3.0) - Strong signals: price patterns (4.0), similarity patterns (5.0), URLs (5.0) - Medium signals: product terms (2.5), learning keywords (2.5) - Bonus scoring: Product+brand combo (+3.0), complex query (+2.5) #### Improved Output Format ```json { "routing": { "auto_routed": true, "provider": "serper", "confidence": 0.78, "confidence_level": "high", "reason": "high_confidence_match", "top_signals": [{"matched": "price", "weight": 3.0}], "scores": {"serper": 7.0, "tavily": 0.0, "exa": 0.0} } } ``` ### ๐Ÿ“š Documentation Updates - **SKILL.md**: Complete rewrite with signal tables and confidence scoring guide - **README.md**: Updated with intelligent routing examples and confidence levels - **FAQ**: Updated to explain multi-signal analysis ### ๐Ÿงช Test Results | Query | Provider | Confidence | Signals | |-------|----------|------------|---------| | "how much does iPhone 16 cost" | Serper | 68% | "how much", brand+product | | "how does quantum entanglement work" | Tavily | 86% HIGH | "how does", "what are", "implications" | | "startups similar to Notion" | Exa | 76% HIGH | "similar to", "Series A" | | "companies like stripe.com" | Exa | 100% HIGH | URL detected, "companies like" | | "MacBook Pro M3 specs review" | Serper | 70% HIGH | brand+product, "specs", "review" | | "Tesla" | Serper | 0% LOW | No signals (fallback) | | "arxiv papers on transformers" | Exa | 58% | "arxiv" | | "latest AI news 2026" | Serper | 77% HIGH | "latest", "news", "2026" | --- ## [2.0.0] - 2026-01-23 ### ๐ŸŽ‰ Major Features #### Smart Auto-Routing - **Automatic provider selection** based on query analysis - No need to manually choose provider - just search! - Intelligent keyword matching for routing decisions - Pattern detection for query types (shopping, research, discovery) - Scoring system for provider selection #### User Configuration - **config.json**: Full control over auto-routing behavior - **Configurable keyword mappings**: Add your own routing keywords - **Provider priority**: Set tie-breaker order - **Disable providers**: Turn off providers you don't have API keys for - **Enable/disable auto-routing**: Opt-in or opt-out as needed #### Debugging Tools - **--explain-routing** flag: See exactly why a provider was selected - Detailed routing metadata in JSON responses - Shows matched keywords and routing scores ### ๐Ÿ“š Documentation - **README.md**: Complete auto-routing guide with examples - **SKILL.md**: Detailed routing logic and configuration reference - **FAQ section**: Common questions about auto-routing - **Configuration examples**: Pre-built configs for common use cases --- ## [1.0.x] - Initial Release - Multi-provider search: Serper, Tavily, Exa - Manual provider selection with `-p` flag - Unified JSON output format - Provider-specific options (--depth, --category, --similar-url, etc.) - Domain filtering for Tavily/Exa - Date filtering for Exa