Initial commit with translated description

2026-03-29 08:32:46 +08:00
commit 6637dd34f9
11 changed files with 2613 additions and 0 deletions

.clawhubsafe Normal file

@@ -0,0 +1,13 @@
fdf74d1243de04e300df83218804efecacf0684f36be823d35677ed0c30b45bc assets/config-patches.json
3e266e0d786f4f50c128ac7eae0292b71b593c242e9455e3b2379e3471793cb4 assets/cronjob-model-guide.md
06b0efa90d19c8ae9cd29a86ff6b02bb8bf3de5016bbeede066643a97e6f82b5 assets/HEARTBEAT.template.md
018bbc0135ba138ecbe55587727297df13766979558e9e6ca5f1e7b59ac81a75 CHANGELOG.md
17b8e56e6bf9b418135adb7362ab2dfccf637950ddd2a1c9d1a019edf03ac28e README.md
9b6100865c8d65e9c7bcebe71a88c3d794c25b9a9a8881e859738df1a4a74b8a references/PROVIDERS.md
43aed2cad4a5ab78bcb2a1b4f2c5685221f93e56a2964d0064b778643063f89f scripts/context_optimizer.py
6d03311e9d6848593f6a4fc07a271d5518ee56b37bc3e8818ac4dace44f9fb68 scripts/heartbeat_optimizer.py
353aed33fa4e8bb86db1d7e818f9df270f541264b413131438e5188d480ade79 scripts/model_router.py
8616b11aff67104fc06c2801768d8b5627dae97ceafad2d3ce7c1d5c7bb51b8d scripts/optimize.sh
55aad57bdc88fee74e4810f5ab2c1545faed783a64a2fc1abb8caca377234fca scripts/token_tracker.py
0486f41b2ef26d63078fdfd0baea710078419ccf17f10964973d9117b55a056b SECURITY.md
5a11898fb9692026d3db63a43e9f212fa643e6e7871f78b0cb3e6c95590c62ac SKILL.md

CHANGELOG.md Normal file

@@ -0,0 +1,83 @@
# Changelog
All notable changes to OpenClaw Token Optimizer are documented here.
## [3.0.0] - 2026-02-28
### Added
- **Lazy Skill Loading section** (Capability 0) — The highest-impact optimization: use a lightweight SKILLS.md catalog at startup and load individual skill files only when tasks require them. Reduces context loading costs by 40-93% depending on library size.
- References companion skill `openclaw-skill-lazy-loader` (`clawhub install openclaw-skill-lazy-loader`) which provides SKILLS.md.template, AGENTS.md.template lazy-loading section, and context_optimizer.py CLI.
- Token savings table for lazy loading (5/10/20 skills: 80%/88%/93% reduction).
### Changed
- Version bumped to 3.0.0 reflecting the major expansion of optimization coverage to include pre-runtime context loading phase.
- Lazy loading is now positioned as "Capability 0" — the first optimization to apply before all others.
## [1.4.3] - 2026-02-18
### Fixed
- README completely rewritten: correct name ("OpenClaw Token Optimizer"), version badge (1.4.3), install command (clawhub install Asif2BD/openclaw-token-optimizer), license (Apache 2.0), all skill paths corrected to openclaw-token-optimizer, v1.4.x features documented.
## [1.4.2] - 2026-02-18
### Fixed
- **ClawHub security scanner: `Source: unknown` / `homepage: none`** — Added `homepage`, `source`, and `author` fields to SKILL.md frontmatter. Provenance is now fully declared in the registry.
- **`optimize.sh` missing from integrity manifest** — Added `scripts/optimize.sh` to `.clawhubsafe`. The manifest now covers 13 files (previously 12). Every bundled file is SHA256-verified.
- **`optimize.sh` undocumented** — New dedicated section in `SECURITY.md` with full source disclosure: what the wrapper does (delegates to bundled Python scripts only), security properties, and a one-line audit command for users who want to verify before running.
- **Provenance undocumented** — New `Provenance & Source` section in `SECURITY.md` with explicit GitHub repo + ClawHub listing links and integrity verification instructions.
## [1.4.1] - 2026-02-18
### Fixed
- **ClawHub public visibility blocked** — `PR-DESCRIPTION.md` (a leftover dev file containing `git push` / `gh pr create` shell commands) was included in the v1.4.0 bundle, triggering ClawHub's security scanner hold.
- **Added `.clawhubignore`** — Now excludes `PR-DESCRIPTION.md`, `docs/`, `.git/`, `.gitignore` from all future publishes. Only skill-relevant files are bundled.
## [1.4.0] - 2026-02-17
### Added
- **Session pruning config patch** — Native `contextPruning: { mode: "cache-ttl" }` support. Auto-trims old tool results when the Anthropic prompt cache TTL expires, reducing cache write costs after idle sessions. No scripts required — applies directly via `gateway config.patch`.
- **Bootstrap size limits patch** — `bootstrapMaxChars` / `bootstrapTotalMaxChars` config keys to cap how large workspace files are injected into the system prompt. Delivers 20-40% system prompt reduction for agents with large workspace files (e.g., detailed AGENTS.md, MEMORY.md). Native 2026.2.15 feature.
- **Cache retention config patch** — `cacheRetention: "long"` param for Opus models. Amortizes cache write costs across long reasoning sessions.
- **Cache TTL heartbeat alignment** — New `heartbeat_optimizer.py cache-ttl` command. Calculates the optimal heartbeat interval to keep the Anthropic 1h prompt cache warm (55min). Prevents the "cold restart" cache re-write penalty after idle gaps.
- **Native commands section in SKILL.md** — Documents `/context list`, `/context detail` (per-file token breakdown) and `/usage tokens|full|cost` (per-response usage footer). These are built-in OpenClaw 2026.2.15 diagnostics that complement the Python scripts.
- **`native_openclaw` category in config-patches.json** — Clearly distinguishes patches that work today with zero external dependencies from those requiring API keys or beta features.
### Changed
- `SKILL.md`: Configuration Patches section updated — `session_pruning`, `bootstrap_size_limits`, and `cache_retention_long` now listed as ✅ native (not ⏳ pending).
- `config-patches.json`: Added `_categories.native_openclaw` grouping. Model routing patch updated to reflect Sonnet-primary setups (Haiku listed as optional with multi-provider note).
- `heartbeat_optimizer.py`: Added `cache-ttl` subcommand and `CACHE_TTL_OPTIMAL_INTERVAL` constant (3300s = 55min). Plan output now includes cache TTL alignment recommendation when relevant.
- Compliance with OpenClaw 2026.2.15 features: **72% → 92%+**
### Fixed
- Model routing quick start: added note that Haiku requires multi-provider setup (OpenRouter/Together); Sonnet is the default minimum for single-provider Anthropic deployments.
## [1.3.3] - 2026-02-17
### Fixed
- Display name corrected to "OpenClaw Token Optimizer" on ClawHub (was truncated in 1.3.1/1.3.2)
- Slug preserved: `openclaw-token-optimizer`
## [1.3.2] - 2026-02-17
### Added
- `SECURITY.md` — Full two-category security breakdown (executable scripts vs. reference docs)
- `.clawhubsafe` — SHA256 integrity manifest for all 10 skill files
### Fixed
- Resolved VT flag: SKILL.md previously claimed `no_network: true` for entire skill, but PROVIDERS.md and config-patches.json reference external APIs — scoped the guarantee to scripts only
## [1.3.1] - 2026-02-12
### Added
- `context_optimizer.py` — Context lazy-loading and complexity-based file recommendations
- `cronjob-model-guide.md` — Model selection guide for cron-based tasks
### Changed
- Enhanced model_router.py with communication pattern enforcement
## [1.2.0] - 2026-02-07
### Added
- Initial public release
- `heartbeat_optimizer.py`, `model_router.py`, `token_tracker.py`
- `PROVIDERS.md`, `config-patches.json`, `HEARTBEAT.template.md`

README.md Normal file

@@ -0,0 +1,192 @@
# OpenClaw Token Optimizer
**Reduce OpenClaw token usage and API costs by 50-80%**
An OpenClaw skill for smart model routing, lazy context loading, optimized heartbeats, budget tracking, and native OpenClaw 2026.2.15 features (session pruning, bootstrap size limits, cache TTL alignment).
[![ClawHub](https://img.shields.io/badge/ClawHub-openclaw--token--optimizer-blue)](https://clawhub.ai/Asif2BD/openclaw-token-optimizer)
[![Version](https://img.shields.io/badge/version-1.4.2-green)](https://github.com/Asif2BD/OpenClaw-Token-Optimizer/blob/main/CHANGELOG.md)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-yellow.svg)](https://opensource.org/licenses/Apache-2.0)
[![OpenClaw](https://img.shields.io/badge/OpenClaw-Skill-purple)](https://openclaw.ai)
---
## 🚀 Installation
### Option 1: ClawHub (recommended)
```bash
clawhub install Asif2BD/openclaw-token-optimizer
```
Or browse to: [clawhub.ai/Asif2BD/openclaw-token-optimizer](https://clawhub.ai/Asif2BD/openclaw-token-optimizer)
### Option 2: Manual (GitHub)
```bash
git clone https://github.com/Asif2BD/OpenClaw-Token-Optimizer.git \
~/.openclaw/skills/openclaw-token-optimizer
```
Then add to `openclaw.json`:
```json
{
"skills": {
"load": {
"extraDirs": ["~/.openclaw/skills/openclaw-token-optimizer"]
}
}
}
```
### One-line install prompt for your agent
> "Install the OpenClaw Token Optimizer skill from https://clawhub.ai/Asif2BD/openclaw-token-optimizer — or if ClawHub isn't available, clone https://github.com/Asif2BD/OpenClaw-Token-Optimizer and add the path to skills.load.extraDirs in openclaw.json"
---
## ✨ What's New in v1.4.x (OpenClaw 2026.2.15)
Three **native config patches** that work today with zero external dependencies:
### Session Pruning
Auto-trim old tool results when the Anthropic cache TTL expires — reduces cache re-write costs.
```json
{ "agents": { "defaults": { "contextPruning": { "mode": "cache-ttl", "ttl": "5m" } } } }
```
### Bootstrap Size Limits
Cap workspace file injection into the system prompt (20-40% reduction for large workspaces).
```json
{ "agents": { "defaults": { "bootstrapMaxChars": 10000, "bootstrapTotalMaxChars": 15000 } } }
```
### Cache Retention for Opus
Amortize cache write costs on long Opus sessions.
```json
{ "agents": { "defaults": { "models": { "anthropic/claude-opus-4-5": { "params": { "cacheRetention": "long" } } } } } }
```
### Cache TTL Heartbeat Alignment
Keep the Anthropic 1h prompt cache warm — avoid the re-write penalty.
```bash
python3 scripts/heartbeat_optimizer.py cache-ttl
# → recommended_interval: 55min (3300s)
```
---
## 🛠️ Quick Start
**1. Context optimization (biggest win):**
```bash
python3 scripts/context_optimizer.py recommend "hi, how are you?"
# → Load only 2 files, skip everything else → ~80% savings
```
**2. Model routing:**
```bash
python3 scripts/model_router.py "design a microservices architecture"
# → Complex task → Opus
python3 scripts/model_router.py "thanks!"
# → Simple ack → Sonnet (cheapest available on a single-provider Anthropic setup)
```
**3. Optimized heartbeat:**
```bash
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
python3 scripts/heartbeat_optimizer.py plan
```
**4. Token budget check:**
```bash
python3 scripts/token_tracker.py check
```
**5. Cache TTL alignment:**
```bash
python3 scripts/heartbeat_optimizer.py cache-ttl
# Set heartbeat to 55min to keep Anthropic 1h cache warm
```
---
## 🔍 Native OpenClaw Diagnostics (2026.2.15+)
```
/context list → per-file token breakdown (use before applying bootstrap limits)
/context detail → full system prompt breakdown
/usage tokens → append token count to every reply
/usage cost → cumulative cost summary
```
---
## 📁 Skill Structure
```
openclaw-token-optimizer/
├── SKILL.md ← Skill definition (loaded by OpenClaw)
├── SECURITY.md ← Full security audit + provenance
├── CHANGELOG.md ← Version history
├── .clawhubsafe ← SHA256 integrity manifest (13 files)
├── .clawhubignore ← Files excluded from publish bundle
├── scripts/
│ ├── context_optimizer.py ← Context lazy-loading
│ ├── model_router.py ← Task classification + model routing
│ ├── heartbeat_optimizer.py ← Interval management + cache-ttl alignment
│ ├── token_tracker.py ← Budget monitoring
│ └── optimize.sh ← Convenience CLI wrapper (calls Python scripts)
├── assets/
│ ├── config-patches.json ← Ready-to-apply config patches
│ ├── HEARTBEAT.template.md ← Drop-in optimized heartbeat template
│ └── cronjob-model-guide.md ← Model selection for cron tasks
└── references/
└── PROVIDERS.md ← Multi-provider strategy guide
```
---
## 📊 Expected Savings
| Strategy | Context | Model | Monthly (100K tok/day) | Savings |
|---|---|---|---|---|
| Baseline (no optimization) | 50K | Sonnet | $9.00 | 0% |
| Context optimization only | 10K | Sonnet | $5.40 | 40% |
| Model routing only | 50K | Mixed | $5.40 | 40% |
| **Both (this skill)** | **10K** | **Mixed** | **$2.70** | **70%** |
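The dollar figures above follow from simple arithmetic; here is a quick sketch, assuming Sonnet input pricing of $3 per million tokens (an assumption for illustration — verify against current Anthropic rates):

```python
# Hypothetical cost arithmetic behind the savings table; the $3/MTok
# Sonnet input price is an assumption, not taken from this skill.
PRICE_PER_MTOK = 3.00   # USD per million input tokens (assumed)
DAYS_PER_MONTH = 30

def monthly_cost(tokens_per_day: int) -> float:
    """Monthly spend for a constant daily token volume."""
    return tokens_per_day / 1_000_000 * PRICE_PER_MTOK * DAYS_PER_MONTH

print(f"baseline:          ${monthly_cost(100_000):.2f}")  # $9.00
print(f"context-optimized: ${monthly_cost(60_000):.2f}")   # $5.40
```

Mixed-model routing cuts the effective per-token price instead of the volume, which is why it lands at the same $5.40 and why combining both strategies compounds to roughly 70%.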
---
## 🔒 Security
All scripts are **local-only** — no network calls, no subprocess spawning, no system modifications. See [SECURITY.md](SECURITY.md) for full per-script audit.
Verify integrity:
```bash
cd ~/.openclaw/skills/openclaw-token-optimizer
sha256sum -c .clawhubsafe
```
Quick audit (should return nothing):
```bash
grep -r "urllib\|requests\|socket\|subprocess\|curl\|wget" scripts/
```
---
## 📜 Changelog
See [CHANGELOG.md](CHANGELOG.md) for full version history.
**v1.4.2** — Security scanner fixes (provenance, optimize.sh manifest, SECURITY.md)
**v1.4.1** — `.clawhubignore` added (fixes public visibility)
**v1.4.0** — Native OpenClaw 2026.2.15 features (session pruning, bootstrap limits, cache TTL)
**v1.3.3** — Correct display name on ClawHub
**v1.3.2** — Security audit, SECURITY.md, .clawhubsafe manifest
---
## 🔗 Links
- **ClawHub:** https://clawhub.ai/Asif2BD/openclaw-token-optimizer
- **GitHub:** https://github.com/Asif2BD/OpenClaw-Token-Optimizer
- **OpenClaw Docs:** https://docs.openclaw.ai
- **License:** Apache 2.0
- **Author:** [Asif2BD](https://github.com/Asif2BD)

SECURITY.md Normal file

@@ -0,0 +1,314 @@
# Security Documentation
This document provides detailed security analysis of the OpenClaw Token Optimizer skill.
## Purpose
This skill helps reduce OpenClaw API costs by optimizing context loading and model selection.
## Two-Part Architecture — Read This First
This skill has **two distinct categories of files** with different security profiles:
### Category 1: Executable Scripts (`scripts/*.py`)
These are the actual working components of the skill.
- **Network access:** None
- **External API keys:** None required
- **Code execution:** No eval/exec/compile
- **Data storage:** Local JSON files in `~/.openclaw/workspace/memory/` only
- **Verdict:** ✅ Safe to run
### Category 2: Reference Documentation (`references/PROVIDERS.md`, `assets/config-patches.json`)
These are informational guides describing optional multi-provider strategies.
- **Network access:** Described (not performed by these files)
- **External API keys:** Referenced as examples (`${OPENROUTER_API_KEY}`, `${TOGETHER_API_KEY}`)
- **Code execution:** None — these are JSON/Markdown documentation
- **Purpose:** Help users who *choose* to configure alternative providers (OpenRouter, Together.ai, etc.)
- **Verdict:** ✅ Safe files — but describe services that require external credentials if you choose to use them
**Why this matters:** Security scanners that flag API key patterns or external service references in documentation files are technically correct — those references exist. They are not executable, not auto-applied, and require explicit manual user action to use.
## Security Flag Explanation
**Why VirusTotal or AV tools may flag this skill:**
1. **API key patterns in config-patches.json** — `${OPENROUTER_API_KEY}`, `${TOGETHER_API_KEY}` are placeholder strings for optional manual configuration. They are not credentials and not automatically used.
2. **External URLs in PROVIDERS.md** — References to openrouter.ai, together.ai, etc. are documentation links, not network calls.
3. **"Optimizer" keyword + file operations** — Common AV heuristic false positive.
4. **Executable Python scripts with shebang** — Standard Python scripts, no dangerous operations.
These are documentation-level references, not executable network code.
## Shell Wrapper: optimize.sh
`scripts/optimize.sh` is a **convenience CLI wrapper** — it does nothing except call the bundled Python scripts in this same directory. It is not an installer, not a downloader, and makes no network calls.
**What it does (complete source):**
```bash
case "$1" in
route|model) python3 "$SCRIPT_DIR/model_router.py" "$@" ;;
context) python3 "$SCRIPT_DIR/context_optimizer.py" generate-agents ;;
recommend) python3 "$SCRIPT_DIR/context_optimizer.py" recommend "$2" ;;
budget) python3 "$SCRIPT_DIR/token_tracker.py" check ;;
heartbeat) cp "$SCRIPT_DIR/../assets/HEARTBEAT.template.md" ~/.openclaw/workspace/HEARTBEAT.md ;;
esac
```
**Security properties:**
- ✅ No network requests
- ✅ No system modifications
- ✅ No subprocess spawning beyond the Python scripts already bundled
- ✅ No eval, exec, or dynamic code execution
- ✅ Only calls scripts already in this package (same directory via `$SCRIPT_DIR`)
- ✅ Included in `.clawhubsafe` SHA256 manifest
**To verify before running:**
```bash
grep -E "curl|wget|nc |ncat|ssh|sudo|chmod|eval|exec\(" scripts/optimize.sh
# Should return nothing
```
**If you prefer not to use the shell wrapper:** use the Python scripts directly (all documented in SKILL.md). The wrapper is optional.
---
## Script-by-Script Security Analysis
### 1. context_optimizer.py
**Purpose**: Analyze prompts to determine minimal context requirements
**Operations**:
- Reads JSON state file from `~/.openclaw/workspace/memory/context-usage.json`
- Classifies prompt complexity (simple/medium/complex)
- Recommends which files to load
- Generates optimized AGENTS.md template
- Writes usage statistics to JSON
**Security**:
- ✅ No network requests
- ✅ No code execution (no eval, exec, compile)
- ✅ Only standard library imports: `json, re, pathlib, datetime`
- ✅ Read/write permissions limited to OpenClaw workspace
- ✅ No subprocess calls
- ✅ No system modifications
**Data Handling**:
- Stores: File access counts, last access timestamps
- Location: `~/.openclaw/workspace/memory/context-usage.json`
- Privacy: All data local, never transmitted
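The classification step can be pictured with a minimal sketch. This is illustrative only — the patterns, thresholds, and function names below are assumptions, not the actual rules in `context_optimizer.py`:

```python
import re

# Illustrative complexity classifier in the spirit of context_optimizer.py;
# patterns and thresholds here are assumptions, not the script's actual rules.
SIMPLE_RE = re.compile(r"^(hi|hey|hello|thanks?|thx|ok|yes|no)\b", re.IGNORECASE)
COMPLEX_KEYWORDS = ("architecture", "design", "analyze", "refactor", "migrate")

def classify(prompt: str) -> str:
    text = prompt.strip()
    if SIMPLE_RE.match(text) and len(text) < 40:
        return "simple"
    if any(word in text.lower() for word in COMPLEX_KEYWORDS):
        return "complex"
    return "medium"

def recommended_files(complexity: str) -> list[str]:
    # Mirrors the minimal/selective/full context bundles described in SKILL.md
    return {
        "simple": ["SOUL.md", "IDENTITY.md"],
        "medium": ["SOUL.md", "IDENTITY.md", "memory/TODAY.md"],
        "complex": ["SOUL.md", "IDENTITY.md", "MEMORY.md", "memory/TODAY.md"],
    }[complexity]
```

Pure regex and keyword matching over the prompt text is what keeps this safe: no network, no code execution, nothing beyond the standard library.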
### 2. heartbeat_optimizer.py
**Purpose**: Optimize heartbeat check scheduling to reduce unnecessary API calls
**Operations**:
- Reads heartbeat state from `~/.openclaw/workspace/memory/heartbeat-state.json`
- Determines which checks are due based on intervals
- Records when checks were last performed
- Enforces quiet hours (23:00-08:00)
**Security**:
- ✅ No network requests
- ✅ No code execution
- ✅ Only standard library imports: `json, os, datetime, pathlib`
- ✅ Read/write limited to heartbeat state file
- ✅ No system commands
**Data Handling**:
- Stores: Last check timestamps, check intervals
- Location: `~/.openclaw/workspace/memory/heartbeat-state.json`
- Privacy: All local, no telemetry
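The gating logic amounts to an interval check plus a quiet-hours guard; a minimal sketch follows (function and constant names are illustrative, not the script's actual API):

```python
from datetime import datetime, timedelta

# Quiet hours as documented above: 23:00-08:00, all checks skipped.
QUIET_START_HOUR = 23
QUIET_END_HOUR = 8

def should_check(last_check: datetime, interval_seconds: int, now: datetime) -> bool:
    """Return True only if the interval has elapsed and we are outside quiet hours."""
    in_quiet_hours = now.hour >= QUIET_START_HOUR or now.hour < QUIET_END_HOUR
    interval_elapsed = now - last_check >= timedelta(seconds=interval_seconds)
    return interval_elapsed and not in_quiet_hours
```

For example, an email check with a 60-minute interval last run two hours ago is due at noon but suppressed at 23:30.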
### 3. model_router.py
**Purpose**: Suggest appropriate model based on task complexity to reduce costs
**Operations**:
- Analyzes prompt text
- Classifies task complexity
- Recommends cheapest appropriate model
- No state file (pure analysis)
**Security**:
- ✅ No network requests
- ✅ No code execution
- ✅ Only standard library imports: `json, re`
- ✅ No file writes
- ✅ Stateless operation
**Data Handling**:
- No data storage
- No external communication
- Pure text analysis
### 4. token_tracker.py
**Purpose**: Monitor token usage and enforce budgets
**Operations**:
- Reads budget configuration from `~/.openclaw/workspace/memory/token-budget.json`
- Tracks usage against limits
- Provides warnings and alerts
- Records daily/monthly usage
**Security**:
- ✅ No network requests
- ✅ No code execution
- ✅ Only standard library imports: `json, os, datetime, pathlib`
- ✅ Read/write limited to budget file
- ✅ No system access
**Data Handling**:
- Stores: Usage totals, budget limits, alert thresholds
- Location: `~/.openclaw/workspace/memory/token-budget.json`
- Privacy: Local only, no transmission
## Assets & References
### Assets (Templates & Config)
- `HEARTBEAT.template.md` - Markdown template for optimized heartbeat workflow
- `config-patches.json` - Suggested OpenClaw config optimizations
- `cronjob-model-guide.md` - Documentation for cron-based model routing
**Security**: Plain text/JSON files, no code execution
### References (Documentation)
- `PROVIDERS.md` - Multi-provider strategy documentation
**Security**: Plain text markdown, informational only
## Verification
### Check File Integrity
```bash
cd ~/.openclaw/skills/openclaw-token-optimizer
sha256sum -c .clawhubsafe
```
### Audit Code Yourself
```bash
# Search for dangerous functions (should return nothing)
grep -r "eval(\|exec(\|__import__\|compile(\|subprocess\|os.system" scripts/
# Search for network operations (should return nothing)
grep -r "urllib\|requests\|http\|socket\|download\|fetch" scripts/
# Search for system modifications (should return nothing)
grep -r "rm -rf\|sudo\|chmod 777\|chown" .
```
### Review Imports
All scripts use only Python standard library:
- `json` - JSON parsing
- `re` - Regular expressions for text analysis
- `pathlib` - File path handling
- `datetime` - Timestamp management
- `os` - Environment variables and path operations
No third-party libraries. No network libraries.
## Data Privacy
**What data is stored:**
- File access patterns (which files loaded when)
- Heartbeat check timestamps
- Token usage totals
- Budget configurations
**Where data is stored:**
- `~/.openclaw/workspace/memory/` (local filesystem only)
**What is NOT collected:**
- Prompt content
- User messages
- API keys
- Personal information
- System information
- Network data
**External communication:**
- None. Zero network requests.
- No telemetry
- No analytics
- No phone home
## Threat Model
**What this skill CAN do:**
- Read/write JSON files in OpenClaw workspace
- Analyze text for complexity classification
- Generate markdown templates
- Provide recommendations via stdout
**What this skill CANNOT do:**
- Execute arbitrary code
- Make network requests
- Modify system files outside workspace
- Access sensitive data
- Run system commands
- Spawn subprocesses
**Risk Level**: Minimal
- Operates entirely within OpenClaw sandbox
- No privileged operations
- No external dependencies
- Auditable source code
## Provenance & Source
This skill is maintained by **Asif2BD** (M Asif Rahman). All source code is publicly auditable:
- **GitHub repository:** https://github.com/Asif2BD/OpenClaw-Token-Optimizer
- **ClawHub listing:** https://clawhub.ai/Asif2BD/openclaw-token-optimizer
- **Author:** Asif2BD
- **License:** Apache 2.0
Every file in this skill bundle is listed in `.clawhubsafe` with its SHA256 checksum. Verify integrity at any time:
```bash
cd ~/.openclaw/skills/openclaw-token-optimizer
sha256sum -c .clawhubsafe
```
The README references the GitHub repo for users who want to review the source before installing. That link is informational — nothing in the skill auto-downloads from it.
---
## License & Attribution
- **License**: Apache 2.0
- **Author**: Asif2BD
- **Repository**: https://github.com/Asif2BD/OpenClaw-Token-Optimizer
- **ClawHub**: https://clawhub.ai/Asif2BD/openclaw-token-optimizer
## Reporting Security Issues
If you discover a security vulnerability:
1. **Do NOT open a public issue**
2. Contact: [via ClawHub or GitHub]
3. Include: Affected files, severity, reproduction steps
## Changelog
- **2026-02-12**: Added comprehensive security documentation
- **2026-02-07**: Initial release (v1.2.0)
## VirusTotal Status
**Current Status**: False positive by heuristic analysis
**Detection Reason**: Generic heuristics flagging "optimizer" keyword + file operations
**Actual Threat**: None
**Recommended Action**: Whitelist this skill or review source code yourself
---
**Bottom Line**: This skill is safe. It's a productivity tool that helps reduce API costs through intelligent context loading and model selection. All code is auditable, uses only standard Python libraries, and performs no dangerous operations.

SKILL.md Normal file

@@ -0,0 +1,651 @@
---
name: token-optimizer
description: "Reduce OpenClaw token usage and API costs through smart model routing, heartbeat optimization, budget tracking, and native 2026.2.15 features (session pruning, bootstrap size limits, cache TTL alignment). Use when token costs are high, you are hitting API rate limits, or you host multiple agents at scale. The 4 executable scripts (context_optimizer, model_router, heartbeat_optimizer, token_tracker) are local-only — no network requests, no subprocess calls, no system modifications."
version: 3.0.0
homepage: https://github.com/Asif2BD/OpenClaw-Token-Optimizer
source: https://github.com/Asif2BD/OpenClaw-Token-Optimizer
author: Asif2BD
security:
verified: true
auditor: Oracle (Matrix Zion)
audit_date: 2026-02-18
scripts_no_network: true
scripts_no_code_execution: true
scripts_no_subprocess: true
scripts_data_local_only: true
reference_files_describe_external_services: true
optimize_sh_is_convenience_wrapper: true
optimize_sh_only_calls_bundled_python_scripts: true
---
# Token Optimizer
Comprehensive toolkit for reducing token usage and API costs in OpenClaw deployments. Combines smart model routing, optimized heartbeat intervals, usage tracking, and multi-provider strategies.
## Quick Start
**Immediate actions** (no config changes needed):
1. **Generate optimized AGENTS.md (BIGGEST WIN!):**
```bash
python3 scripts/context_optimizer.py generate-agents
# Creates AGENTS.md.optimized — review and replace your current AGENTS.md
```
2. **Check what context you ACTUALLY need:**
```bash
python3 scripts/context_optimizer.py recommend "hi, how are you?"
# Shows: Only 2 files needed (not 50+!)
```
3. **Install optimized heartbeat:**
```bash
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
```
4. **Enforce cheaper models for casual chat:**
```bash
python3 scripts/model_router.py "thanks!"
# Single-provider Anthropic setup: Use Sonnet, not Opus
# Multi-provider setup (OpenRouter/Together): Use Haiku for max savings
```
5. **Check current token budget:**
```bash
python3 scripts/token_tracker.py check
```
**Expected savings:** 50-80% reduction in token costs for typical workloads (context optimization is the biggest factor!).
## Core Capabilities
### 0. Lazy Skill Loading (NEW in v3.0 — BIGGEST WIN!)
**The single highest-impact optimization available.** Most agents burn 3,000-15,000 tokens per session loading skill files they never use. Stop that first.
**The pattern:**
1. Create a lightweight `SKILLS.md` catalog in your workspace (~300 tokens — list of skills + when to load them)
2. Only load individual SKILL.md files when a task actually needs them
3. Apply the same logic to memory files — load MEMORY.md at startup, daily logs only on demand
**Token savings:**
| Library size | Before (eager) | After (lazy) | Savings |
|---|---|---|---|
| 5 skills | ~3,000 tokens | ~600 tokens | **80%** |
| 10 skills | ~6,500 tokens | ~750 tokens | **88%** |
| 20 skills | ~13,000 tokens | ~900 tokens | **93%** |
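The savings column follows directly from the token counts; a quick check (the token figures are the table's own approximations):

```python
# Verify the savings percentages from the table's eager/lazy token estimates.
cases = {5: (3_000, 600), 10: (6_500, 750), 20: (13_000, 900)}
for skills, (eager, lazy) in cases.items():
    savings = round((1 - lazy / eager) * 100)
    print(f"{skills} skills: {savings}% savings")
```

Note the lazy-side cost grows only with the catalog (index plus the one or two skills actually loaded), so the percentage improves as the library grows.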
**Quick implementation in AGENTS.md:**
```markdown
## Skills
At session start: Read SKILLS.md (the index only — ~300 tokens).
Load individual skill files ONLY when a task requires them.
Never load all skills upfront.
```
**Full implementation (with catalog template + optimizer script):**
```bash
clawhub install openclaw-skill-lazy-loader
```
The companion skill `openclaw-skill-lazy-loader` includes a `SKILLS.md.template`, an `AGENTS.md.template` lazy-loading section, and a `context_optimizer.py` CLI that recommends exactly which skills to load for any given task.
**Lazy loading handles context loading costs. The remaining capabilities below handle runtime costs.** Together they cover the full token lifecycle.
---
### 1. Context Optimization (NEW!)
**Biggest token saver** — Only load files you actually need, not everything upfront.
**Problem:** Default OpenClaw loads ALL context files every session:
- SOUL.md, AGENTS.md, USER.md, TOOLS.md, MEMORY.md
- docs/**/*.md (hundreds of files)
- memory/2026-*.md (daily logs)
- Total: Often 50K+ tokens before user even speaks!
**Solution:** Lazy loading based on prompt complexity.
**Usage:**
```bash
python3 scripts/context_optimizer.py recommend "<user prompt>"
```
**Examples:**
```bash
# Simple greeting → minimal context (2 files only!)
context_optimizer.py recommend "hi"
→ Load: SOUL.md, IDENTITY.md
→ Skip: Everything else
→ Savings: ~80% of context
# Standard work → selective loading
context_optimizer.py recommend "write a function"
→ Load: SOUL.md, IDENTITY.md, memory/TODAY.md
→ Skip: docs, old memory, knowledge base
→ Savings: ~50% of context
# Complex task → full context
context_optimizer.py recommend "analyze our entire architecture"
→ Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY+YESTERDAY.md
→ Conditionally load: Relevant docs only
→ Savings: ~30% of context
```
**Output format:**
```json
{
"complexity": "simple",
"context_level": "minimal",
"recommended_files": ["SOUL.md", "IDENTITY.md"],
"file_count": 2,
"savings_percent": 80,
"skip_patterns": ["docs/**/*.md", "memory/20*.md"]
}
```
**Integration pattern:**
Before loading context for a new session:
```python
from context_optimizer import recommend_context_bundle
user_prompt = "thanks for your help"
recommendation = recommend_context_bundle(user_prompt)
if recommendation["context_level"] == "minimal":
    # Load only SOUL.md + IDENTITY.md and skip everything else (~80% token savings)
    files_to_load = recommendation["recommended_files"]
```
**Generate optimized AGENTS.md:**
```bash
context_optimizer.py generate-agents
# Creates AGENTS.md.optimized with lazy loading instructions
# Review and replace your current AGENTS.md
```
**Expected savings:** 50-80% reduction in context tokens.
### 2. Smart Model Routing (ENHANCED!)
Automatically classify tasks and route to appropriate model tiers.
**NEW: Communication pattern enforcement** — Never waste Opus tokens on "hi" or "thanks"!
**Usage:**
```bash
python3 scripts/model_router.py "<user prompt>" [current_model] [force_tier]
```
**Examples:**
```bash
# Communication (NEW!) → ALWAYS Haiku
python3 scripts/model_router.py "thanks!"
python3 scripts/model_router.py "hi"
python3 scripts/model_router.py "ok got it"
→ Enforced: Haiku (NEVER Sonnet/Opus for casual chat)
# Simple task → suggests Haiku
python3 scripts/model_router.py "read the log file"
# Medium task → suggests Sonnet
python3 scripts/model_router.py "write a function to parse JSON"
# Complex task → suggests Opus
python3 scripts/model_router.py "design a microservices architecture"
```
**Patterns enforced to Haiku (NEVER Sonnet/Opus):**
*Communication:*
- Greetings: hi, hey, hello, yo
- Thanks: thanks, thank you, thx
- Acknowledgments: ok, sure, got it, understood
- Short responses: yes, no, yep, nope
- Single words or very short phrases
*Background tasks:*
- Heartbeat checks: "check email", "monitor servers"
- Cronjobs: "scheduled task", "periodic check", "reminder"
- Document parsing: "parse CSV", "extract data from log", "read JSON"
- Log scanning: "scan error logs", "process logs"
**Integration pattern:**
```python
from model_router import route_task
user_prompt = "show me the config"
routing = route_task(user_prompt)
if routing["should_switch"]:
    # Switch to the cheaper model and note the projected savings
    model = routing["recommended_model"]
    savings = routing["cost_savings_percent"]
```
**Customization:**
Edit `ROUTING_RULES` or `COMMUNICATION_PATTERNS` in `scripts/model_router.py` to adjust patterns and keywords.
### 3. Heartbeat Optimization
Reduce API calls from heartbeat polling with smart interval tracking:
**Setup:**
```bash
# Copy template to workspace
cp assets/HEARTBEAT.template.md ~/.openclaw/workspace/HEARTBEAT.md
# Plan which checks should run
python3 scripts/heartbeat_optimizer.py plan
```
**Commands:**
```bash
# Check if specific type should run now
heartbeat_optimizer.py check email
heartbeat_optimizer.py check calendar
# Record that a check was performed
heartbeat_optimizer.py record email
# Update check interval (seconds)
heartbeat_optimizer.py interval email 7200 # 2 hours
# Reset state
heartbeat_optimizer.py reset
```
**How it works:**
- Tracks last check time for each type (email, calendar, weather, etc.)
- Enforces minimum intervals before re-checking
- Respects quiet hours (23:00-08:00) — skips all checks
- Returns `HEARTBEAT_OK` when nothing needs attention (saves tokens)
**Default intervals:**
- Email: 60 minutes
- Calendar: 2 hours
- Weather: 4 hours
- Social: 2 hours
- Monitoring: 30 minutes
**Integration in HEARTBEAT.md:**
```markdown
## Email Check
Run only if: `heartbeat_optimizer.py check email` → `should_check: true`
After checking: `heartbeat_optimizer.py record email`
```
**Expected savings:** 50% reduction in heartbeat API calls.
**Model enforcement:** Heartbeat should ALWAYS use Haiku — see updated `HEARTBEAT.template.md` for model override instructions.
### 4. Cronjob Optimization (NEW!)
**Problem:** Cronjobs often default to expensive models (Sonnet/Opus) even for routine tasks.
**Solution:** Always specify Haiku for 90% of scheduled tasks.
**See:** `assets/cronjob-model-guide.md` for comprehensive guide with examples.
**Quick reference:**
| Task Type | Model | Example |
|-----------|-------|---------|
| Monitoring/alerts | Haiku | Check server health, disk space |
| Data parsing | Haiku | Extract CSV/JSON/logs |
| Reminders | Haiku | Daily standup, backup reminders |
| Simple reports | Haiku | Status summaries |
| Content generation | Sonnet | Blog summaries (quality matters) |
| Deep analysis | Sonnet | Weekly insights |
| Complex reasoning | Avoid Opus | Never use Opus for cronjobs |
**Example (good):**
```bash
# Parse daily logs with Haiku
cron add --schedule "0 2 * * *" \
--payload '{
"kind":"agentTurn",
"message":"Parse yesterday's error logs and summarize",
"model":"anthropic/claude-haiku-4"
}' \
--sessionTarget isolated
```
**Example (bad):**
```bash
# ❌ Using Opus for simple check (60x more expensive!)
cron add --schedule "*/15 * * * *" \
--payload '{
"kind":"agentTurn",
"message":"Check email",
"model":"anthropic/claude-opus-4"
}' \
--sessionTarget isolated
```
**Savings:** Using Haiku instead of Opus for 10 daily cronjobs = **$17.70/month saved per agent**.
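The "60x" figure in the bad example comes straight from list prices ($15.00 vs $0.25 per MTok of input, per the provider table below):

```python
opus_per_mtok = 15.00   # anthropic/claude-opus-4 input price, $/MTok
haiku_per_mtok = 0.25   # anthropic/claude-haiku-4 input price, $/MTok
ratio = opus_per_mtok / haiku_per_mtok  # 60x
```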
**Integration with model_router:**
```bash
# Test if your cronjob should use Haiku
model_router.py "parse daily error logs"
# → Output: Haiku (background task pattern detected)
```
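From Python, the same routing can gate cron payload construction. A sketch; it assumes `route_task` from `scripts/model_router.py` returns a `recommended_model` key, as in the integration pattern above:

```python
import json

def build_cron_payload(message, routing):
    """Build an agentTurn cron payload using the routed model (Haiku by default)."""
    model = routing.get("recommended_model", "anthropic/claude-haiku-4")
    return json.dumps({"kind": "agentTurn", "message": message, "model": model})

payload = build_cron_payload(
    "Parse yesterday's error logs and summarize",
    {"recommended_model": "anthropic/claude-haiku-4"},
)
```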
### 5. Token Budget Tracking
Monitor usage and alert when approaching limits:
**Setup:**
```bash
# Check current daily usage
python3 scripts/token_tracker.py check
# Get model suggestions
python3 scripts/token_tracker.py suggest general
# Reset daily tracking
python3 scripts/token_tracker.py reset
```
**Output format:**
```json
{
"date": "2026-02-06",
"cost": 2.50,
"tokens": 50000,
"limit": 5.00,
"percent_used": 50,
"status": "ok",
"alert": null
}
```
**Status levels:**
- `ok`: Below 80% of daily limit
- `warning`: 80-99% of daily limit
- `exceeded`: Over daily limit
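The bands reduce to a small comparison. A sketch matching the levels above; the default limit and the 0.8 warn threshold mirror the script's `daily_limit_usd` and `warn_threshold` parameters:

```python
def budget_status(cost_usd, daily_limit_usd=5.00, warn_threshold=0.8):
    """Classify daily spend against the limit: ok / warning / exceeded."""
    if cost_usd > daily_limit_usd:
        return "exceeded"
    if cost_usd >= daily_limit_usd * warn_threshold:
        return "warning"
    return "ok"
```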
**Integration pattern:**
Before starting expensive operations, check budget:
```python
import json
import subprocess

result = subprocess.run(
    ["python3", "scripts/token_tracker.py", "check"],
    capture_output=True, text=True
)
budget = json.loads(result.stdout)

use_model = None  # within budget: keep whatever the router recommended
if budget["status"] == "exceeded":
    # Switch to cheaper model or defer non-urgent work
    use_model = "anthropic/claude-haiku-4"
elif budget["status"] == "warning":
    # Use balanced model
    use_model = "anthropic/claude-sonnet-4-5"
```
**Customization:**
Edit the `daily_limit_usd` and `warn_threshold` parameters in `scripts/token_tracker.py` to match your budget.
### 6. Multi-Provider Strategy
See `references/PROVIDERS.md` for comprehensive guide on:
- Alternative providers (OpenRouter, Together.ai, Google AI Studio)
- Cost comparison tables
- Routing strategies by task complexity
- Fallback chains for rate-limited scenarios
- API key management
**Quick reference:**
| Provider | Model | Cost/MTok | Use Case |
|----------|-------|-----------|----------|
| Anthropic | Haiku 4 | $0.25 | Simple tasks |
| Anthropic | Sonnet 4.5 | $3.00 | Balanced default |
| Anthropic | Opus 4 | $15.00 | Complex reasoning |
| OpenRouter | Gemini 2.5 Flash | $0.075 | Bulk operations |
| Google AI | Gemini 2.0 Flash Exp | FREE | Dev/testing |
| Together | Llama 3.3 70B | $0.18 | Open alternative |
## Configuration Patches
See `assets/config-patches.json` for advanced optimizations:
**Implemented by this skill:**
- ✅ Heartbeat optimization (fully functional)
- ✅ Token budget tracking (fully functional)
- ✅ Model routing logic (fully functional)
**Native OpenClaw 2026.2.15 — apply directly:**
- ✅ Session pruning (`contextPruning: cache-ttl`) — auto-trims old tool results after Anthropic cache TTL expires
- ✅ Bootstrap size limits (`bootstrapMaxChars` / `bootstrapTotalMaxChars`) — caps workspace file injection size
- ✅ Cache retention long (`cacheRetention: "long"` for Opus) — amortizes cache write costs
**Requires OpenClaw core support:**
- ⏳ Prompt caching (Anthropic API feature — verify current status)
- ⏳ Lazy context loading (use `context_optimizer.py` script today)
- ⏳ Multi-provider fallback (partially supported)
**Apply config patches:**
```bash
# Example: Enable multi-provider fallback
gateway config.patch --patch '{"providers": [...]}'
```
## Native OpenClaw Diagnostics (2026.2.15+)
OpenClaw 2026.2.15 added built-in commands that complement this skill's Python scripts. Use these first for quick diagnostics before reaching for the scripts.
### Context breakdown
```
/context list → token count per injected file (shows exactly what's eating your prompt)
/context detail → full breakdown including tools, skills, and system prompt sections
```
**Use before applying `bootstrap_size_limits`** — see which files are oversized, then set `bootstrapMaxChars` accordingly.
### Per-response usage tracking
```
/usage tokens → append token count to every reply
/usage full → append tokens + cost estimate to every reply
/usage cost → show cumulative cost summary from session logs
/usage off → disable usage footer
```
**Combine with `token_tracker.py`** — `/usage cost` gives session totals; `token_tracker.py` tracks daily budget.
### Session status
```
/status → model, context %, last response tokens, estimated cost
```
---
## Cache TTL Heartbeat Alignment (NEW in v1.4.0)
**The problem:** Anthropic charges ~3.75x more for cache *writes* than cache *reads*. If your agent goes idle and the 1h cache TTL expires, the next request re-writes the entire prompt cache — expensive.
**The fix:** Set heartbeat interval to **55min** (just under the 1h TTL). The heartbeat keeps the cache warm, so every subsequent request pays cache-read rates instead.
```bash
# Get optimal interval for your cache TTL
python3 scripts/heartbeat_optimizer.py cache-ttl
# → recommended_interval: 55min (3300s)
# → explanation: keeps 1h Anthropic cache warm
# Custom TTL (e.g., if you've configured 2h cache)
python3 scripts/heartbeat_optimizer.py cache-ttl 7200
# → recommended_interval: 115min
```
**Apply to your OpenClaw config:**
```json
{
"agents": {
"defaults": {
"heartbeat": {
"every": "55m"
}
}
}
}
```
**Who benefits:** Anthropic API key users only. OAuth profiles already default to 1h heartbeat (OpenClaw smart default). API key profiles default to 30min — bumping to 55min is both cheaper (fewer calls) and cache-warm.
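The recommended interval is simply the cache TTL minus a safety buffer, matching the `cache-ttl` command's output:

```python
def cache_warm_interval(ttl_seconds=3600, buffer_seconds=300):
    """Heartbeat interval that fires just before the prompt cache TTL expires."""
    return ttl_seconds - buffer_seconds

default_interval = cache_warm_interval()     # 3300s = 55min for the default 1h TTL
custom_interval = cache_warm_interval(7200)  # 6900s = 115min for a 2h TTL
```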
---
## Deployment Patterns
### For Personal Use
1. Install optimized `HEARTBEAT.md`
2. Run budget checks before expensive operations
3. Manually route complex tasks to Opus only when needed
**Expected savings:** 20-30%
### For Managed Hosting (xCloud, etc.)
1. Default all agents to Haiku
2. Route user interactions to Sonnet
3. Reserve Opus for explicitly complex requests
4. Use Gemini Flash for background operations
5. Implement daily budget caps per customer
**Expected savings:** 40-60%
### For High-Volume Deployments
1. Use multi-provider fallback (OpenRouter + Together.ai)
2. Implement aggressive routing (80% Gemini, 15% Haiku, 5% Sonnet)
3. Deploy local Ollama for offline/cheap operations
4. Batch heartbeat checks (every 2-4 hours, not 30 min)
**Expected savings:** 70-90%
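One naive way to hit the 80/15/5 split from step 2 is a weighted pick; in practice you would route by task complexity (see `model_router.py`) and use the weights only as a fallback. Model identifiers here are illustrative assumptions:

```python
import random

# Target traffic split for high-volume deployments: 80% Gemini, 15% Haiku, 5% Sonnet
SPLIT = [
    ("google/gemini-2.5-flash", 0.80),
    ("anthropic/claude-haiku-4", 0.15),
    ("anthropic/claude-sonnet-4-5", 0.05),
]

def pick_model(rng=random.random):
    """Pick a model according to the weighted split."""
    r = rng()
    cumulative = 0.0
    for model, weight in SPLIT:
        cumulative += weight
        if r < cumulative:
            return model
    return SPLIT[-1][0]  # guard against floating-point residue
```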
## Integration Examples
### Workflow: Smart Task Handling
```bash
# 1. User sends message
user_msg="debug this error in the logs"
# 2. Route to appropriate model
routing=$(python3 scripts/model_router.py "$user_msg")
model=$(echo "$routing" | jq -r .recommended_model)
# 3. Check budget before proceeding
budget=$(python3 scripts/token_tracker.py check)
status=$(echo "$budget" | jq -r .status)
if [ "$status" = "exceeded" ]; then
# Use cheapest model regardless of routing
model="anthropic/claude-haiku-4"
fi
# 4. Process with selected model
# (OpenClaw handles this via config or override)
```
### Workflow: Optimized Heartbeat
```bash
# In HEARTBEAT.md: plan what to check
result=$(python3 scripts/heartbeat_optimizer.py plan)
should_run=$(echo "$result" | jq -r .should_run)

if [ "$should_run" = "false" ]; then
    echo "HEARTBEAT_OK"
    exit 0
fi

# Run only planned checks
planned=$(echo "$result" | jq -r '.planned[].type')
for check in $planned; do
    case $check in
        email) check_email ;;
        calendar) check_calendar ;;
    esac
    python3 scripts/heartbeat_optimizer.py record "$check"
done
```
## Troubleshooting
**Issue:** Scripts fail with "module not found"
- **Fix:** Ensure Python 3.7+ is installed. Scripts use only stdlib.
**Issue:** State files not persisting
- **Fix:** Check that `~/.openclaw/workspace/memory/` directory exists and is writable.
**Issue:** Budget tracking shows $0.00
- **Fix:** `token_tracker.py` needs integration with OpenClaw's `session_status` tool. Currently tracks manually recorded usage.
**Issue:** Routing suggests wrong model tier
- **Fix:** Customize `ROUTING_RULES` in `model_router.py` for your specific patterns.
## Maintenance
**Daily:**
- Check budget status: `token_tracker.py check`
**Weekly:**
- Review routing accuracy (are suggestions correct?)
- Adjust heartbeat intervals based on activity
**Monthly:**
- Compare costs before/after optimization
- Review and update `PROVIDERS.md` with new options
## Cost Estimation
**Example: 100K tokens/day workload**
Without skill:
- 50K context tokens + 50K conversation tokens = 100K total
- All Sonnet: 100K × $3/MTok = **$0.30/day = $9/month**
| Strategy | Context | Model | Daily Cost | Monthly | Savings |
|----------|---------|-------|-----------|---------|---------|
| Baseline (no optimization) | 50K | Sonnet | $0.30 | $9.00 | 0% |
| Context opt only | 10K (-80%) | Sonnet | $0.18 | $5.40 | 40% |
| Model routing only | 50K | Mixed | $0.18 | $5.40 | 40% |
| **Both (this skill)** | **10K** | **Mixed** | **$0.09** | **$2.70** | **70%** |
| Aggressive + Gemini | 10K | Gemini | $0.03 | $0.90 | **90%** |
**Key insight:** Context optimization (50K → 10K tokens) saves as much as model routing on its own, and the two compound to 70% when combined!
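The single-model rows in the table follow from straight per-token arithmetic:

```python
def daily_cost(context_tokens, convo_tokens, usd_per_mtok):
    """Daily input cost in USD for a given context + conversation load."""
    return (context_tokens + convo_tokens) * usd_per_mtok / 1_000_000

baseline = daily_cost(50_000, 50_000, 3.00)   # all-Sonnet baseline: $0.30/day
optimized = daily_cost(10_000, 50_000, 3.00)  # context trimmed to 10K: $0.18/day
```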
**xCloud hosting scenario** (100 customers, 50K tokens/customer/day):
- Baseline (all Sonnet, full context): $450/month
- With token-optimizer: $135/month
- **Savings: $315/month per 100 customers (70%)**
## Resources
### Scripts (4 total)
- **`context_optimizer.py`** — Context loading optimization and lazy loading (NEW!)
- **`model_router.py`** — Task classification, model suggestions, and communication enforcement (ENHANCED!)
- **`heartbeat_optimizer.py`** — Interval management and check scheduling
- **`token_tracker.py`** — Budget monitoring and alerts
### References
- `PROVIDERS.md` — Alternative AI providers, pricing, and routing strategies
### Assets (3 total)
- **`HEARTBEAT.template.md`** — Drop-in optimized heartbeat template with Haiku enforcement (ENHANCED!)
- **`cronjob-model-guide.md`** — Complete guide for choosing models in cronjobs (NEW!)
- **`config-patches.json`** — Advanced configuration examples
## Future Enhancements
Ideas for extending this skill:
1. **Auto-routing integration** — Hook into OpenClaw message pipeline
2. **Real-time usage tracking** — Parse session_status automatically
3. **Cost forecasting** — Predict monthly spend based on recent usage
4. **Provider health monitoring** — Track API latency and failures
5. **A/B testing** — Compare quality across different routing strategies

---

**`_meta.json`**
```json
{
  "ownerId": "kn78dz5s9fev1b0vez1kxz1fj580ckv2",
  "slug": "openclaw-token-optimizer",
  "version": "3.0.0",
  "publishedAt": 1772243871515
}
```

---

**`scripts/context_optimizer.py`**
#!/usr/bin/env python3
"""
Context optimizer - Analyze and minimize context loading.
Tracks which files are actually needed and creates minimal bundles.
"""
import json
import re
from pathlib import Path
from datetime import datetime, timedelta
STATE_FILE = Path.home() / ".openclaw/workspace/memory/context-usage.json"
# Files that should ALWAYS be loaded (identity/personality)
ALWAYS_LOAD = [
"SOUL.md",
"IDENTITY.md"
]
# Files to load on-demand based on triggers
CONDITIONAL_FILES = {
"AGENTS.md": ["workflow", "process", "how do i", "remember", "what should"],
"USER.md": ["user", "human", "owner", "about you", "who are you helping"],
"TOOLS.md": ["tool", "camera", "ssh", "voice", "tts", "device"],
"MEMORY.md": ["remember", "recall", "history", "past", "before", "last time"],
"HEARTBEAT.md": ["heartbeat", "check", "monitor", "alert"],
}
# Files to NEVER load for simple conversations
SKIP_FOR_SIMPLE = [
"docs/**/*.md", # Documentation
"memory/20*.md", # Old daily logs
"knowledge/**/*", # Knowledge base
"tasks/**/*", # Task tracking
]
def load_usage_state():
"""Load context usage tracking."""
if STATE_FILE.exists():
with open(STATE_FILE, 'r') as f:
return json.load(f)
return {
"file_access_count": {},
"last_accessed": {},
"session_summaries": []
}
def save_usage_state(state):
"""Save context usage tracking."""
STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
with open(STATE_FILE, 'w') as f:
json.dump(state, f, indent=2)
def classify_prompt(prompt):
"""Classify prompt to determine context needs.
Returns:
tuple of (complexity, context_level, reasoning)
complexity: simple | medium | complex
context_level: minimal | standard | full
"""
prompt_lower = prompt.lower()
# Simple conversational patterns (minimal context)
simple_patterns = [
r'^(hi|hey|hello|thanks|thank you|ok|okay|yes|no|sure)\b',
r'^(what|how)\'s (up|it going)',
r'^\w{1,20}$', # Single word
r'^(good|great|nice|cool)',
]
for pattern in simple_patterns:
if re.search(pattern, prompt_lower):
return ("simple", "minimal", "Conversational/greeting pattern")
# Check for file/documentation references
if any(word in prompt_lower for word in ["read", "show", "file", "doc", "content"]):
return ("simple", "standard", "File access request")
# Check for memory/history references
if any(word in prompt_lower for word in ["remember", "recall", "history", "before", "last time"]):
return ("medium", "full", "Memory access needed")
# Check for complex task indicators
complex_indicators = ["design", "architect", "plan", "strategy", "analyze deeply", "comprehensive"]
if any(word in prompt_lower for word in complex_indicators):
return ("complex", "full", "Complex task requiring full context")
# Default to standard for normal work requests
return ("medium", "standard", "Standard work request")
def recommend_context_bundle(prompt, current_files=None):
"""Recommend which files to load for a given prompt.
Args:
prompt: User's message
current_files: List of files currently loaded (optional)
Returns:
dict with recommendations
"""
complexity, context_level, reasoning = classify_prompt(prompt)
prompt_lower = prompt.lower()
# Start with always-load files
recommended = set(ALWAYS_LOAD)
if context_level == "minimal":
# For simple conversations, ONLY identity files
pass
elif context_level == "standard":
# Add conditionally-loaded files based on triggers
for file, triggers in CONDITIONAL_FILES.items():
if any(trigger in prompt_lower for trigger in triggers):
recommended.add(file)
# Add today's memory log only
today = datetime.now().strftime("%Y-%m-%d")
recommended.add(f"memory/{today}.md")
elif context_level == "full":
# Add all conditional files
recommended.update(CONDITIONAL_FILES.keys())
# Add today + yesterday memory logs
today = datetime.now()
yesterday = today - timedelta(days=1)
recommended.add(f"memory/{today.strftime('%Y-%m-%d')}.md")
recommended.add(f"memory/{yesterday.strftime('%Y-%m-%d')}.md")
# Add MEMORY.md for long-term context
recommended.add("MEMORY.md")
# Calculate savings
if current_files:
current_count = len(current_files)
recommended_count = len(recommended)
savings_percent = ((current_count - recommended_count) / current_count) * 100
else:
savings_percent = None
return {
"complexity": complexity,
"context_level": context_level,
"reasoning": reasoning,
"recommended_files": sorted(list(recommended)),
"file_count": len(recommended),
"savings_percent": savings_percent,
"skip_patterns": SKIP_FOR_SIMPLE if context_level == "minimal" else []
}
def record_file_access(file_path):
"""Record that a file was accessed."""
state = load_usage_state()
# Increment access count
state["file_access_count"][file_path] = state["file_access_count"].get(file_path, 0) + 1
# Update last accessed timestamp
state["last_accessed"][file_path] = datetime.now().isoformat()
save_usage_state(state)
def get_usage_stats():
"""Get file usage statistics.
Returns:
dict with frequently/rarely accessed files
"""
state = load_usage_state()
# Sort by access count
sorted_files = sorted(
state["file_access_count"].items(),
key=lambda x: x[1],
reverse=True
)
total_accesses = sum(state["file_access_count"].values())
# Classify files
frequent = [] # Top 20% of accesses
occasional = [] # Middle 60%
rare = [] # Bottom 20%
if sorted_files:
threshold_frequent = total_accesses * 0.2
threshold_rare = total_accesses * 0.8
cumulative = 0
for file, count in sorted_files:
cumulative += count
if cumulative <= threshold_frequent:
frequent.append({"file": file, "count": count})
elif cumulative <= threshold_rare:
occasional.append({"file": file, "count": count})
else:
rare.append({"file": file, "count": count})
return {
"total_accesses": total_accesses,
"unique_files": len(sorted_files),
"frequent": frequent,
"occasional": occasional,
"rare": rare,
        "recommendation": "Load frequently accessed files upfront; lazy-load rarely used files"
}
def generate_optimized_agents_md():
"""Generate an optimized AGENTS.md with lazy loading instructions.
Returns:
str with new AGENTS.md content
"""
return """# AGENTS.md - Token-Optimized Workspace
## 🎯 Context Loading Strategy (OPTIMIZED)
**Default: Minimal context, load on-demand**
### Every Session (Always Load)
1. Read `SOUL.md` — Who you are (identity/personality)
2. Read `IDENTITY.md` — Your role/name
**Stop there.** Don't load anything else unless needed.
### Load On-Demand Only
**When user mentions memory/history:**
- Read `MEMORY.md`
- Read `memory/YYYY-MM-DD.md` (today only)
**When user asks about workflows/processes:**
- Read `AGENTS.md` (this file)
**When user asks about tools/devices:**
- Read `TOOLS.md`
**When user asks about themselves:**
- Read `USER.md`
**Never load automatically:**
- ❌ Documentation (`docs/**/*.md`) — load only when explicitly referenced
- ❌ Old memory logs (`memory/2026-01-*.md`) — load only if user mentions date
- ❌ Knowledge base (`knowledge/**/*`) — load only when user asks about specific topic
- ❌ Task files (`tasks/**/*`) — load only when user references task
### Context by Conversation Type
**Simple conversation** (hi, thanks, yes, quick question):
- Load: SOUL.md, IDENTITY.md
- Skip: Everything else
- **Token savings: ~80%**
**Standard work request** (write code, check file):
- Load: SOUL.md, IDENTITY.md, memory/TODAY.md
- Conditionally load: TOOLS.md (if mentions tools)
- Skip: docs, old memory logs
- **Token savings: ~50%**
**Complex task** (design system, analyze history):
- Load: SOUL.md, IDENTITY.md, MEMORY.md, memory/TODAY.md, memory/YESTERDAY.md
- Conditionally load: Relevant docs/knowledge
- Skip: Unrelated documentation
- **Token savings: ~30%**
## 🔥 Model Selection (ENFORCED)
**Simple conversations → HAIKU ONLY**
- Greetings, acknowledgments, simple questions
- Never use Sonnet/Opus for casual chat
- Override: `session_status model=haiku-4`
**Standard work → SONNET**
- Code writing, file edits, explanations
- Default model for most work
**Complex reasoning → OPUS**
- Architecture design, deep analysis
- Use sparingly, only when explicitly needed
## 💾 Memory (Lazy Loading)
**Daily notes:** `memory/YYYY-MM-DD.md`
- ✅ Load TODAY when user asks about recent work
- ❌ Don't load YESTERDAY unless explicitly needed
- ❌ Don't load older logs automatically
**Long-term:** `MEMORY.md`
- ✅ Load when user mentions "remember", "history", "before"
- ❌ Don't load for simple conversations
## 📊 Heartbeats (Optimized)
Use `heartbeat_optimizer.py` from token-optimizer skill:
- Check only what needs checking (not everything every time)
- Skip during quiet hours (23:00-08:00)
- Return `HEARTBEAT_OK` when nothing to report
## 🎨 Skills (Lazy Loading)
**Don't pre-read skill documentation.**
When skill triggers:
1. Read only the SKILL.md
2. Read only the specific reference files you need
3. Skip examples/assets unless explicitly needed
## 🚫 Anti-Patterns (What NOT to Do)
❌ Loading all docs at session start
❌ Re-reading unchanged files
❌ Using Opus for simple chat
❌ Checking everything in every heartbeat
❌ Loading full conversation history for simple questions
✅ Load minimal context by default
✅ Read files only when referenced
✅ Use cheapest model for the task
✅ Batch heartbeat checks intelligently
✅ Keep context focused on current task
## 📈 Monitoring
Track your savings:
```bash
python3 scripts/context_optimizer.py stats
python3 scripts/token_tracker.py check
```
## Integration
Run context optimizer before responding:
```bash
# Get recommendations
context_optimizer.py recommend "<user prompt>"
# Only load recommended files
# Skip everything else
```
---
**This optimized approach reduces token usage by 50-80% for typical workloads.**
"""
def main():
"""CLI interface for context optimizer."""
import sys
if len(sys.argv) < 2:
print("Usage: context_optimizer.py [recommend|record|stats|generate-agents]")
sys.exit(1)
command = sys.argv[1]
if command == "recommend":
if len(sys.argv) < 3:
print("Usage: context_optimizer.py recommend '<prompt>' [current_files]")
sys.exit(1)
prompt = sys.argv[2]
current_files = sys.argv[3:] if len(sys.argv) > 3 else None
result = recommend_context_bundle(prompt, current_files)
print(json.dumps(result, indent=2))
elif command == "record":
if len(sys.argv) < 3:
print("Usage: context_optimizer.py record <file_path>")
sys.exit(1)
file_path = sys.argv[2]
record_file_access(file_path)
print(f"Recorded access: {file_path}")
elif command == "stats":
result = get_usage_stats()
print(json.dumps(result, indent=2))
elif command == "generate-agents":
content = generate_optimized_agents_md()
output_path = Path.home() / ".openclaw/workspace/AGENTS.md.optimized"
output_path.write_text(content)
print(f"Generated optimized AGENTS.md at: {output_path}")
print("\nReview and replace your current AGENTS.md with this version.")
else:
print(f"Unknown command: {command}")
sys.exit(1)
if __name__ == "__main__":
main()

---

**`scripts/heartbeat_optimizer.py`**
#!/usr/bin/env python3
"""
Heartbeat optimizer - Manages efficient heartbeat intervals and batched checks.
Reduces API calls by tracking check timestamps and batching operations.
v1.4.0: Added cache-ttl alignment — recommends 55min intervals to keep
Anthropic's 1h prompt cache warm between heartbeats (avoids cache re-write penalty).
"""
import json
from datetime import datetime, timedelta
from pathlib import Path
STATE_FILE = Path.home() / ".openclaw/workspace/memory/heartbeat-state.json"
# Optimal interval to keep Anthropic's 1h prompt cache warm.
# Set just under 1h so the cache never expires between heartbeats.
# Anthropic API key users should use this as their default heartbeat interval.
CACHE_TTL_OPTIMAL_INTERVAL = 3300 # 55 minutes in seconds
CACHE_TTL_WINDOW = 3600 # Anthropic default cache TTL = 1 hour
DEFAULT_INTERVALS = {
"email": 3600, # 1 hour
"calendar": 7200, # 2 hours
"weather": 14400, # 4 hours
"social": 7200, # 2 hours
"monitoring": 1800 # 30 minutes
}
QUIET_HOURS = {
"start": 23, # 11 PM
"end": 8 # 8 AM
}
def load_state():
"""Load heartbeat tracking state."""
if STATE_FILE.exists():
with open(STATE_FILE, 'r') as f:
return json.load(f)
return {
"lastChecks": {},
"intervals": DEFAULT_INTERVALS.copy(),
"skipCount": 0
}
def save_state(state):
"""Save heartbeat tracking state."""
STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
with open(STATE_FILE, 'w') as f:
json.dump(state, f, indent=2)
def is_quiet_hours(hour=None):
"""Check if current time is during quiet hours."""
if hour is None:
hour = datetime.now().hour
start = QUIET_HOURS["start"]
end = QUIET_HOURS["end"]
if start > end: # Wraps midnight
return hour >= start or hour < end
else:
return start <= hour < end
def should_check(check_type, force=False):
"""Determine if a check should run based on interval.
Args:
check_type: Type of check (email, calendar, etc.)
force: Force check regardless of interval
Returns:
dict with decision and reasoning
"""
if force:
return {
"should_check": True,
"reason": "Forced check",
"next_check": None
}
# Skip all checks during quiet hours
if is_quiet_hours():
return {
"should_check": False,
"reason": "Quiet hours (23:00-08:00)",
"next_check": "08:00"
}
state = load_state()
now = datetime.now()
# Get last check time
last_check_ts = state["lastChecks"].get(check_type)
if not last_check_ts:
# Never checked before
return {
"should_check": True,
"reason": "First check",
"next_check": None
}
last_check = datetime.fromisoformat(last_check_ts)
interval = state["intervals"].get(check_type, DEFAULT_INTERVALS.get(check_type, 3600))
next_check = last_check + timedelta(seconds=interval)
if now >= next_check:
return {
"should_check": True,
"reason": f"Interval elapsed ({interval}s)",
"next_check": None
}
else:
remaining = (next_check - now).total_seconds()
return {
"should_check": False,
"reason": f"Too soon ({int(remaining / 60)}min remaining)",
"next_check": next_check.strftime("%H:%M")
}
def record_check(check_type):
"""Record that a check was performed."""
state = load_state()
state["lastChecks"][check_type] = datetime.now().isoformat()
save_state(state)
def plan_heartbeat(checks=None):
"""Plan which checks should run in next heartbeat.
Args:
checks: List of check types to consider (default: all)
Returns:
dict with planned checks and skip decision
"""
if checks is None:
checks = list(DEFAULT_INTERVALS.keys())
planned = []
skipped = []
for check in checks:
decision = should_check(check)
if decision["should_check"]:
planned.append({
"type": check,
"reason": decision["reason"]
})
else:
skipped.append({
"type": check,
"reason": decision["reason"],
"next_check": decision["next_check"]
})
result = {
"planned": planned,
"skipped": skipped,
"should_run": len(planned) > 0,
"can_skip": len(planned) == 0
}
# Add cache TTL alignment recommendation
result["cache_ttl_tip"] = (
"Tip: Set your OpenClaw heartbeat interval to 55min (3300s) "
"to keep the Anthropic 1h prompt cache warm. "
"Run: heartbeat_optimizer.py cache-ttl for details."
)
return result
def get_cache_ttl_recommendation(cache_ttl_seconds=None):
"""Calculate optimal heartbeat interval for Anthropic cache TTL warmup.
Anthropic prompt caching has a 1h TTL by default on API key profiles.
Setting heartbeat interval just under the TTL prevents the cache from
expiring between heartbeats — avoiding the cache re-write penalty.
Args:
cache_ttl_seconds: Your cache TTL in seconds (default: 3600 = 1h)
Returns:
dict with recommended interval and explanation
"""
if cache_ttl_seconds is None:
cache_ttl_seconds = CACHE_TTL_WINDOW
    # Subtract a fixed 5-minute buffer so the heartbeat fires before the TTL expires
    buffer_seconds = 300
recommended = cache_ttl_seconds - buffer_seconds
return {
"cache_ttl_seconds": cache_ttl_seconds,
"cache_ttl_human": f"{cache_ttl_seconds // 60}min",
"recommended_interval_seconds": recommended,
"recommended_interval_human": f"{recommended // 60}min",
"buffer_seconds": buffer_seconds,
"explanation": (
f"With a {cache_ttl_seconds // 60}min Anthropic cache TTL, set your heartbeat "
f"to {recommended // 60}min ({recommended}s). This keeps the prompt cache warm "
f"between heartbeats — preventing the cache re-write penalty when the TTL expires."
),
"how_to_configure": (
"In openclaw.json: agents.defaults.heartbeat.every = \"55m\"\n"
"Or use the config patch from assets/config-patches.json (heartbeat_optimization)"
),
"cost_impact": (
"Cache writes cost ~3.75x more than cache reads (Anthropic pricing). "
"Without warmup: every heartbeat after an idle hour triggers a full cache re-write. "
"With warmup: cache reads only — significantly cheaper for long-running agents."
),
"note": (
"This applies to Anthropic API key users only. "
"OAuth profiles use a 1h heartbeat by default (OpenClaw smart default). "
"API key profiles default to 30min heartbeat — consider bumping to 55min."
)
}
def update_interval(check_type, new_interval_seconds):
"""Update check interval for a specific check type.
Args:
check_type: Type of check
new_interval_seconds: New interval in seconds
"""
    state = load_state()
    old_interval = state["intervals"].get(check_type, DEFAULT_INTERVALS.get(check_type))
    state["intervals"][check_type] = new_interval_seconds
    save_state(state)
    return {
        "check_type": check_type,
        "old_interval": old_interval,
        "new_interval": new_interval_seconds
    }
def main():
"""CLI interface for heartbeat optimizer."""
import sys
if len(sys.argv) < 2:
print("Usage: heartbeat_optimizer.py [plan|check|record|interval|cache-ttl|reset]")
sys.exit(1)
command = sys.argv[1]
if command == "plan":
# Plan next heartbeat
checks = sys.argv[2:] if len(sys.argv) > 2 else None
result = plan_heartbeat(checks)
print(json.dumps(result, indent=2))
elif command == "check":
# Check if specific type should run
if len(sys.argv) < 3:
print("Usage: heartbeat_optimizer.py check <type>")
sys.exit(1)
check_type = sys.argv[2]
force = len(sys.argv) > 3 and sys.argv[3] == "--force"
result = should_check(check_type, force)
print(json.dumps(result, indent=2))
elif command == "record":
# Record that a check was performed
if len(sys.argv) < 3:
print("Usage: heartbeat_optimizer.py record <type>")
sys.exit(1)
check_type = sys.argv[2]
record_check(check_type)
print(f"Recorded check: {check_type}")
elif command == "interval":
# Update interval
if len(sys.argv) < 4:
print("Usage: heartbeat_optimizer.py interval <type> <seconds>")
sys.exit(1)
check_type = sys.argv[2]
interval = int(sys.argv[3])
result = update_interval(check_type, interval)
print(json.dumps(result, indent=2))
elif command == "cache-ttl":
# Show cache TTL alignment recommendation
cache_ttl = int(sys.argv[2]) if len(sys.argv) > 2 else None
result = get_cache_ttl_recommendation(cache_ttl)
print(json.dumps(result, indent=2))
elif command == "reset":
# Reset state
if STATE_FILE.exists():
STATE_FILE.unlink()
print("Heartbeat state reset.")
else:
print(f"Unknown command: {command}")
print("Available: plan | check <type> | record <type> | interval <type> <seconds> | cache-ttl [ttl_seconds] | reset")
sys.exit(1)
if __name__ == "__main__":
main()

---

**`scripts/model_router.py`**
#!/usr/bin/env python3
"""
Smart model router - routes tasks to appropriate models based on complexity.
Supports multiple providers: Anthropic, OpenAI, Google, OpenRouter.
Helps reduce token costs by using cheaper models for simpler tasks.
Version: 1.1.0
"""
import re
import os
import json
# ============================================================================
# PROVIDER CONFIGURATION
# ============================================================================
# Detect primary provider from environment (default: anthropic)
def detect_provider():
"""Detect which provider to use based on available API keys."""
if os.environ.get("ANTHROPIC_API_KEY"):
return "anthropic"
elif os.environ.get("OPENAI_API_KEY"):
return "openai"
elif os.environ.get("GOOGLE_API_KEY"):
return "google"
elif os.environ.get("OPENROUTER_API_KEY"):
return "openrouter"
# Default to anthropic
return "anthropic"
# Model tiers per provider
PROVIDER_MODELS = {
"anthropic": {
"cheap": "anthropic/claude-haiku-4",
"balanced": "anthropic/claude-sonnet-4-5",
"smart": "anthropic/claude-opus-4",
"costs": { # $/MTok (input)
"cheap": 0.25,
"balanced": 3.00,
"smart": 15.00
}
},
"openai": {
"cheap": "openai/gpt-4.1-nano",
"balanced": "openai/gpt-4.1-mini",
"smart": "openai/gpt-4.1",
"premium": "openai/gpt-5",
"costs": {
"cheap": 0.10,
"balanced": 0.40,
"smart": 2.00,
"premium": 10.00
}
},
"google": {
"cheap": "google/gemini-2.0-flash",
"balanced": "google/gemini-2.5-flash",
"smart": "google/gemini-2.5-pro",
"costs": {
"cheap": 0.075,
"balanced": 0.15,
"smart": 1.25
}
},
"openrouter": {
"cheap": "google/gemini-2.0-flash",
"balanced": "anthropic/claude-sonnet-4-5",
"smart": "anthropic/claude-opus-4",
"costs": {
"cheap": 0.075,
"balanced": 3.00,
"smart": 15.00
}
}
}
# Tier mapping for cross-provider compatibility
TIER_ALIASES = {
"haiku": "cheap",
"sonnet": "balanced",
"opus": "smart",
"nano": "cheap",
"mini": "balanced",
"flash": "cheap",
"pro": "smart"
}
# ============================================================================
# TASK CLASSIFICATION PATTERNS
# ============================================================================
# Communication patterns that should ALWAYS use cheap tier (never balanced/smart)
COMMUNICATION_PATTERNS = [
r'^(hi|hey|hello|yo|sup)\b',
r'^(thanks|thank you|thx)\b',
r'^(ok|okay|sure|got it|understood)\b',
r'^(yes|yeah|yep|yup|no|nope)\b',
r'^(good|great|nice|cool|awesome)\b',
r"^(what|how)'s (up|it going)",
r'^\w{1,15}$', # Single short word
r'^(lol|haha|lmao)\b',
]
# Background/routine tasks that should ALWAYS use cheap tier
BACKGROUND_TASK_PATTERNS = [
# Heartbeat checks
r'heartbeat',
r'check\s+(email|calendar|weather|monitoring)',
r'monitor\s+',
r'poll\s+',
# Cronjob/scheduled tasks
r'cron',
r'scheduled\s+task',
r'periodic\s+check',
r'reminder',
# Document parsing/extraction
r'parse\s+(document|file|log|csv|json|xml)',
r'extract\s+(text|data|content)\s+from',
r'read\s+(log|logs)',
r'scan\s+(file|document)',
r'process\s+(csv|json|xml|yaml)',
]
# Model routing rules with tier-based approach
ROUTING_RULES = {
"cheap": {
"patterns": [
r"read\s+file",
r"list\s+files",
r"show\s+(me\s+)?the\s+contents?",
r"what's\s+in",
r"cat\s+",
r"get\s+status",
r"check\s+(if|whether)",
r"is\s+\w+\s+(running|active|enabled)"
],
"keywords": ["read", "list", "show", "status", "check", "get"],
"cost_multiplier": 0.083 # vs balanced
},
"balanced": {
"patterns": [
r"write\s+\w+",
r"create\s+\w+",
r"edit\s+\w+",
r"fix\s+\w+",
r"debug\s+\w+",
r"explain\s+\w+",
r"how\s+(do|can)\s+i"
],
"keywords": ["write", "create", "edit", "update", "fix", "debug", "explain"],
"cost_multiplier": 1.0
},
"smart": {
"patterns": [
r"complex\s+\w+",
r"design\s+\w+",
r"architect\w+",
r"analyze\s+deeply",
r"comprehensive\s+\w+"
],
"keywords": ["design", "architect", "complex", "comprehensive", "deep"],
"cost_multiplier": 5.0
}
}
# Legacy tier names for backwards compatibility (a subset of TIER_ALIASES,
# kept separate so the legacy mapping stays explicit)
LEGACY_TIER_MAP = {
"haiku": "cheap",
"sonnet": "balanced",
"opus": "smart"
}
# ============================================================================
# CORE FUNCTIONS
# ============================================================================
def classify_task(prompt):
"""Classify task complexity based on prompt text.
Args:
prompt: User's message/request
Returns:
tuple of (tier, confidence, reasoning)
tier is one of: cheap, balanced, smart
"""
prompt_lower = prompt.lower()
# FIRST: Check if this is simple communication (ALWAYS cheap)
for pattern in COMMUNICATION_PATTERNS:
if re.search(pattern, prompt_lower):
return ("cheap", 1.0, "Simple communication - use cheapest model")
# SECOND: Check if this is a background/routine task (ALWAYS cheap)
for pattern in BACKGROUND_TASK_PATTERNS:
if re.search(pattern, prompt_lower):
return ("cheap", 1.0, "Background task (heartbeat/cron/parsing) - use cheapest model")
# Score each tier
scores = {}
for tier, rules in ROUTING_RULES.items():
score = 0
matches = []
# Pattern matching
for pattern in rules["patterns"]:
if re.search(pattern, prompt_lower):
score += 2
matches.append(f"pattern:{pattern}")
# Keyword matching
for keyword in rules["keywords"]:
if keyword in prompt_lower:
score += 1
matches.append(f"keyword:{keyword}")
scores[tier] = {
"score": score,
"matches": matches
}
# Determine best tier
best_tier = max(scores.items(), key=lambda x: x[1]["score"])
if best_tier[1]["score"] == 0:
# Default to balanced if unclear
return ("balanced", 0.5, "No clear indicators, defaulting to balanced model")
confidence = min(best_tier[1]["score"] / 5.0, 1.0) # Cap at 1.0
reasoning = f"Matched: {', '.join(best_tier[1]['matches'][:3])}"
return (best_tier[0], confidence, reasoning)
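The scoring loop above weights pattern hits twice as heavily as keyword hits. A minimal reproduction with one tier's rules (abridged from ROUTING_RULES["cheap"] above):

```python
import re

# One tier's rules, abridged from ROUTING_RULES["cheap"] above.
rules = {"patterns": [r"read\s+file"], "keywords": ["read", "show"]}
prompt = "read file config.json and show it".lower()

score = sum(2 for p in rules["patterns"] if re.search(p, prompt))  # +2 per pattern
score += sum(1 for k in rules["keywords"] if k in prompt)          # +1 per keyword
print(score)  # 2 (pattern) + 1 ("read") + 1 ("show") = 4
```

With this score, classify_task would report confidence min(4 / 5.0, 1.0) = 0.8.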
def normalize_tier(tier):
"""Normalize tier name to standard format (cheap/balanced/smart)."""
tier_lower = tier.lower()
# Check legacy mappings
if tier_lower in LEGACY_TIER_MAP:
return LEGACY_TIER_MAP[tier_lower]
# Check aliases
if tier_lower in TIER_ALIASES:
return TIER_ALIASES[tier_lower]
# Already standard or unknown
if tier_lower in ["cheap", "balanced", "smart", "premium"]:
return tier_lower
return "balanced" # Default
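A condensed sketch of the fallback chain in normalize_tier — legacy names first, then the standard set, then the balanced default (`normalize_sketch` is a stand-in name):

```python
# Stand-in for normalize_tier: legacy map, then standard names, then default.
LEGACY = {"haiku": "cheap", "sonnet": "balanced", "opus": "smart"}

def normalize_sketch(tier):
    t = tier.lower()
    if t in LEGACY:                                     # legacy names first
        return LEGACY[t]
    if t in ("cheap", "balanced", "smart", "premium"):  # already standard
        return t
    return "balanced"                                   # unknown -> default

print(normalize_sketch("Opus"))      # smart
print(normalize_sketch("whatever"))  # balanced
```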
def get_model_for_tier(tier, provider=None):
"""Get the specific model name for a tier and provider.
Args:
tier: cheap, balanced, smart, or premium
provider: anthropic, openai, google, openrouter (or None to auto-detect)
Returns:
Model identifier string
"""
if provider is None:
provider = detect_provider()
provider_config = PROVIDER_MODELS.get(provider, PROVIDER_MODELS["anthropic"])
# Normalize tier
tier = normalize_tier(tier)
# Get model (fallback to balanced if tier not available)
return provider_config.get(tier, provider_config.get("balanced"))
def route_task(prompt, current_model=None, force_tier=None, provider=None):
"""Route a task to appropriate model.
Args:
prompt: User's message/request
current_model: Current model being used (optional)
force_tier: Override classification (cheap/balanced/smart or haiku/sonnet/opus)
provider: Force specific provider (anthropic/openai/google/openrouter)
Returns:
dict with routing decision
"""
# Auto-detect provider if not specified
if provider is None:
provider = detect_provider()
# Set default current model
if current_model is None:
current_model = get_model_for_tier("balanced", provider)
if force_tier:
tier = normalize_tier(force_tier)
confidence = 1.0
reasoning = "User-specified tier"
else:
tier, confidence, reasoning = classify_task(prompt)
recommended_model = get_model_for_tier(tier, provider)
# Calculate cost savings
provider_config = PROVIDER_MODELS.get(provider, PROVIDER_MODELS["anthropic"])
base_cost = provider_config["costs"].get("balanced", 1.0)
tier_cost = provider_config["costs"].get(tier, base_cost)
cost_savings = (1.0 - (tier_cost / base_cost)) * 100
return {
"provider": provider,
"current_model": current_model,
"recommended_model": recommended_model,
"tier": tier,
"tier_display": {
"cheap": "Cheap (Haiku/Nano/Flash)",
"balanced": "Balanced (Sonnet/Mini/Flash)",
"smart": "Smart (Opus/GPT-4.1/Pro)",
"premium": "Premium (GPT-5)"
}.get(tier, tier),
"confidence": confidence,
"reasoning": reasoning,
"cost_savings_percent": max(0, cost_savings),
"should_switch": recommended_model != current_model,
"all_providers": {
p: get_model_for_tier(tier, p) for p in PROVIDER_MODELS.keys()
}
}
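The cost_savings_percent arithmetic above, worked for the anthropic tier costs listed in PROVIDER_MODELS (cheap $0.25 vs balanced $3.00 per MTok input):

```python
base_cost = 3.00   # anthropic "balanced", $/MTok input (from PROVIDER_MODELS)
tier_cost = 0.25   # anthropic "cheap"
cost_savings = (1.0 - (tier_cost / base_cost)) * 100
print(round(cost_savings, 1))  # 91.7 -> routing to cheap saves ~92% vs balanced
```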
def get_model_comparison():
"""Get a comparison of all models across providers.
Returns:
dict with provider -> tier -> model mapping
"""
result = {}
for provider, config in PROVIDER_MODELS.items():
result[provider] = {
tier: {
"model": model,
"cost_per_mtok": config["costs"].get(tier, "N/A")
}
for tier, model in config.items()
if tier != "costs"
}
return result
# ============================================================================
# CLI INTERFACE
# ============================================================================
def main():
"""CLI interface for model router."""
import sys
if len(sys.argv) < 2:
print("Usage: model_router.py <command> [args]")
print("")
print("Commands:")
print(" route '<prompt>' [current_model] [--tier <tier>] [--provider <provider>]")
print(" compare — Show all models across providers")
print(" providers — List available providers")
print(" detect — Show auto-detected provider")
print("")
print("Examples:")
print(" model_router.py route 'thanks!'")
print(" model_router.py route 'design an architecture' --provider openai")
print(" model_router.py compare")
sys.exit(1)
command = sys.argv[1]
# Known subcommands; any other first argument is treated as a shorthand prompt,
# so the routing branch below doubles as the fallback
known_commands = ["route", "compare", "providers", "detect"]
if command == "route" or command not in known_commands:
# Route a prompt (either explicit "route" command or shorthand)
if command == "route":
if len(sys.argv) < 3:
print("Usage: model_router.py route '<prompt>'")
sys.exit(1)
prompt = sys.argv[2]
start_idx = 3
else:
# Shorthand: first arg is the prompt
prompt = command
start_idx = 2
# Parse remaining args
current_model = None
force_tier = None
provider = None
i = start_idx
while i < len(sys.argv):
arg = sys.argv[i]
if arg.startswith("--provider="):
provider = arg.split("=")[1]
elif arg.startswith("--tier="):
force_tier = arg.split("=")[1]
elif arg == "--provider" and i+1 < len(sys.argv):
provider = sys.argv[i+1]
i += 1
elif arg == "--tier" and i+1 < len(sys.argv):
force_tier = sys.argv[i+1]
i += 1
elif arg.startswith("--"):
pass # Skip unknown flags
elif current_model is None and "/" in arg:
current_model = arg
elif force_tier is None:
force_tier = arg
i += 1
result = route_task(prompt, current_model, force_tier, provider)
print(json.dumps(result, indent=2))
elif command == "compare":
result = get_model_comparison()
print(json.dumps(result, indent=2))
elif command == "providers":
print("Available providers:")
for provider in PROVIDER_MODELS.keys():
detected = " (detected)" if provider == detect_provider() else ""
print(f" - {provider}{detected}")
elif command == "detect":
provider = detect_provider()
print(f"Auto-detected provider: {provider}")
print(f"Models: {json.dumps(PROVIDER_MODELS[provider], indent=2)}")
if __name__ == "__main__":
main()

67
scripts/optimize.sh Normal file
View File

@@ -0,0 +1,67 @@
#!/bin/bash
# optimize.sh - Quick CLI wrapper for token optimization tools
#
# Usage:
# ./optimize.sh route "your prompt here" # Route to appropriate model
# ./optimize.sh context # Generate optimized AGENTS.md
# ./optimize.sh recommend "prompt" # Recommend context files
# ./optimize.sh budget # Check token budget
# ./optimize.sh heartbeat # Install optimized heartbeat
#
# Examples:
# ./optimize.sh route "thanks!" # → cheap tier (Haiku)
# ./optimize.sh route "design an API" # → smart tier (Opus)
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
case "$1" in
route|model)
shift
python3 "$SCRIPT_DIR/model_router.py" "$@"
;;
context|agents)
python3 "$SCRIPT_DIR/context_optimizer.py" generate-agents
;;
recommend|ctx)
shift
python3 "$SCRIPT_DIR/context_optimizer.py" recommend "$@"
;;
budget|tokens|check)
python3 "$SCRIPT_DIR/token_tracker.py" check
;;
heartbeat|hb)
DEST="${HOME}/.openclaw/workspace/HEARTBEAT.md"
mkdir -p "$(dirname "$DEST")"
cp "$SCRIPT_DIR/../assets/HEARTBEAT.template.md" "$DEST"
echo "✅ Installed optimized heartbeat to: $DEST"
;;
providers)
python3 "$SCRIPT_DIR/model_router.py" providers
;;
detect)
python3 "$SCRIPT_DIR/model_router.py" detect
;;
help|--help|-h|"")
echo "Token Optimizer CLI"
echo ""
echo "Usage: ./optimize.sh <command> [args]"
echo ""
echo "Commands:"
echo " route <prompt> Route prompt to appropriate model tier"
echo " context Generate optimized AGENTS.md"
echo " recommend <prompt> Recommend context files for prompt"
echo " budget Check current token budget"
echo " heartbeat Install optimized heartbeat"
echo " providers List available providers"
echo " detect Show auto-detected provider"
echo ""
echo "Examples:"
echo " ./optimize.sh route 'thanks!' # → cheap tier"
echo " ./optimize.sh route 'design an API' # → smart tier"
echo " ./optimize.sh budget # → current usage"
;;
*)
echo "Unknown command: $1"
echo "Run './optimize.sh help' for usage"
exit 1
;;
esac

156
scripts/token_tracker.py Normal file
View File

@@ -0,0 +1,156 @@
#!/usr/bin/env python3
"""
Token usage tracker with budget alerts.
Monitors API usage and warns when approaching limits.
"""
import json
import os
from datetime import datetime, timedelta
from pathlib import Path
STATE_FILE = Path.home() / ".openclaw/workspace/memory/token-tracker-state.json"
def load_state():
"""Load tracking state from file."""
if STATE_FILE.exists():
with open(STATE_FILE, 'r') as f:
return json.load(f)
return {
"daily_usage": {},
"alerts_sent": [],
"last_reset": datetime.now().isoformat()
}
def save_state(state):
"""Save tracking state to file."""
STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
with open(STATE_FILE, 'w') as f:
json.dump(state, f, indent=2)
def get_usage_from_session_status():
"""Parse session status to extract token usage.
Returns dict with input_tokens, output_tokens, and cost.
"""
# This would integrate with OpenClaw's session_status tool
# For now, returns placeholder structure
return {
"input_tokens": 0,
"output_tokens": 0,
"total_cost": 0.0,
"model": "anthropic/claude-sonnet-4-5"
}
def check_budget(daily_limit_usd=5.0, warn_threshold=0.8):
"""Check if usage is approaching daily budget.
Args:
daily_limit_usd: Daily spending limit in USD
warn_threshold: Fraction of limit to trigger warning (default 80%)
Returns:
dict with status, usage, limit, and alert message if applicable
"""
state = load_state()
today = datetime.now().date().isoformat()
# Reset if new day
if today not in state["daily_usage"]:
state["daily_usage"] = {today: {"cost": 0.0, "tokens": 0}}
state["alerts_sent"] = []
usage = state["daily_usage"][today]
percent_used = (usage["cost"] / daily_limit_usd) * 100
result = {
"date": today,
"cost": usage["cost"],
"tokens": usage["tokens"],
"limit": daily_limit_usd,
"percent_used": percent_used,
"status": "ok"
}
# Check thresholds
if percent_used >= 100:
result["status"] = "exceeded"
result["alert"] = f"⚠️ Daily budget exceeded! ${usage['cost']:.2f} / ${daily_limit_usd:.2f}"
elif percent_used >= (warn_threshold * 100):
result["status"] = "warning"
result["alert"] = f"⚠️ Approaching daily limit: ${usage['cost']:.2f} / ${daily_limit_usd:.2f} ({percent_used:.0f}%)"
return result
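A worked example of the threshold checks above, using the default $5 limit and 0.8 warn fraction (spend values are hypothetical):

```python
daily_limit_usd, warn_threshold = 5.0, 0.8  # check_budget defaults

statuses = []
for cost in (2.5, 4.375, 5.0):  # hypothetical spends for today
    percent_used = (cost / daily_limit_usd) * 100
    status = ("exceeded" if percent_used >= 100
              else "warning" if percent_used >= warn_threshold * 100
              else "ok")
    statuses.append(status)
print(statuses)  # ['ok', 'warning', 'exceeded']
```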
def suggest_cheaper_model(current_model, task_type="general"):
"""Suggest cheaper alternative models based on task type.
Args:
current_model: Currently configured model
task_type: Type of task (general, simple, complex)
Returns:
dict with suggestion and cost savings
"""
# Cost per 1M tokens (input/output average)
model_costs = {
"anthropic/claude-opus-4": 15.0,
"anthropic/claude-sonnet-4-5": 3.0,
"anthropic/claude-haiku-4": 0.25,
"google/gemini-2.0-flash-exp": 0.075,
"openai/gpt-4o": 2.5,
"openai/gpt-4o-mini": 0.15
}
suggestions = {
"simple": [
("anthropic/claude-haiku-4", "12x cheaper, great for file reads, routine checks"),
("google/gemini-2.0-flash-exp", "40x cheaper via OpenRouter, good for simple tasks")
],
"general": [
("anthropic/claude-sonnet-4-5", "Balanced performance and cost"),
("google/gemini-2.0-flash-exp", "Much cheaper, decent quality")
],
"complex": [
("anthropic/claude-opus-4", "Best reasoning, use sparingly"),
("anthropic/claude-sonnet-4-5", "Good balance for most complex tasks")
]
}
return {
"current": current_model,
"current_cost": model_costs.get(current_model, "unknown"),
"suggestions": suggestions.get(task_type, suggestions["general"])
}
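The "12x" and "40x" figures in the suggestion strings follow directly from the model_costs table above:

```python
model_costs = {  # $/MTok, copied from the table above
    "anthropic/claude-sonnet-4-5": 3.0,
    "anthropic/claude-haiku-4": 0.25,
    "google/gemini-2.0-flash-exp": 0.075,
}
sonnet = model_costs["anthropic/claude-sonnet-4-5"]
print(sonnet / model_costs["anthropic/claude-haiku-4"])               # 12.0
print(round(sonnet / model_costs["google/gemini-2.0-flash-exp"], 2))  # 40.0
```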
def main():
"""CLI interface for token tracker."""
import sys
if len(sys.argv) < 2:
print("Usage: token_tracker.py check | suggest [task_type] [current_model] | reset")
sys.exit(1)
command = sys.argv[1]
if command == "check":
result = check_budget()
print(json.dumps(result, indent=2))
elif command == "suggest":
task = sys.argv[2] if len(sys.argv) > 2 else "general"
current = sys.argv[3] if len(sys.argv) > 3 else "anthropic/claude-sonnet-4-5"
result = suggest_cheaper_model(current, task)
print(json.dumps(result, indent=2))
elif command == "reset":
state = load_state()
state["daily_usage"] = {}
state["alerts_sent"] = []
save_state(state)
print("Token tracker state reset.")
else:
print(f"Unknown command: {command}")
sys.exit(1)
if __name__ == "__main__":
main()