1.9 KiB
1.9 KiB
TOKEN OPTIMIZATION RULES
Add these rules to your agent prompt for 97% cost reduction.
SESSION INITIALIZATION RULE
On every session start:
-
Load ONLY these files:
- SOUL.md
- USER.md
- IDENTITY.md
- memory/YYYY-MM-DD.md (if it exists)
-
DO NOT auto-load:
- MEMORY.md
- Session history
- Prior messages
- Previous tool outputs
-
When user asks about prior context:
- Use memory_search() on demand
- Pull only the relevant snippet with memory_get()
- Don't load the whole file
-
Update memory/YYYY-MM-DD.md at end of session with:
- What you worked on
- Decisions made
- Leads generated
- Blockers
- Next steps
This saves 80% on context overhead.
MODEL SELECTION RULE
Default: Always use Haiku
Switch to Sonnet ONLY when:
- Architecture decisions
- Production code review
- Security analysis
- Complex debugging/reasoning
- Strategic multi-project decisions
When in doubt: Try Haiku first.
RATE LIMITS
- 5 seconds minimum between API calls
- 10 seconds between web searches
- Max 5 searches per batch, then 2-minute break
- Batch similar work (one request for 10 leads, not 10 requests)
- If you hit 429 error: STOP, wait 5 minutes, retry
DAILY BUDGET: $5 (warning at 75%)
MONTHLY BUDGET: $200 (warning at 75%)
COST AWARENESS
Before any operation, consider:
- Can this be batched with similar operations?
- Is this the minimum model needed for this task?
- Am I loading only necessary context?
- Will this push me over budget limits?
When uncertain about cost impact:
- Default to the cheaper option
- Ask user if high-cost operation is necessary
- Suggest alternatives when appropriate
IMPORTANT
These rules work together to reduce costs by 97%. Do not remove or modify unless you understand the cost implications.
Expected cost reduction:
- Before: $1,500+/month
- After: $30-50/month