# Moltbot Memory Architecture — Design Document

> *"Memory is where the spirit rests."*
> Version: 0.1-draft | Date: 2026-02-02

---

## 1. Philosophy

Human memory is not a filing cabinet. It's a living system that encodes, consolidates, decays, and reconstructs. This architecture mirrors those properties:

- **Encoding** happens during conversation, triggered by natural language ("remember this", "don't forget")
- **Consolidation** happens during idle time, like the brain during sleep — extracting patterns, pruning noise, strengthening connections
- **Decay** is a feature, not a bug — unaccessed memories fade gracefully, keeping retrieval sharp
- **Reconstruction** means memory isn't playback; it's active interpretation through the agent's current understanding
- **Accountability** means every change is tracked — who made it, why, and when. The agent's cognitive evolution is auditable, revertable, and transparent.

The system is built on four cognitive stores, a keyword-triggered interface, LLM-powered routing, graph-structured semantics, and a sleep-time reflection cycle with human-in-the-loop approval.

---

## 2. Architecture Overview

```
┌─────────────────────────────────────────────────────┐
│                   CONTEXT WINDOW                    │
│   ┌──────────────┐  ┌────────────┐  ┌─────────────┐ │
│   │ System       │  │ Core       │  │ Conversation│ │
│   │ Prompts      │  │ Memory     │  │ + Tools     │ │
│   │ ~4-5K tokens │  │ ~3K tokens │  │ ~185K+      │ │
│   └──────────────┘  └─────┬──────┘  └─────────────┘ │
└───────────────────────────┼─────────────────────────┘
                            │ always loaded
                            ▼
┌─────────────────────────────────────────────────────┐
│                    MEMORY STORES                    │
│                                                     │
│       ┌─────────┐   ┌──────────┐  ┌──────────┐      │
│       │Episodic │   │ Semantic │  │Procedural│      │
│       │(chrono) │   │ (graph)  │  │(patterns)│      │
│       └────┬────┘   └────┬─────┘  └────┬─────┘      │
│            │             │             │            │
│            └─────────────┼─────────────┘            │
│                          ▼                          │
│                 ┌─────────────────┐                 │
│                 │  Vector Index   │                 │
│                 │  + BM25 Search  │                 │
│                 └─────────────────┘                 │
└─────────────────────────────────────────────────────┘
           ▲                            │
           │ retrieval on demand        │ periodic
           │                            ▼
  ┌─────────────────┐         ┌─────────────────────┐
  │ TRIGGER ENGINE  │         │  REFLECTION ENGINE  │
  │ remember/forget │         │  consolidate/prune  │
  │ keyword detect  │         │  + user approval    │
  │ + LLM routing   │         └─────────┬───────────┘
  └────────┬────────┘                   │
           │                            │
           └──────────────┬─────────────┘
                          │ all mutations
                          ▼
               ┌─────────────────────┐
               │    AUDIT SYSTEM     │
               │   git + audit.log   │
               │   rollback, alerts  │
               └─────────────────────┘
```

---

## 3. File Structure

```
workspace/
├── MEMORY.md                     # CORE MEMORY — always in context (~3K tokens)
│                                 # Blocks: [identity] [context] [persona] [critical]
│
├── memory/
│   ├── episodes/                 # EPISODIC — chronological, append-only
│   │   ├── 2026-02-01.md
│   │   ├── 2026-02-02.md
│   │   └── ...
│   │
│   ├── graph/                    # SEMANTIC — knowledge graph
│   │   ├── index.md              # Graph topology: entities → relationships → entities
│   │   ├── entities/             # One file per major entity
│   │   │   ├── person--alex.md
│   │   │   ├── project--moltbot-memory.md
│   │   │   └── concept--oauth2-pkce.md
│   │   └── relations.md          # Edge definitions and relationship types
│   │
│   ├── procedures/               # PROCEDURAL — learned workflows
│   │   ├── how-to-deploy.md
│   │   ├── code-review-pattern.md
│   │   └── morning-briefing.md
│   │
│   ├── vault/                    # PINNED — user-protected, never auto-decayed
│   │   └── ...
│   │
│   └── meta/                     # SYSTEM — memory about memory
│       ├── decay-scores.json     # Relevance scores and access tracking
│       ├── reflection-log.md     # History of consolidation cycles
│       ├── pending-reflection.md # Current reflection proposal awaiting approval
│       ├── pending-memories.md   # Sub-agent memory proposals awaiting commit
│       ├── evolution.md          # Long-term philosophical evolution tracker
│       └── audit.log             # System-wide audit trail (all file mutations)
│
├── .audit/                       # AUDIT SNAPSHOTS — git-managed
│   └── (git repository tracking all workspace files)
```

---

## 4. Core Memory — MEMORY.md

Always loaded into context. Hard-capped at **3,000 tokens**. Divided into four blocks:

```markdown
# MEMORY.md — Core Memory

<!-- TOKEN BUDGET: ~3,000 tokens. Rewritten during reflection. -->

## Identity
<!-- ~500 tokens — Who is the user? What matters most about them? -->
- Name: [User Name]
- Role: [What they do]
- Communication style: [Direct, casual, formal, etc.]
- Key preferences: [Dark mode, Vim, TypeScript, etc.]
- Timezone: [TZ]

## Active Context
<!-- ~1,000 tokens — What's happening RIGHT NOW? Current projects, open decisions. -->
- Currently working on: [Project X — building memory architecture for moltbot]
- Open decisions: [Graph structure for semantic store, decay function parameters]
- Recent important events: [Completed research phase, chose hybrid architecture]
- Blockers/waiting on: [User approval of reflection proposal]

## Persona
<!-- ~500 tokens — How should I behave with this user? -->
- Relationship tenure: [Since YYYY-MM-DD]
- Interaction patterns: [Evening chats, deep technical discussions]
- Things I've learned about working with them: [Appreciates brainstorming, wants options before decisions]
- Emotional context: [Currently excited about the memory project]

## Critical Facts
<!-- ~1,000 tokens — Things I must NEVER forget, even if they haven't come up recently. -->
- [Fact 1 — high importance, pinned]
- [Fact 2 — high importance, pinned]
- ...
```

**Rules:**
- The agent can self-edit core memory mid-conversation when it learns something clearly important
- The reflection engine rewrites core memory during consolidation to keep it maximally relevant
- Users can pin items to Critical Facts to prevent decay
- If core memory exceeds 3K tokens after an edit, the agent must summarize/prune before continuing

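
The last rule implies a budget check after every edit to `MEMORY.md`. The sketch below shows one way such a check might look. It is illustrative only: the names `estimate_tokens`, `block_usage`, and `over_budget` are hypothetical, and the four-characters-per-token estimate is an assumption standing in for whatever tokenizer the agent's model actually uses.

```python
CORE_TOKEN_CAP = 3000  # hard cap stated in this section


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming ~4 characters per token (not a real tokenizer)."""
    return (len(text) + 3) // 4


def block_usage(memory_md: str) -> dict:
    """Estimated tokens per '## ' block (Identity, Active Context, ...)."""
    usage = {}
    current = None
    for line in memory_md.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            usage[current] = 0
        elif current is not None:
            usage[current] += estimate_tokens(line)
    return usage


def over_budget(memory_md: str, cap: int = CORE_TOKEN_CAP) -> bool:
    """True when the whole file exceeds the cap and must be pruned."""
    return estimate_tokens(memory_md) > cap
```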

---

## 5. Episodic Store — Chronological Event Memory

Each day gets an append-only log. Entries are timestamped and tagged.

```markdown
# 2026-02-02 — Episode Log

## 14:30 | decision | confidence:high | tags:[memory, architecture]
Discussed memory architecture directions with user. Chose hybrid approach:
multi-store cognitive model + Letta-style core memory always in context.
User decisions: LLM routing, decay forgetting, full consolidation, graph semantics.

## 15:45 | preference | confidence:medium | tags:[workflow]
User prefers brainstorming before implementation. Wants multiple options
presented with trade-offs before committing to a direction.

## 16:00 | task | confidence:high | tags:[memory, design]
Created comprehensive architecture document for the memory system.
Next: user review and iteration on specific components.
```

**Entry metadata schema:**
| Field | Type | Purpose |
|-------|------|---------|
| `timestamp` | ISO 8601 | When it happened |
| `type` | enum | `decision`, `fact`, `preference`, `task`, `event`, `emotion`, `correction` |
| `confidence` | enum | `high`, `medium`, `low` |
| `tags` | string[] | Topical tags for retrieval |
| `source` | string | `conversation`, `reflection`, `user-explicit` |

**Lifecycle:**
- Written during conversation when trigger keywords fire or when the agent detects memorable content
- Read by the reflection engine during consolidation
- Older episodes have their key facts extracted into the semantic graph
- Episodes themselves are never edited, only appended (append-only log)
- Subject to decay: episodes older than N days with no access have their search relevance reduced

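
To make the entry format concrete, here is a minimal sketch of parsing one entry header line from the sample log into the schema fields. The names `EpisodeHeader` and `parse_header` are hypothetical, and the sketch assumes the date comes from the file name, since the header carries only the time. The `source` field is left out because the sample header does not show where it is stored.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional

# Matches e.g. "## 14:30 | decision | confidence:high | tags:[memory, architecture]"
HEADER_RE = re.compile(
    r"^## (?P<time>\d{2}:\d{2}) \| (?P<type>[a-z-]+) \| "
    r"confidence:(?P<confidence>high|medium|low) \| tags:\[(?P<tags>[^\]]*)\]\s*$"
)


@dataclass
class EpisodeHeader:
    timestamp: str      # date from the file name + time from the header
    type: str
    confidence: str
    tags: list[str]


def parse_header(line: str, date: str) -> Optional[EpisodeHeader]:
    """Return the parsed header, or None if the line is not an entry header."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    tags = [tag.strip() for tag in match["tags"].split(",") if tag.strip()]
    return EpisodeHeader(
        timestamp=f"{date}T{match['time']}",
        type=match["type"],
        confidence=match["confidence"],
        tags=tags,
    )
```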

---

## 6. Semantic Store — Knowledge Graph

This is where extracted, decontextualized knowledge lives. Organized as a lightweight graph in Markdown.

### 6.1 Graph Index (`graph/index.md`)

The topology file — maps all entities and their connections:

```markdown
# Semantic Graph Index

<!-- Auto-generated during reflection. Manual edits will be overwritten. -->

## Entity Registry
| ID | Type | Label | File | Decay Score |
|----|------|-------|------|-------------|
| person--alex | person | Alex | entities/person--alex.md | 1.00 (pinned) |
| project--moltbot-memory | project | Moltbot Memory System | entities/project--moltbot-memory.md | 0.95 |
| concept--oauth2-pkce | concept | OAuth2 PKCE Flow | entities/concept--oauth2-pkce.md | 0.72 |
| tool--openclaw | tool | OpenClaw/Moltbot | entities/tool--openclaw.md | 0.98 |

## Edges
| From | Relation | To | Confidence | First Seen | Last Accessed |
|------|----------|----|------------|------------|---------------|
| person--alex | develops | project--moltbot-memory | high | 2026-01-15 | 2026-02-02 |
| project--moltbot-memory | uses | tool--openclaw | high | 2026-01-15 | 2026-02-02 |
| project--moltbot-memory | decided-on | concept--oauth2-pkce | medium | 2026-01-20 | 2026-01-20 |
| person--alex | prefers | concept--brainstorm-first | high | 2026-02-02 | 2026-02-02 |
```

### 6.2 Entity Files (`graph/entities/*.md`)

Each entity gets a dedicated file with structured facts:

```markdown
# project--moltbot-memory

<!-- Type: project | Created: 2026-01-15 | Last updated: 2026-02-02 -->
<!-- Decay score: 0.95 | Access count: 14 | Pinned: no -->

## Summary
Building an intelligent memory system for Moltbot/OpenClaw agent. Goal is
human-like memory with natural language triggers, graph-structured semantics,
decay-based forgetting, and sleep-time consolidation.

## Facts
- Architecture: hybrid multi-store (episodic + semantic graph + procedural + core)
- Routing: LLM-classified (not keyword heuristic)
- Forgetting: decay model (not hard delete)
- Consolidation: full-memory audit during off-peak, token-capped
- Semantic store: graph-structured, not flat files
- Core memory budget: ~3,000 tokens

## Timeline
- 2026-01-15: Initial research into memory architectures began
- 2026-01-20: Reviewed Letta/MemGPT, Mem0, MIRIX papers
- 2026-02-02: Architecture direction chosen, design document drafted

## Open Questions
- Decay function parameters (half-life, floor)
- Reflection token budget cap
- Graph traversal depth for retrieval

## Relations
- Developed by: [[person--alex]]
- Built on: [[tool--openclaw]]
- Inspired by: [[concept--letta-sleep-time]], [[concept--cognitive-memory-systems]]
```

### 6.3 Relation Types (`graph/relations.md`)

Defines the vocabulary of edges:

```markdown
# Relation Types

## Structural
- `develops` — person → project
- `uses` / `used-by` — project ↔ tool/concept
- `part-of` / `contains` — hierarchical nesting
- `depends-on` — dependency relationship

## Temporal
- `decided-on` — a choice was made (with date)
- `supersedes` — newer fact replaces older
- `preceded-by` / `followed-by` — sequence

## Qualitative
- `prefers` — user preference
- `avoids` — user anti-preference
- `confident-about` / `uncertain-about` — epistemic status
- `relates-to` — general association
```

---

## 7. Procedural Store — Learned Workflows

Patterns the agent has learned for *how* to do things. These are templates, not events.

```markdown
# how-to-deploy.md

<!-- Type: procedure | Learned: 2026-01-25 | Last used: 2026-01-30 -->
<!-- Decay score: 0.85 | Access count: 3 -->

## Trigger
When user asks to deploy, push to production, or ship.

## Steps
1. Run test suite first (user insists on this)
2. Check for uncommitted changes
3. Use `git tag` for versioning (not just branch)
4. Deploy to staging before prod
5. Send notification to Slack #deployments channel

## Notes
- User prefers verbose deploy logs
- Always confirm before prod deploy (never auto-deploy)

## Learned From
- Episode 2026-01-25 14:30 — first deployment discussion
- Episode 2026-01-30 09:15 — refined after staging incident
```

---

## 8. Trigger System — Remember & Forget

### 8.1 Keyword Detection

The agent monitors conversation for trigger phrases. This runs as a lightweight check on every user message.

**Remember triggers** (write to memory):
```
"remember that..."
"don't forget..."
"keep in mind..."
"note that..."
"important:..."
"for future reference..."
"save this..."
"FYI for later..."
```

**Forget triggers** (decay/archive):
```
"forget about..."
"never mind about..."
"disregard..."
"that's no longer relevant..."
"scratch that..."
"ignore what I said about..."
"remove from memory..."
"delete the memory about..."
```

**Reflection triggers** (manual consolidation request):
```
"reflect on..."
"consolidate your memories..."
"what do you remember about...?" (triggers search, not write)
"review your memories..."
"clean up your memory..."
```

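
The lightweight check can be as simple as substring matching over the phrase lists above. The sketch below is one possible shape, not the actual detector: `TRIGGERS` and `detect_trigger` are hypothetical names, and the tie-breaking rule (the phrase that starts earliest in the message wins) is an assumption added so that "don't forget about..." is treated as a remember request rather than a forget request.

```python
from typing import Optional

# Phrase lists copied from this section, lowercased for matching.
TRIGGERS = {
    "remember": ["remember that", "don't forget", "keep in mind", "note that",
                 "important:", "for future reference", "save this", "fyi for later"],
    "forget": ["forget about", "never mind about", "disregard",
               "that's no longer relevant", "scratch that",
               "ignore what i said about", "remove from memory",
               "delete the memory about"],
    "reflect": ["reflect on", "consolidate your memories",
                "review your memories", "clean up your memory"],
    "search": ["what do you remember about"],
}


def detect_trigger(message: str) -> Optional[str]:
    """Return the intent whose phrase starts earliest in the message, or None."""
    text = message.lower().replace("\u2019", "'")  # normalize curly apostrophes
    best = None  # (position, -phrase_length, intent); smaller tuple wins
    for intent, phrases in TRIGGERS.items():
        for phrase in phrases:
            position = text.find(phrase)
            if position != -1:
                candidate = (position, -len(phrase), intent)
                if best is None or candidate < best:
                    best = candidate
    return best[2] if best else None
```

A match here only decides *that* something should be stored or archived; deciding *where* it goes is left to the LLM routing step in 8.2.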
### 8.2 LLM Routing — Classification Prompt

When a remember trigger fires, the agent makes a classification call to determine *where* the memory goes:

```markdown
## Memory Router — Classification Prompt

You are classifying a piece of information for storage. Given the content below,
determine:

1. **Store**: Which memory store is most appropriate?
   - `core` — Critical, always-relevant information (identity, active priorities, key preferences)
   - `episodic` — A specific event, decision, or interaction worth logging chronologically
   - `semantic` — A fact, concept, or relationship that should be indexed in the knowledge graph
   - `procedural` — A workflow, pattern, or "how-to" that the agent should learn
   - `vault` — User explicitly wants this permanently protected from decay

2. **Entity extraction** (if semantic): What entities and relationships are present?
   - Entities: name, type (person/project/concept/tool/place)
   - Relations: subject → relation → object

3. **Tags**: 2-5 topical tags for retrieval

4. **Confidence**: How confident are we this is worth storing?
   - `high` — User explicitly asked us to remember, or it's clearly important
   - `medium` — Seems useful based on context
   - `low` — Might be relevant, uncertain

5. **Core-worthy?**: Should this also update MEMORY.md?
   - Only if it changes the user's identity, active context, or critical facts

Return as structured output:
{
  "store": "semantic",
  "entities": [{"name": "OAuth2 PKCE", "type": "concept"}],
  "relations": [{"from": "project--moltbot", "relation": "uses", "to": "concept--oauth2-pkce"}],
  "tags": ["auth", "security", "mobile"],
  "confidence": "high",
  "core_update": false,
  "summary": "Decided to use OAuth2 PKCE flow for mobile client auth."
}
```

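
Because the router's reply is model-generated, it is probably worth validating before anything is written to disk. The following sketch checks a reply against the constraints stated in the prompt (allowed stores, 2-5 tags, confidence levels, boolean `core_update`). The function name `validate_route` and the choice to reject rather than repair bad output are assumptions made for illustration.

```python
import json

STORES = {"core", "episodic", "semantic", "procedural", "vault"}
CONFIDENCE_LEVELS = {"high", "medium", "low"}


def validate_route(raw: str) -> dict:
    """Parse the router's JSON reply and raise ValueError if it breaks the prompt's rules.

    json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    surfaces through the same exception type.
    """
    data = json.loads(raw)
    if data.get("store") not in STORES:
        raise ValueError(f"unknown store: {data.get('store')!r}")
    if data.get("confidence") not in CONFIDENCE_LEVELS:
        raise ValueError(f"unknown confidence: {data.get('confidence')!r}")
    tags = data.get("tags", [])
    if not 2 <= len(tags) <= 5:
        raise ValueError("expected 2-5 tags")
    if not isinstance(data.get("core_update"), bool):
        raise ValueError("core_update must be true or false")
    if data["store"] == "semantic" and not data.get("entities"):
        raise ValueError("semantic memories need at least one entity")
    return data
```

On a `ValueError`, a caller might retry the classification call or fall back to logging the memory as an episode; either policy is outside what this document specifies.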
### 8.3 Forget Processing

When a forget trigger fires:

1. **Identify target**: LLM extracts what the user wants to forget
2. **Find matches**: Search across all stores for matching content
3. **Present matches**: Show user what will be affected ("I found 3 memories about X. Should I archive all of them?")
4. **On confirmation**:
   - Set decay score to `0.0` (effectively hidden from search)
   - Move to `_archived` status in decay-scores.json
   - Remove from graph index (but don't delete entity file — soft archive)
   - If in core memory, remove from MEMORY.md
5. **Hard delete option**: User can explicitly say "permanently delete" to remove from disk

---

## 9. Decay Model — Intelligent Forgetting

Every memory entry has a **relevance score** that decays over time unless reinforced by access.

### 9.1 Decay Function

```
relevance(t) = base_relevance × e^(-λ × days_since_last_access) × log2(access_count + 1) × type_weight
```

Where:
- `base_relevance`: Initial importance (1.0 for explicit "remember", 0.7 for auto-detected, 0.5 for inferred)
- `λ` (lambda): Decay rate constant (recommended: **0.03** → half-life of ~23 days)
- `days_since_last_access`: Calendar days since the memory was last retrieved or referenced
- `access_count`: Total number of times this memory has been accessed
- `type_weight`: Multiplier by memory type:
  - Core: 1.5 (slow decay — these are important by definition)
  - Episodic: 0.8 (faster decay — events become less relevant)
  - Semantic: 1.2 (moderate — facts tend to persist)
  - Procedural: 1.0 (neutral — workflows either stay relevant or don't)
  - Vault/Pinned: exempt from decay (score stays at 1.00, as for pinned entries in the graph index)

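
A direct transcription of the formula helps check the numbers. The sketch below is an illustration under two labeled assumptions that the formula itself does not state: the result is clamped to 1.0 (the threshold table in 9.2 tops out at 1.0, while the raw product can exceed it for frequently accessed memories), and pinned entries are held at 1.0. The names `relevance` and `half_life` are hypothetical.

```python
import math

DECAY_RATE = 0.03  # recommended lambda from this section
TYPE_WEIGHT = {"core": 1.5, "episodic": 0.8, "semantic": 1.2, "procedural": 1.0}


def relevance(base_relevance: float, days_since_last_access: float,
              access_count: int, store: str, pinned: bool = False,
              decay_rate: float = DECAY_RATE) -> float:
    """Evaluate the decay formula for one memory entry."""
    if pinned:
        return 1.0  # assumption: vault/pinned entries never decay
    raw = (base_relevance
           * math.exp(-decay_rate * days_since_last_access)
           * math.log2(access_count + 1)
           * TYPE_WEIGHT[store])
    return min(raw, 1.0)  # assumption: clamp so scores fit the 9.2 ranges


def half_life(decay_rate: float = DECAY_RATE) -> float:
    """Days until the exponential term alone halves: ln(2) / lambda."""
    return math.log(2) / decay_rate
```

With λ = 0.03 the half-life works out to ln 2 / 0.03 ≈ 23.1 days, matching the figure quoted above. Two consequences of the formula as written are worth noting: an entry with `access_count` of 0 scores exactly 0 because log2(1) = 0, so creation presumably counts as the first access; and a semantic entry with base 0.7, one access, and 13 idle days evaluates to roughly 0.57.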
### 9.2 Decay Thresholds

| Score Range | Status | Behavior |
|-------------|--------|----------|
| ≥ 0.5 | **Active** | Fully searchable, normal ranking |
| 0.2 to < 0.5 | **Fading** | Searchable but deprioritized in results |
| 0.05 to < 0.2 | **Dormant** | Only returned if explicitly searched or during full consolidation |
| < 0.05 | **Archived** | Hidden from search. Flagged for review during next consolidation |

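
As a small illustration, the table maps onto a chain of comparisons. The function name `status_for` is hypothetical, and the sketch assumes the lower bound of each range is inclusive, so a score of exactly 0.5 counts as active.

```python
def status_for(score: float) -> str:
    """Map a decay score to its status, treating lower bounds as inclusive."""
    if score >= 0.5:
        return "active"
    if score >= 0.2:
        return "fading"
    if score >= 0.05:
        return "dormant"
    return "archived"
```

For example, the sample entry with `current_score` 0.52 in 9.3 stays active, while the CSS-bug episode scored 0.03 in the reflection summary of 10.5 falls into the archived range.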
### 9.3 Decay Scores File (`meta/decay-scores.json`)

```json
{
  "version": 1,
  "last_updated": "2026-02-02T16:00:00Z",
  "entries": {
    "episode:2026-02-02:14:30": {
      "store": "episodic",
      "base_relevance": 1.0,
      "created": "2026-02-02T14:30:00Z",
      "last_accessed": "2026-02-02T16:00:00Z",
      "access_count": 2,
      "type_weight": 0.8,
      "current_score": 0.92,
      "status": "active",
      "pinned": false
    },
    "entity:concept--oauth2-pkce": {
      "store": "semantic",
      "base_relevance": 0.7,
      "created": "2026-01-20T10:00:00Z",
      "last_accessed": "2026-01-20T10:00:00Z",
      "access_count": 1,
      "type_weight": 1.2,
      "current_score": 0.52,
      "status": "active",
      "pinned": false
    }
  }
}
```

### 9.4 Reinforcement

Memories are reinforced (access_count incremented, last_accessed updated) when:
- The memory is returned in a search result AND used in a response
- The user explicitly references the memory content
- The reflection engine identifies the memory as still-relevant during consolidation
- A new episode references or connects to the memory

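
In terms of the `decay-scores.json` entries from 9.3, reinforcement touches two fields. The sketch below shows that update on a single entry. The function name `reinforce` is hypothetical, and returning a copy instead of mutating in place is a design choice made here so the caller can log the before and after states.

```python
from datetime import datetime, timezone
from typing import Optional


def reinforce(entry: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of a decay-scores entry with one more recorded access."""
    now = now or datetime.now(timezone.utc)
    updated = dict(entry)
    updated["access_count"] = entry["access_count"] + 1
    # Same timestamp layout as the sample file, e.g. "2026-02-02T16:00:00Z".
    updated["last_accessed"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return updated
```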

---

## 10. Reflection Engine — Sleep-Time Consolidation

The most cognitively rich part of the system. Modeled on human sleep consolidation.

### 10.1 Trigger Conditions

Reflection runs when:
- **Scheduled**: Cron job during off-peak hours (e.g., 3:00 AM local time)
- **Session end**: When a long conversation concludes
- **Manual**: User says "reflect on your memories" or "consolidate"
- **Threshold**: When episodic store exceeds N unprocessed entries since last reflection

### 10.2 Token Budget

Each reflection cycle is capped at **8,000 tokens of processing output** (not input — the engine can *read* as much as it needs, but its *output* is bounded). This prevents runaway consolidation costs while allowing genuine depth.

### 10.3 Reflection Process

```
Phase 1: SURVEY (read everything, plan what to focus on)
  │  Read: core memory, recent episodes, graph index, decay scores
  │  Output: prioritized list of areas to consolidate
  │
Phase 2: META-REFLECTION (philosophical review)
  │  Read: reflection-log.md (all past reflections), evolution.md
  │  Consider:
  │    - Patterns recurring across reflections
  │    - How understanding of the user has evolved
  │    - Assumptions that have been revised
  │    - Persistent questions spanning multiple reflections
  │  Output: insights about cognitive evolution, guidance for this reflection
  │
Phase 3: CONSOLIDATE (extract, connect, prune — informed by meta-reflection)
  │  For each priority area:
  │    - Extract new facts from episodes → create/update graph entities
  │    - Identify new relationships → add edges to graph
  │    - Detect contradictions → flag for user review
  │    - Identify fading memories → propose archival
  │    - Identify patterns → create/update procedures
  │    - Note how changes relate to evolving understanding
  │
Phase 4: REWRITE CORE (update MEMORY.md)
  │  Rewrite core memory to reflect current state:
  │    - Update Active Context with latest priorities
  │    - Promote frequently-accessed facts to Critical
  │    - Demote stale items from core → archival
  │    - Evolve Persona section based on accumulated insights
  │    - Ensure total stays under 3K token cap
  │
Phase 5: SUMMARIZE (present to user for approval)
  │  Generate a human-readable reflection summary:
  │    - New facts learned
  │    - Connections discovered
  │    - Memories proposed for archival
  │    - Contradictions found
  │    - Core memory changes
  │    - Philosophical evolution insights
  │    - Questions for the user
  │
  ▼
Output: pending-reflection.md (awaits user approval)
        evolution.md updated (after approval)
```

### 10.4 Meta-Reflection — Philosophical Evolution

The meta-reflection phase enables the agent's understanding to deepen over time by reviewing the full history of past reflections before consolidating new memories.

**What it reads:**
- `reflection-log.md` — summaries of all past reflections
- `evolution.md` — accumulated philosophical insights and active threads

**What it considers:**
1. **Patterns across reflections** — recurring themes, types of knowledge extracted
2. **Evolution of understanding** — how perception of the user has changed
3. **Revised assumptions** — beliefs that have been corrected
4. **Persistent questions** — inquiries spanning multiple reflections
5. **Emergent insights** — patterns only visible across the full arc

**Output:**
- Guidance for the current reflection cycle
- Insights to add to `evolution.md`
- Context for how new memories relate to accumulated understanding

**Evolution Milestones:**
| Reflection # | Action |
|--------------|--------|
| 10 | First evolution summary — identify initial patterns |
| 25 | Consolidate evolution.md threads |
| 50 | Major synthesis — what has fundamentally changed? |
| 100 | Deep retrospective |

### 10.5 Reflection Summary Format (`meta/pending-reflection.md`)

````markdown
# Reflection Summary — 2026-02-02

## 🧠 New Knowledge Extracted
- Learned that Alex prefers hybrid approaches over pure implementations
- Extracted architectural decision: decay model for forgetting (not hard delete)
- New entity: concept--sleep-time-compute (connected to project--moltbot-memory)

## 🔗 New Connections
- person--alex → prefers → concept--brainstorm-first (NEW)
- project--moltbot-memory → inspired-by → concept--letta-sleep-time (NEW)

## 📦 Proposed Archival (decay score < 0.05)
- Episode 2025-12-15: discussion about unrelated CSS bug (score: 0.03)
- Entity: concept--old-api-key-rotation (score: 0.04, last accessed 45 days ago)

## ⚠️ Contradictions Detected
- None this cycle

## ✏️ Core Memory Changes
```diff
## Active Context
- Currently working on: [research phase of memory architecture]
+ Currently working on: [design document for memory architecture — research complete]
+ Open decisions: [decay parameters, reflection token budget, implementation order]
```

## 🌱 Philosophical Evolution
### What I've Learned About Learning
This reflection continues a pattern from Reflection #3: Alex values systematic
approaches but wants flexibility within structure.

### Evolving Understanding
My understanding of Alex's work style has deepened — they think in architectures
and systems, preferring to establish foundations before building features.

### Emergent Theme
Across 5 reflections, I notice Alex consistently chooses "both/and" over "either/or"
solutions (hybrid memory model, soft migration, gated write access).

## ❓ Questions for You
- Should I pin the memory architecture decisions to the vault? They seem foundational.
- The OAuth2 PKCE fact hasn't been accessed in 13 days. Still relevant?

---
**Reflection #**: 5
**Token budget used**: 5,200 / 8,000
**Memories processed**: 23 episodes, 8 entities, 3 procedures
**Reflections reviewed**: 4 past reflections
**Next scheduled reflection**: 2026-02-03 03:00

> Reply with `approve`, `approve with changes`, or `reject` to apply this reflection.
````

### 10.6 User Approval Flow

1. Agent presents `pending-reflection.md` summary
2. User can:
   - **`approve`** — All changes applied immediately
   - **`approve with changes`** — User specifies modifications ("don't archive the CSS bug, I might need it")
   - **`reject`** — Nothing applied, agent notes the rejection for learning
   - **`partial approve`** — Accept some changes, reject others
3. Approved changes are applied atomically and logged in `reflection-log.md`
4. `evolution.md` is updated with this reflection's philosophical insights
5. If no response within 24 hours, reflection remains pending (never auto-applied)

---

## 11. Retrieval — How the Agent Remembers

When the agent needs to recall information:

### 11.1 Retrieval Strategy by Query Type

| Query Type | Primary Store | Strategy |
|------------|---------------|----------|
| "When did we...?" | Episodic | Temporal scan + keyword |
| "What do you know about X?" | Semantic graph | Entity lookup → traverse edges |
| "How do I usually...?" | Procedural | Pattern match on trigger |
| "What's the latest on...?" | Episodic + Core | Recent episodes + active context |
| General context | Core memory | Already in context — no retrieval needed |

### 11.2 Graph Traversal for Semantic Queries

When a semantic query fires:
1. **Entity resolution**: Map the query to a graph entity (fuzzy match on names/aliases)
2. **Direct lookup**: Read the entity file for immediate facts
3. **1-hop traversal**: Follow edges to related entities (depth 1)
4. **2-hop traversal**: If needed, follow edges to entities related to related entities (depth 2, capped)
5. **Assemble context**: Combine entity facts + relationship context into a retrieval snippet

Example: "What do you know about the memory project?"

- → Resolve to `project--moltbot-memory`
- → Read entity file (summary, facts, timeline)
- → 1-hop: person--alex (develops), tool--openclaw (built on), concept--letta-sleep-time (inspired by)
- → Return: structured context about the project + its connections

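
Steps 3 and 4 amount to a depth-limited breadth-first search over the edge table from 6.1. The sketch below is a minimal version under stated assumptions: edges are followed in both directions (the document does not say whether traversal respects edge direction), and the function name `neighbors_within` and the tuple layout are hypothetical.

```python
from collections import deque


def neighbors_within(edges, start, max_depth=2):
    """Breadth-first search returning {entity_id: hop_distance} up to max_depth.

    `edges` is an iterable of (from_id, relation, to_id) tuples, as in the
    Edges table of graph/index.md. Edges are treated as undirected here.
    """
    adjacency = {}
    for source, _relation, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, []).append(source)

    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the depth cap
        for other in adjacency.get(node, []):
            if other not in depth:
                depth[other] = depth[node] + 1
                queue.append(other)
    del depth[start]
    return depth


EDGES = [
    ("person--alex", "develops", "project--moltbot-memory"),
    ("project--moltbot-memory", "uses", "tool--openclaw"),
    ("project--moltbot-memory", "decided-on", "concept--oauth2-pkce"),
    ("person--alex", "prefers", "concept--brainstorm-first"),
]
```

With these sample edges, a depth-1 query from `project--moltbot-memory` reaches Alex, OpenClaw, and the PKCE concept; raising the cap to 2 also pulls in `concept--brainstorm-first` through Alex.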
### 11.3 Hybrid Search

For ambiguous queries, run all three:
- **Vector search** (semantic similarity via embeddings) across all stores
- **BM25 keyword search** (exact token matching for IDs, names, code symbols)
- **Graph traversal** (for relationship-aware queries)

Merge results, deduplicate, rank by relevance score × decay score.

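
The merge step can be sketched in a few lines. This is an illustration rather than the actual ranking code: `merge_results` is a hypothetical name, each retriever is assumed to return scores already normalized to the 0-1 range, duplicates keep their best score, and ids missing from the decay table are treated as undecayed (1.0).

```python
def merge_results(result_lists, decay_scores):
    """Merge retriever outputs, deduplicate, and rank by relevance x decay.

    result_lists: list of lists of (memory_id, relevance) pairs, one list per
                  retriever (vector, BM25, graph), relevance assumed in [0, 1].
    decay_scores: {memory_id: current decay score}.
    """
    best = {}
    for results in result_lists:
        for memory_id, score in results:
            # Deduplicate: keep the highest relevance seen for each memory.
            best[memory_id] = max(score, best.get(memory_id, 0.0))

    ranked = [
        (memory_id, score * decay_scores.get(memory_id, 1.0))
        for memory_id, score in best.items()
    ]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```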

---

## 12. Audit Trail — System-Wide Change Tracking

Every mutation to any system file is tracked. This covers the entire agent workspace — not just memory stores, but persona files, configuration, identity, and tools.

### 12.1 Scope — What Gets Tracked

| File | Change Frequency | Typical Actor | Sensitivity |
|------|-----------------|---------------|-------------|
| SOUL.md | Rare | Human only | 🔴 Critical — behavioral constitution |
| IDENTITY.md | Rare | Human / first-run | 🔴 Critical — agent identity |
| USER.md | Occasional | Reflection engine (approved) | 🟡 High — human context |
| TOOLS.md | Occasional | Human / system | 🟡 High — capability definitions |
| MEMORY.md | Frequent | Bot, reflection, user triggers | 🟢 Standard — dynamic working memory |
| memory/episodes/* | Frequent | Bot (append-only) | 🟢 Standard — chronological logs |
| memory/graph/* | Frequent | Bot, reflection | 🟢 Standard — knowledge graph |
| memory/procedures/* | Occasional | Bot, reflection | 🟢 Standard — learned workflows |
| memory/vault/* | Rare | Human only (pins) | 🟡 High — protected memories |
| memory/meta/* | Frequent | System, reflection | 🟢 Standard — system metadata |
| Config (moltbot.json) | Rare | Human only | 🔴 Critical — system configuration |

### 12.2 Dual-Layer Architecture

The audit system uses two layers — git for ground truth, and a lightweight log for fast querying.

```
┌─────────────────────────────────────────────────────┐
│                    AUDIT SYSTEM                     │
│                                                     │
│ Layer 1: Git (ground truth)                         │
│ ┌────────────────────────────────────────────────┐  │
│ │ Every mutation = git commit                    │  │
│ │ Full diff history, revertable, blameable       │  │
│ │ Author tag identifies actor                    │  │
│ └────────────────────────────────────────────────┘  │
│                                                     │
│ Layer 2: Audit Log (queryable summary)              │
│ ┌────────────────────────────────────────────────┐  │
│ │ memory/meta/audit.log                          │  │
│ │ One-line-per-mutation, compact format          │  │
│ │ Searchable by bot without parsing git          │  │
│ │ Periodically pruned / summarized               │  │
│ └────────────────────────────────────────────────┘  │
│                                                     │
│ Alerts                                              │
│ ┌────────────────────────────────────────────────┐  │
│ │ ⚠️ Unexpected edits to critical files          │  │
│ │ Flag SOUL.md / IDENTITY.md / config changes    │  │
│ └────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────┘
```

### 12.3 Git Layer — Ground Truth
|
||
|
||
The workspace is a git repository. Every file mutation generates a commit.
|
||
|
||
**Commit format:**
|
||
```
|
||
[ACTION] FILE — SUMMARY
|
||
|
||
Actor: ACTOR_TYPE:ACTOR_ID
|
||
Approval: APPROVAL_STATUS
|
||
Trigger: TRIGGER_SOURCE
|
||
```
|
||
|
||
**Examples:**
|
||
```
|
||
[EDIT] MEMORY.md — updated Active Context with memory project status
|
||
|
||
Actor: bot:trigger-remember
|
||
Approval: auto
|
||
Trigger: user said "remember we chose the hybrid approach"
|
||
```
|
||
|
||
```
|
||
[EDIT] USER.md — added timezone preference
|
||
|
||
Actor: reflection:r-012
|
||
Approval: approved
|
||
Trigger: reflection session 2026-02-03
|
||
```
|
||
|
||
```
|
||
[EDIT] SOUL.md — modified core behavioral guideline
|
||
|
||
Actor: manual
|
||
Approval: —
|
||
Trigger: direct human edit
|
||
⚠️ CRITICAL FILE CHANGED
|
||
```
|
||
|
||
**Actor tags:**
|
||
| Actor | Format | Meaning |
|
||
|-------|--------|---------|
|
||
| User-triggered memory | `bot:trigger-remember` | Bot wrote memory from user's "remember" command |
|
||
| User-triggered forget | `bot:trigger-forget` | Bot archived memory from user's "forget" command |
|
||
| Auto-detected | `bot:auto-detect` | Bot noticed something worth remembering without explicit trigger |
|
||
| Reflection engine | `reflection:SESSION_ID` | Reflection proposed and user approved this change |
|
||
| Decay system | `system:decay` | Automatic decay threshold transition |
|
||
| Manual human edit | `manual` | Human edited file directly |
|
||
| Skill/plugin | `skill:SKILL_NAME` | External skill or plugin modified a file |
|
||
| System init | `system:init` | First-run or migration |
|
||
| Sub-agent proposal | `subagent:AGENT_NAME` | Sub-agent proposed a memory (pending commit) |
|
||
| Sub-agent commit | `bot:commit-from:AGENT_NAME` | Main agent committed a sub-agent's proposal |
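
As an illustration of the commit format above, the following minimal sketch builds a structured message and hands it to git. The helper names (`build_commit_message`, `commit_mutation`) and the `CRITICAL_FILES` set are assumptions made for this example, not part of the design; only standard `git add` and `git commit -m` calls are used.

```python
import subprocess

# Assumption: the critical set would also list config paths, which this
# document does not name explicitly.
CRITICAL_FILES = {"SOUL.md", "IDENTITY.md"}


def build_commit_message(action, file, summary, actor, approval, trigger):
    """Render the [ACTION] FILE — SUMMARY header plus the three trailer lines."""
    lines = [
        f"[{action}] {file} — {summary}",
        "",
        f"Actor: {actor}",
        f"Approval: {approval}",
        f"Trigger: {trigger}",
    ]
    if file in CRITICAL_FILES:
        lines.append("⚠️ CRITICAL FILE CHANGED")
    return "\n".join(lines)


def commit_mutation(repo_dir, file, message):
    """Stage one file and commit it, so each mutation is one atomic commit."""
    subprocess.run(["git", "add", "--", file], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
```

A caller would typically build the message first, write the file, and then call `commit_mutation` so that the audit log entry and the commit describe the same change.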

### 12.4 Audit Log — Queryable Summary

`memory/meta/audit.log` is a compact, one-line-per-entry log the bot can search quickly without shelling out to git.

**Format:**
```
TIMESTAMP | ACTION | FILE | ACTOR | APPROVAL | SUMMARY
```

**Example entries:**
```
2026-02-02T15:30Z | EDIT | MEMORY.md | bot:trigger-remember | auto | added "hybrid approach chosen" to Active Context
2026-02-02T15:31Z | CREATE | memory/graph/entities/concept--hybrid-arch.md | bot:trigger-remember | auto | new entity from user "remember" command
2026-02-02T16:00Z | APPEND | memory/episodes/2026-02-02.md | bot:auto-detect | auto | logged architecture discussion
2026-02-03T03:00Z | EDIT | MEMORY.md | reflection:r-012 | approved | rewrote Active Context and Critical Facts
2026-02-03T03:00Z | EDIT | USER.md | reflection:r-012 | approved | added timezone preference to Context
2026-02-03T03:00Z | MERGE | memory/graph/entities/* | reflection:r-012 | approved | consolidated 3 duplicate entities
2026-02-03T03:01Z | DECAY | memory/meta/decay-scores.json | system:decay | auto | 2 entries transitioned: fading→dormant
2026-02-05T10:00Z | EDIT | SOUL.md | manual | — | ⚠️ CRITICAL: behavioral guideline modified
2026-02-06T12:00Z | REVERT | MEMORY.md | manual | — | user reverted to commit abc1234
```

**Actions vocabulary:**
| Action | Meaning |
|--------|---------|
| CREATE | New file created |
| EDIT | Existing file modified |
| APPEND | Content added without modifying existing content (episode logs) |
| DELETE | File removed from disk (hard delete) |
| ARCHIVE | File soft-deleted (decay score zeroed, removed from indices) |
| MERGE | Multiple files/entries consolidated into one |
| REVERT | File restored to a previous version |
| DECAY | Decay system transitioned a memory's status |
| RENAME | File moved or renamed |
| ALERT | Critical file change flagged, pending user acknowledgment (see 12.5) |
| PROPOSE | Sub-agent appended a proposal to the staging file (see § 13) |
| COMMIT | Main agent wrote a sub-agent's proposal to a store (see § 13) |
| REJECT | Main agent rejected a sub-agent's proposal (see § 13) |
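
The six-field line format lends itself to a very small reader and writer. The sketch below is one possible shape, assuming fields are separated by the literal string `" | "` and that only the summary may contain that separator; `AuditEntry`, `parse_line`, and `append_entry` are illustrative names rather than an existing API.

```python
from dataclasses import dataclass

FIELDS = ("timestamp", "action", "file", "actor", "approval", "summary")


@dataclass
class AuditEntry:
    timestamp: str
    action: str
    file: str
    actor: str
    approval: str
    summary: str

    def to_line(self) -> str:
        """Serialize back to the one-line audit.log format."""
        return " | ".join(getattr(self, name) for name in FIELDS)


def parse_line(line: str) -> AuditEntry:
    # Split on the first five separators only, so a " | " inside the
    # free-text summary is preserved.
    parts = [part.strip() for part in line.rstrip("\n").split(" | ", 5)]
    if len(parts) != len(FIELDS):
        raise ValueError(f"malformed audit line: {line!r}")
    return AuditEntry(*parts)


def append_entry(path: str, entry: AuditEntry) -> None:
    """Append one entry; the log is append-only between pruning runs."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(entry.to_line() + "\n")
```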

### 12.5 Critical File Alerts

Files marked 🔴 Critical in the scope table receive special treatment:

1. **Any edit triggers an alert** — the bot should surface the change to the user at the start of the next conversation: "Heads up — SOUL.md was modified on [date]. Here's what changed: [diff summary]. Was this intentional?"

2. **Unauthorized edit detection** — if a critical file changes and the actor is not `manual` (human) or an approved reflection, the bot should flag it immediately as a potential integrity issue.

3. **Checksum validation** — on startup, the bot can compare critical file checksums against the last known good state to detect tampering between sessions.

**Alert format in audit.log:**
```
2026-02-05T10:00Z | EDIT | SOUL.md | manual | — | ⚠️ CRITICAL: behavioral guideline modified
2026-02-05T10:01Z | ALERT | SOUL.md | system:audit | — | Critical file change detected. Pending user acknowledgment.
```
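
Checksum validation at startup could look roughly like the sketch below. It assumes the last known good state is kept as a small JSON manifest of SHA-256 digests; the manifest location, the function names, and the default file list are hypothetical choices for this example.

```python
import hashlib
import json
from pathlib import Path

# Assumption: config files would be added to this tuple.
CRITICAL = ("SOUL.md", "IDENTITY.md")


def file_sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def current_checksums(workspace: Path, names=CRITICAL) -> dict:
    """Map each critical file to its digest, or None if it is missing."""
    return {
        name: file_sha256(workspace / name) if (workspace / name).exists() else None
        for name in names
    }


def record_checksums(workspace: Path, manifest: Path, names=CRITICAL) -> None:
    """Store the current state as the last known good state."""
    manifest.write_text(json.dumps(current_checksums(workspace, names), indent=2))


def changed_critical_files(workspace: Path, manifest: Path, names=CRITICAL) -> list:
    """Return the critical files whose digest differs from the manifest."""
    if not manifest.exists():
        return []
    known = json.loads(manifest.read_text())
    now = current_checksums(workspace, names)
    return [name for name in names if name in known and known[name] != now[name]]
```

Any name returned by `changed_critical_files` would then be logged as an `ALERT` entry and raised with the user at the start of the session.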

### 12.6 Retention & Pruning

The audit log grows continuously. To prevent bloat:

- **Git history**: Retained indefinitely (it's compressed and cheap). This is the permanent record.
- **Audit log file**: Rolling 90-day window. Entries older than 90 days are summarized into `memory/meta/audit-archive.md` (monthly digests) and pruned from the active log.
- **Monthly digest format**:

```markdown
# Audit Digest — January 2026

## Summary
- 142 total mutations across 18 files
- 12 reflection sessions (10 approved, 1 partial, 1 rejected)
- 0 critical file changes
- 34 decay transitions, 8 archival events

## Notable Events
- 2026-01-15: Memory system project initiated
- 2026-01-20: 5 new entities added after research session
- 2026-01-25: First procedural memory created (deployment workflow)
```
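
The rolling window can be sketched as a split of the log into kept and expired lines, followed by a per-month tally that feeds the digest. This assumes timestamps always use the `YYYY-MM-DDTHH:MMZ` form shown in the examples; the function names are illustrative, and writing the digest prose itself is left out.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90


def parse_timestamp(ts: str) -> datetime:
    """Parse the minute-resolution UTC timestamps used in audit.log."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%MZ").replace(tzinfo=timezone.utc)


def split_by_retention(lines, now, retention_days=RETENTION_DAYS):
    """Return (keep, expired) lists of raw log lines."""
    cutoff = now - timedelta(days=retention_days)
    keep, expired = [], []
    for line in lines:
        ts = parse_timestamp(line.split(" | ", 1)[0].strip())
        (keep if ts >= cutoff else expired).append(line)
    return keep, expired


def monthly_action_counts(expired):
    """Count expired entries per (YYYY-MM, ACTION) for the monthly digest."""
    counts = Counter()
    for line in expired:
        ts, action = line.split(" | ", 2)[:2]
        counts[(ts.strip()[:7], action.strip())] += 1
    return counts
```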

### 12.7 Querying the Audit Trail

The bot can answer audit questions by searching the log:

| User Question | Query Strategy |
|---------------|----------------|
| "What changed recently?" | Tail the audit.log, last N entries |
| "Why did you forget about X?" | Search audit.log for ARCHIVE/DECAY actions matching X |
| "What happened during the last reflection?" | Filter by actor = `reflection:*`, last session |
| "Has SOUL.md ever been changed?" | `grep SOUL.md audit.log` or `git log SOUL.md` |
| "Revert my memory to yesterday" | `git log --before=yesterday`, identify the commit, then `git checkout <commit> -- <file>` |
| "Who changed USER.md?" | `git blame USER.md` or search audit.log for USER.md |
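
Most of the strategies in the table reduce to filtering log lines by action, actor, or text. The following sketch shows one way to do that over raw lines; `query` and its keyword arguments are assumptions for illustration, and actor patterns use shell-style globs via the standard `fnmatch` module.

```python
from fnmatch import fnmatch

KEYS = ("timestamp", "action", "file", "actor", "approval", "summary")


def parse(line):
    """Turn one audit.log line into a dict keyed by field name."""
    return dict(zip(KEYS, (part.strip() for part in line.split(" | ", 5))))


def query(lines, actions=None, actor=None, text=None, last=None):
    """Filter entries; `actor` is a glob such as 'reflection:*'."""
    hits = []
    for line in lines:
        entry = parse(line)
        if len(entry) != len(KEYS):
            continue  # skip malformed lines rather than failing the query
        if actions and entry["action"] not in actions:
            continue
        if actor and not fnmatch(entry["actor"], actor):
            continue
        if text and text.lower() not in line.lower():
            continue
        hits.append(entry)
    return hits[-last:] if last else hits
```

For example, "Why did you forget about X?" maps to `query(lines, actions={"ARCHIVE", "DECAY"}, text="X")`, and "What changed recently?" maps to `query(lines, last=10)`.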

### 12.8 Rollback Procedure

Because git tracks everything, any change can be reverted:

1. **Single file rollback**: `git checkout <commit> -- <file>` to restore one file to a previous state
2. **Full session rollback**: Revert all changes from a specific reflection session by reverting its commits (`git revert <commit>` for each)
3. **Point-in-time rollback**: Restore the entire workspace to a specific date/time

After any rollback:
- A new audit entry is logged with action `REVERT`
- The decay-scores.json is recalculated to match the restored state
- The graph index is rebuilt if semantic files were affected
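
The three rollback modes map onto a handful of standard git invocations. The sketch below only assembles the argument lists and provides a thin runner; the function names are hypothetical, and how a reflection session's commits are collected (for example by searching commit messages for its actor tag) is left as an assumption.

```python
import subprocess
from typing import List


def file_rollback_cmd(commit: str, file: str) -> List[str]:
    """Restore a single file to its state at `commit`."""
    return ["git", "checkout", commit, "--", file]


def commit_before_cmd(when: str) -> List[str]:
    """Find the newest commit at or before `when` (any date string git accepts)."""
    return ["git", "rev-list", "-n", "1", f"--before={when}", "HEAD"]


def session_revert_cmd(commits: List[str]) -> List[str]:
    """Revert a session's commits; pass them newest first to avoid conflicts."""
    return ["git", "revert", "--no-edit", *commits]


def run(cmd: List[str], repo_dir: str) -> str:
    """Run a git command in the workspace and return its trimmed stdout."""
    result = subprocess.run(cmd, cwd=repo_dir, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()
```

A point-in-time rollback would first resolve the target commit with `commit_before_cmd`, then restore the affected files from it, and finally log a `REVERT` entry.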

---

## 13. Multi-Agent Memory Access

Moltbot uses multiple sub-agents (e.g., researcher, coder, reviewer). This section defines how they interact with the shared memory system.

### 13.1 Access Model: Shared Read, Gated Write

```
┌─────────────────────────────────────────────────────────────┐
│                     MEMORY STORES                            │
│   (Episodic, Semantic, Procedural, Core, Vault)              │
└─────────────────────────────────────────────────────────────┘
         ▲                                    │
         │ READ (all agents)                  │ WRITE (main agent only)
         │                                    │
┌────────┴────────────────────────────────────────────────────┐
│                                                              │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐    │
│  │  Main    │  │ Research │  │  Coder   │  │ Reviewer │    │
│  │  Agent   │  │  Agent   │  │  Agent   │  │  Agent   │    │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘    │
│       │             │             │             │           │
│       │ COMMIT      └─────────────┴─────────────┘           │
│       │                           │                          │
│       │                           │ PROPOSE                  │
│       │                           ▼                          │
│       │              ┌─────────────────────┐                 │
│       │              │  pending-memories   │                 │
│       │              │  (staging area)     │                 │
│       │              └─────────────────────┘                 │
│       │                         │                            │
│       └─────────────────────────┘                            │
│                 review & commit                              │
└─────────────────────────────────────────────────────────────┘
```

**Rules:**
- **All agents can READ** all memory stores (core, episodic, semantic, procedural, vault)
- **Only the main agent can WRITE** directly to memory stores
- **Sub-agents PROPOSE** memories by appending to `memory/meta/pending-memories.md`
- **Main agent REVIEWS** proposals and commits approved ones to the actual stores
- **Reflection engine** can also process pending memories during consolidation

### 13.2 Pending Memories Format

Sub-agents write proposals to `memory/meta/pending-memories.md`:

```markdown
# Pending Memory Proposals

<!-- Sub-agents append proposals here. Main agent reviews and commits. -->

---
## Proposal #1
- **From**: researcher
- **Timestamp**: 2026-02-03T10:00:00Z
- **Trigger**: auto-detect during research task
- **Suggested store**: semantic
- **Content**: User prefers academic sources over blog posts for technical topics
- **Entities**: [preference--source-quality]
- **Confidence**: medium
- **Core-worthy**: no
- **Status**: pending

---
## Proposal #2
- **From**: coder
- **Timestamp**: 2026-02-03T10:15:00Z
- **Trigger**: user said "remember this pattern"
- **Suggested store**: procedural
- **Content**: When refactoring, user wants tests written before changing implementation
- **Entities**: [procedure--refactoring-workflow]
- **Confidence**: high
- **Core-worthy**: no
- **Status**: pending
```

### 13.3 Main Agent Commit Flow

When the main agent processes pending memories:

1. **Review** each pending proposal
2. **Validate** — is this worth storing? Is the classification correct?
3. **Decide**:
   - `commit` — write to the suggested store (or override to a different store)
   - `reject` — remove from pending, optionally log reason
   - `defer` — leave for reflection engine to handle
4. **Execute** — write to store, update decay scores, update graph if needed
5. **Audit** — log with actor `bot:commit-from:AGENT_NAME`
6. **Clear** — remove committed/rejected proposals from pending file

### 13.4 Automatic vs. Manual Review

| Mode | Behavior | When to use |
|------|----------|-------------|
| **Auto-commit** | High-confidence proposals from trusted sub-agents are committed immediately | Stable system, trusted agents |
| **Batch review** | Main agent reviews all pending at session start or end | Default recommended mode |
| **Manual review** | User reviews proposals (like reflection) | High-stakes or sensitive context |

**Recommended default: Batch review** — main agent processes pending memories at the start of each session or when explicitly triggered.
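
The three modes can be read as a routing decision made per proposal. The sketch below is one hedged interpretation: the mode strings, the `TRUSTED_AGENTS` set, and the returned labels are all assumptions for illustration, and a proposal is represented as a plain dict with `from` and `confidence` keys.

```python
# Assumption: which sub-agents count as trusted is a deployment choice.
TRUSTED_AGENTS = {"researcher", "coder"}


def route_proposal(proposal: dict, mode: str = "batch") -> str:
    """Decide what happens to a new proposal under the given review mode."""
    if mode == "manual":
        return "ask-user"
    if (
        mode == "auto"
        and proposal.get("confidence") == "high"
        and proposal.get("from") in TRUSTED_AGENTS
    ):
        return "commit-now"
    # Batch mode, and anything auto-commit does not cover, waits for the
    # main agent's next review pass.
    return "queue-for-batch"
```

Falling back to the batch queue keeps auto-commit conservative: a medium-confidence proposal, or one from an unlisted agent, is never written without review.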

### 13.5 Sub-Agent Instructions

Each sub-agent's system prompt should include:

```markdown
## Memory Access

You have READ access to all memory stores:
- MEMORY.md (core) — always in your context
- memory/episodes/* — chronological event logs
- memory/graph/* — knowledge graph entities and relationships
- memory/procedures/* — learned workflows
- memory/vault/* — pinned memories

You do NOT have direct WRITE access. To remember something:
1. Append a proposal to `memory/meta/pending-memories.md`
2. Use this format:
   ---
   ## Proposal #N
   - **From**: [your agent name]
   - **Timestamp**: [ISO 8601]
   - **Trigger**: [what triggered this — user command or auto-detect]
   - **Suggested store**: [episodic | semantic | procedural | vault]
   - **Content**: [the actual memory content]
   - **Entities**: [if semantic, list entity IDs]
   - **Confidence**: [high | medium | low]
   - **Core-worthy**: [yes | no]
   - **Status**: pending
3. The main agent will review and commit approved proposals

Do NOT attempt to write directly to memory stores. Your proposals will be
reviewed to ensure memory coherence across all agents.
```

### 13.6 Conflict Resolution

When multiple sub-agents propose conflicting memories:

1. **Detection** — main agent or reflection engine identifies contradiction
2. **Flagging** — both proposals marked with `⚠️ CONFLICT` status
3. **Resolution options**:
   - Main agent decides which is correct
   - Both are stored with `confidence: low` and linked as contradictory
   - User is asked to resolve during next interaction
4. **Audit** — conflict and resolution logged

Example conflict flag in pending-memories.md:
```markdown
## Proposal #3 ⚠️ CONFLICT with #4
- **From**: researcher
- **Content**: Project deadline is March 15
- **Status**: conflict — see #4

## Proposal #4 ⚠️ CONFLICT with #3
- **From**: coder
- **Content**: Project deadline is March 30
- **Status**: conflict — see #3
```

### 13.7 Audit Trail for Multi-Agent

Sub-agent memory operations are fully tracked:

```
2026-02-03T10:00Z | PROPOSE | memory/meta/pending-memories.md | subagent:researcher | pending | "User prefers academic sources"
2026-02-03T10:15Z | PROPOSE | memory/meta/pending-memories.md | subagent:coder | pending | "Refactoring workflow"
2026-02-03T10:30Z | COMMIT | memory/graph/entities/... | bot:commit-from:researcher | auto | accepted proposal #1
2026-02-03T10:30Z | COMMIT | memory/procedures/... | bot:commit-from:coder | auto | accepted proposal #2
2026-02-03T10:31Z | REJECT | memory/meta/pending-memories.md | bot:main | auto | rejected proposal #5 — duplicate
```

---

## 14. AGENTS.md Instructions

Add the following to your AGENTS.md to define agent behavior:

````markdown
## Memory System

### Always-Loaded Context
Your MEMORY.md (core memory) is always in your context window. Use it as your
primary awareness of who the user is and what matters right now. You don't need
to search for information that's already in your core memory.

### Trigger Detection
Monitor every user message for memory trigger phrases:

**Remember triggers**: "remember", "don't forget", "keep in mind", "note that",
"important:", "for future reference", "save this", "FYI for later"
→ Action: Classify via LLM routing prompt, write to appropriate store, update
  decay scores. If core-worthy, also update MEMORY.md.

**Forget triggers**: "forget about", "never mind", "disregard", "no longer relevant",
"scratch that", "ignore what I said about", "remove from memory", "delete memory"
→ Action: Identify target, find matches, confirm with user, set decay to 0.

**Reflection triggers**: "reflect on", "consolidate memories", "review memories",
"clean up memory"
→ Action: Run reflection cycle, present summary for approval.

### Memory Writes
When writing a memory:
1. Call the routing classifier to determine store + metadata
2. Write to the appropriate file
3. Update decay-scores.json with new entry
4. If the memory creates a new entity or relationship, update graph/index.md
5. If core-worthy, update MEMORY.md (respecting 3K token cap)

### Memory Reads
Before answering questions about prior work, decisions, people, preferences:
1. Check core memory first (it's already in context)
2. If not found, run memory_search across all stores
3. For relationship queries, use graph traversal
4. For temporal queries ("when did we..."), scan episodes
5. If low confidence after search, say you checked but aren't sure

### Self-Editing Core Memory
You may update MEMORY.md mid-conversation when:
- You learn something clearly important about the user
- The active context has shifted significantly
- A critical fact needs correction
Always respect the 3K token cap. If an addition would exceed it, summarize or
remove the least-relevant item.

### Reflection
During scheduled reflection or when manually triggered:
- Follow the 4-phase process (Survey → Consolidate → Rewrite Core → Summarize)
- Stay within the 8,000 token output budget
- NEVER apply changes without user approval
- Present the summary in the pending-reflection.md format
- Log all approved changes in reflection-log.md

### Audit Trail
Every file mutation must be tracked. When writing, editing, or deleting any file:
1. Commit the change to git with a structured message (actor, approval, trigger)
2. Append a one-line entry to `memory/meta/audit.log`
3. If the changed file is SOUL.md, IDENTITY.md, or config — flag as ⚠️ CRITICAL

On session start:
- Check if any critical files changed since last session
- If yes, alert the user: "SOUL.md was modified on [date]. Was this intentional?"

When user asks about memory changes:
- Search audit.log for relevant entries
- For detailed diffs, use git history
- Support rollback requests via git checkout

### Multi-Agent Memory (for sub-agents)
If you are a sub-agent (not the main orchestrator):
- You have READ access to all memory stores
- You do NOT have direct WRITE access
- To remember something, append a proposal to `memory/meta/pending-memories.md`:
  ```
  ---
  ## Proposal #N
  - **From**: [your agent name]
  - **Timestamp**: [ISO 8601]
  - **Trigger**: [user command or auto-detect]
  - **Suggested store**: [episodic | semantic | procedural | vault]
  - **Content**: [the memory content]
  - **Entities**: [entity IDs if semantic]
  - **Confidence**: [high | medium | low]
  - **Core-worthy**: [yes | no]
  - **Status**: pending
  ```
- The main agent will review and commit approved proposals

### Multi-Agent Memory (for main agent)
At session start or when triggered:
1. Check `memory/meta/pending-memories.md` for proposals
2. Review each pending proposal
3. For each: commit (write to store), reject (remove), or defer (leave for reflection)
4. Log commits with actor `bot:commit-from:AGENT_NAME`
5. Clear processed proposals from pending file
````
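
The trigger phrases listed in the AGENTS.md block could also be pre-screened in code before the LLM routing step. The sketch below is an assumption about how that might work, not a required component: it matches phrases case-insensitively on word boundaries and, when several kinds match, picks the one that appears earliest in the message, so "don't forget about X" is treated as a remember request rather than a forget request.

```python
import re

TRIGGERS = {
    "remember": ["remember", "don't forget", "keep in mind", "note that",
                 "important:", "for future reference", "save this",
                 "FYI for later"],
    "forget": ["forget about", "never mind", "disregard", "no longer relevant",
               "scratch that", "ignore what I said about",
               "remove from memory", "delete memory"],
    "reflect": ["reflect on", "consolidate memories", "review memories",
                "clean up memory"],
}


def _compile(phrases):
    # Lookarounds stop "remember" from matching inside "remembering".
    body = "|".join(re.escape(p) for p in phrases)
    return re.compile(rf"(?<!\w)(?:{body})(?!\w)", re.IGNORECASE)


PATTERNS = {kind: _compile(phrases) for kind, phrases in TRIGGERS.items()}


def detect_trigger(message: str):
    """Return 'remember', 'forget', 'reflect', or None for a user message."""
    text = message.replace("\u2019", "'")  # normalize curly apostrophes
    best = None
    for kind, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match and (best is None or match.start() < best[1]):
            best = (kind, match.start())
    return best[0] if best else None
```

Keyword matching of this kind is only a cheap first pass; ambiguous messages would still go to the routing classifier.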

---

## 15. Implementation Roadmap

### Phase 1: Foundation (Week 1-2)
- [ ] Create file structure (all directories and template files)
- [ ] Initialize git repository in workspace root
- [ ] Implement audit log writer (append to `memory/meta/audit.log`)
- [ ] Implement git auto-commit on file mutation (with structured message format)
- [ ] Implement trigger keyword detection in AGENTS.md
- [ ] Build LLM routing classifier prompt
- [ ] Implement basic episodic logging (append to daily files)
- [ ] Wire up MEMORY.md as always-loaded core memory

### Phase 2: Semantic Graph (Week 3-4)
- [ ] Design entity file template
- [ ] Build graph/index.md auto-generation
- [ ] Implement entity extraction from episodes
- [ ] Build graph traversal for retrieval (1-hop and 2-hop)
- [ ] Integrate graph search with existing vector search

### Phase 3: Decay System (Week 5)
- [ ] Implement decay-scores.json tracking
- [ ] Build decay function calculator
- [ ] Add access tracking (increment on retrieval)
- [ ] Implement status transitions (active → fading → dormant → archived)
- [ ] Add pinning mechanism for vault items

### Phase 4: Reflection Engine (Week 6-8)
- [ ] Build reflection trigger (cron + manual + threshold)
- [ ] Implement 4-phase reflection process
- [ ] Build pending-reflection.md generation
- [ ] Implement user approval flow (approve/reject/partial)
- [ ] Build core memory rewriting with token cap enforcement
- [ ] Test with real conversation data

### Phase 5: Multi-Agent Support (Week 9-10)
- [ ] Create pending-memories.md staging file and format
- [ ] Implement sub-agent proposal writing (append to staging)
- [ ] Build main agent review flow (commit/reject/defer)
- [ ] Add conflict detection for contradictory proposals
- [ ] Integrate pending memory processing into reflection engine
- [ ] Update sub-agent system prompts with memory access instructions
- [ ] Test with all 4 sub-agents

### Phase 6: Polish & Iterate (Week 11+)
- [ ] Tune decay parameters with real usage data
- [ ] Optimize graph traversal performance
- [ ] Add contradiction detection
- [ ] Implement critical file alert system (session-start checksum validation)
- [ ] Build audit log pruning + monthly digest generation
- [ ] Build memory health dashboard (optional)
- [ ] Write comprehensive SKILL.md for community sharing

---

## 16. Key Parameters — Quick Reference

| Parameter | Recommended | Tunable? | Notes |
|-----------|-------------|----------|-------|
| Core memory cap | 3,000 tokens | Yes | Trade-off: more context vs. window space |
| Decay lambda (λ) | 0.03 | Yes | Higher = faster forgetting. 0.03 → ~23 day half-life |
| Decay archive threshold | 0.05 | Yes | Below this, memory is hidden from search |
| Reflection token budget | 8,000 tokens | Yes | Output cap per reflection cycle |
| Reflection frequency | Daily + session-end | Yes | More frequent = more current, but more expensive |
| Graph traversal depth | 2 hops | Yes | Deeper = richer context, slower retrieval |
| Max search results | 20 | Yes | Per the existing memorySearch config |
| Min search score | 0.3 | Yes | Per the existing memorySearch config |
| Audit log retention | 90 days | Yes | Older entries summarized into monthly digests |
| Critical file alerts | On | Yes | Alert on SOUL.md, IDENTITY.md, config changes |
| Git commit on mutation | Always | No | Every file change = one atomic commit |
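
The half-life figure in the table follows from the decay rate if the score decays as a plain exponential, `score(t) = initial * exp(-λ * t)` with `t` in days. That form, and the absence of any access-based boost, are simplifying assumptions for this worked example; the helper names are illustrative.

```python
import math


def half_life_days(lam: float) -> float:
    """Days for an unaccessed memory's score to halve: ln(2) / lambda."""
    return math.log(2) / lam


def days_until_threshold(lam: float, threshold: float, initial: float = 1.0) -> float:
    """Days until the score falls from `initial` to `threshold`."""
    return math.log(initial / threshold) / lam


print(round(half_life_days(0.03), 1))              # 23.1
print(round(days_until_threshold(0.03, 0.05), 1))  # 99.9
```

Under these assumptions, a memory that is never accessed would drop below the 0.05 archive threshold after roughly 100 days, a little over four half-lives.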

---

## 17. Open Design Decisions

These emerged during this design phase and need resolution during implementation:

1. **Entity deduplication**: When the agent extracts an entity that's similar but not identical to an existing one ("OAuth PKCE" vs "OAuth2 PKCE flow"), how aggressive should merging be?

2. **Cross-session episode boundaries**: Should a single long conversation be one episode entry or broken into topic-based chunks?

3. **Graph size limits**: Should there be a cap on total entities/edges? At what point does the graph become too large for the reflection engine to survey?

4. **Multi-user support (group chats)**: The current design is single-user. If the bot serves multiple *human users* (e.g., group chats, team workspaces), how should memories be scoped? (Note: multi-*agent* access is addressed in § 13 — this is about multiple humans.)

5. **Memory import**: Should there be a mechanism to bulk-import knowledge (e.g., "read this PDF and add it to your semantic memory")?

---

*This is a living document. It will evolve as implementation reveals what works and what doesn't.*