| 1 | +Below is a clean, compact memory dump for Codex/LLM agents that captures the key ideas of a “human‑readable memory bank” and “when to start a vector index.” Use it as a system prompt, an agent README, or a memory-bank insert. |
| 2 | + |
| 3 | +--- |
| 4 | + |
| 5 | +# 📦 Codex Memory Model — Dump |
| 6 | + |
| 7 | +## 1. Big Picture |
| 8 | + |
| 9 | +Project memory should be human‑readable, durable, and agent‑friendly. |
| 10 | +Primary format: Markdown + YAML frontmatter. |
| 11 | +Top directory: |
| 12 | + |
| 13 | +``` |
| 14 | +/ai/ |
| 15 | + memory/ # invariants, ADR/decisions, architecture |
| 16 | + log/ # project diary |
| 17 | + README.md # how to use this |
| 18 | +``` |
| 19 | + |
| 20 | +--- |
| 21 | + |
| 22 | +## 2. Memory Record Format (Knowledge Fragments) |
| 23 | + |
| 24 | +Every memory fragment is Markdown with a YAML frontmatter. |
| 25 | + |
| 26 | +```markdown |
| 27 | +--- |
| 28 | +id: inv-raft-001 |
| 29 | +type: invariant # invariant | decision | fact | gotcha | todo | note |
| 30 | +scope: planner |
| 31 | +tags: ["raft", "sharding"] |
| 32 | +updated_at: "2025-11-29" |
| 33 | +importance: 0.95 # 0..1 — how critical it is to follow |
| 34 | +--- |
| 35 | + |
| 36 | +# Invariant: The Planner Operates Within a Single Raft Group |
| 37 | + |
| 38 | +(human‑readable explanation) |
| 39 | +``` |
| 40 | + |
| 41 | +Why this format: |
| 42 | + |
| 43 | +- readable as documentation, |
| 44 | +- easy to parse by agents, |
| 45 | +- YAML provides the structure, |
| 46 | +- Markdown provides narrative and context. |
| 47 | + |
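To illustrate how an agent might read such a fragment, here is a minimal sketch that splits the frontmatter from the Markdown body. The function name `parse_fragment` is hypothetical, and the parser is deliberately simplified: it assumes flat `key: value` lines whose values are either JSON-compatible literals (numbers, quoted strings, inline lists) or bare words, and it treats ` #` as the start of a trailing comment. A real setup would more likely use a full YAML parser.

```python
import json


def parse_fragment(text: str) -> tuple[dict, str]:
    """Split a memory fragment into (frontmatter dict, Markdown body).

    Simplified on purpose: supports only flat `key: value` lines,
    not nested YAML structures.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("fragment must start with a '---' frontmatter fence")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("frontmatter is not closed by '---'") from None

    meta = {}
    for line in lines[1:end]:
        line = line.split(" #", 1)[0].strip()  # drop a trailing comment
        if not line:
            continue
        key, _, raw = line.partition(":")
        raw = raw.strip()
        try:
            value = json.loads(raw)  # numbers, quoted strings, inline lists
        except json.JSONDecodeError:
            value = raw  # bare words such as `invariant` stay strings
        meta[key.strip()] = value

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Calling `parse_fragment` on the example above would yield a dict with `id`, `type`, `scope`, `tags`, `updated_at`, and `importance`, plus the body starting at the `# Invariant: ...` heading.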
| 48 | +--- |
| 49 | + |
| 50 | +## 3. Memory Types |
| 51 | + |
| 52 | +Use a small fixed set: |
| 53 | + |
| 54 | +- `invariant` — a principle that must not be violated |
| 55 | +- `decision` — an architectural decision (ADR‑lite) |
| 56 | +- `fact` — important information about the system |
| 57 | +- `gotcha` — pitfalls and caveats |
| 58 | +- `todo` — long‑lived improvements, not tasks |
| 59 | +- `note` — useful observations |
| 60 | + |
| 61 | +This helps agents pick what to read: |
| 62 | +- for architecture: `invariant` and `decision`, |
| 63 | +- for debugging: `gotcha`, |
| 64 | +- for system analysis: `fact`. |
| 65 | + |
| 66 | +--- |
| 67 | + |
| 68 | +## 4. Separate Memory Files |
| 69 | + |
| 70 | +Example layout: |
| 71 | + |
| 72 | +``` |
| 73 | +/ai/memory/index.md |
| 74 | +/ai/memory/invariants.md |
| 75 | +/ai/memory/architecture.md |
| 76 | +/ai/memory/components/planner.md |
| 77 | +/ai/memory/components/storage.md |
| 78 | +/ai/memory/decisions/0001-sharding-model.md |
| 79 | +/ai/memory/decisions/0002-datafusion-integration.md |
| 80 | +``` |
| 81 | + |
| 82 | +`index.md` describes structure and reading order. |
| 83 | + |
| 84 | +--- |
| 85 | + |
| 86 | +## 5. Instructions for LLM Agents |
| 87 | + |
| 88 | +Agents must: |
| 89 | + |
| 90 | +1. Read `/ai/memory/index.md` first. |
| 91 | +2. Before architecture answers, load: |
| 92 | + - `architecture.md` |
| 93 | + - relevant components |
| 94 | +   - all `invariant` fragments with `importance >= 0.8` |
| 95 | +3. Before code generation, honor all invariants in `/ai/memory`. |
| 96 | +4. If an invariant would be violated, name it and propose an alternative. |
| 97 | +5. Avoid stale notes (`updated_at` too old or a `deprecated` flag, if present). |
| 98 | + |
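Steps 2 and 5 above can be sketched as two small filters over already-parsed frontmatter dicts. This is only an illustration: the function names are hypothetical, and `max_age_days` is an assumed parameter, since the text does not say how old is "too old".

```python
from datetime import date, timedelta


def invariants_to_load(fragments: list[dict], min_importance: float = 0.8) -> list[dict]:
    """Pick invariants at or above the importance threshold, most important first."""
    picked = [
        meta for meta in fragments
        if meta.get("type") == "invariant"
        and meta.get("importance", 0.0) >= min_importance
    ]
    return sorted(picked, key=lambda meta: meta["importance"], reverse=True)


def is_stale(meta: dict, today: date, max_age_days: int = 365) -> bool:
    """Flag a note as stale if it is marked deprecated or was updated too long ago.

    `max_age_days` is an assumption made for this sketch, not a documented value.
    """
    if meta.get("deprecated"):
        return True
    updated = date.fromisoformat(meta["updated_at"])
    return today - updated > timedelta(days=max_age_days)
```

Keeping the two checks separate leaves it to the agent (or a human) to decide what to do with a stale but high-importance invariant, rather than silently dropping it.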
| 99 | +--- |
| 100 | + |
| 101 | +## 6. When to Start a Vector Index |
| 102 | + |
| 103 | +Two layers of indexing: |
| 104 | + |
| 105 | +### Layer A — Vector Index of Memory (`/ai/memory`) |
| 106 | + |
| 107 | +Start early, after ~10–20 meaningful notes. Cheap, stable, useful. |
| 108 | + |
| 109 | +### Layer B — Vector Index of Source Code |
| 110 | + |
| 111 | +Start once two or three of the following four conditions hold: |
| 112 | + |
| 113 | +- there are multiple subsystems/crates, |
| 114 | +- the architecture has relatively stabilized, |
| 115 | +- answering “where is X implemented?” requires significant search, |
| 116 | +- the relevant context does not fit in a single prompt. |
| 117 | + |
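The rule above amounts to counting how many conditions are true. A minimal sketch, assuming the four conditions are assessed by a human and passed in as booleans (the class and function names are hypothetical, and the default threshold of 2 is the lower end of the "2–3" range):

```python
from dataclasses import astuple, dataclass


@dataclass
class ProjectSignals:
    multiple_subsystems: bool      # several subsystems/crates exist
    architecture_stable: bool      # architecture has relatively stabilized
    lookup_needs_search: bool      # "where is X implemented?" is hard to answer
    context_exceeds_prompt: bool   # relevant context does not fit one prompt


def should_index_source(signals: ProjectSignals, min_conditions: int = 2) -> bool:
    """Return True once enough of the four conditions hold."""
    return sum(astuple(signals)) >= min_conditions
```

For example, `should_index_source(ProjectSignals(True, True, False, False))` evaluates to `True`, while a project meeting only one condition stays on LSP/grep.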
| 118 | +Initial scope only: |
| 119 | + |
| 120 | +1. public APIs, |
| 121 | +2. key modules (planner/storage/executor), |
| 122 | +3. doc comments and gotchas. |
| 123 | + |
| 124 | +Grow beyond this as the project scales. |
| 125 | + |
| 126 | +--- |
| 127 | + |
| 128 | +## 7. Why Not Start Too Early |
| 129 | + |
| 130 | +- code structure changes rapidly → index rots quickly; |
| 131 | +- noise outweighs signal; |
| 132 | +- maintenance cost > value early on; |
| 133 | +- while the project is small, LSP/grep is enough. |
| 134 | + |
| 135 | +--- |
| 136 | + |
| 137 | +## 8. Practical Timeline |
| 138 | + |
| 139 | +### Phase 1: first weeks |
| 140 | + |
| 141 | +Create `/ai/memory/*.md` with no index yet. |
| 142 | + |
| 143 | +### Phase 2: 10–20 memory fragments exist |
| 144 | + |
| 145 | +Start the memory vector index (still not code). |
| 146 | + |
| 147 | +### Phase 3: architecture stabilized, project grew |
| 148 | + |
| 149 | +Start the source code index. |
| 150 | + |
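The three phases can be read as a small decision function. The sketch below is one possible encoding, not a prescribed one: `indexes_to_maintain` and its parameters are hypothetical, the threshold of 10 fragments is the lower end of the "10–20" range, and it assumes the phases are strictly sequential (no code index before the memory index).

```python
def indexes_to_maintain(
    fragment_count: int,
    code_index_warranted: bool,
    min_fragments: int = 10,
) -> set[str]:
    """Map the project's state to the set of vector indexes worth running.

    Phase 1 -> no index, Phase 2 -> memory only, Phase 3 -> memory and code.
    """
    if fragment_count < min_fragments:
        return set()                 # Phase 1: plain Markdown files only
    if not code_index_warranted:
        return {"memory"}            # Phase 2: index /ai/memory
    return {"memory", "code"}        # Phase 3: also index source code
```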
| 151 | +--- |
| 152 | + |
| 153 | +## 9. Project Log (Optional) |
| 154 | + |
| 155 | +In `/ai/log/YYYY-MM-DD.md` keep human‑readable diaries: |
| 156 | + |
| 157 | +- key decisions, |
| 158 | +- observations, |
| 159 | +- issues. |
| 160 | + |
| 161 | +The log is a RAG data source, but its entries are not invariants and are not binding on agents. |
| 162 | + |
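As a rough illustration of the log layout, the following sketch appends one entry to the day's file under `/ai/log/`. The helper names and the `- **kind**: text` line format are assumptions made for this example; the document only fixes the `YYYY-MM-DD.md` file naming and the three kinds of content.

```python
from datetime import date
from pathlib import Path

ENTRY_KINDS = {"decision", "observation", "issue"}


def log_path(root: Path, day: date) -> Path:
    """Return <root>/ai/log/YYYY-MM-DD.md for the given day."""
    return root / "ai" / "log" / f"{day.isoformat()}.md"


def append_entry(root: Path, day: date, kind: str, text: str) -> Path:
    """Append one bullet to the day's log, creating the file if needed."""
    if kind not in ENTRY_KINDS:
        raise ValueError(f"unknown entry kind: {kind!r}")
    path = log_path(root, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- **{kind}**: {text}\n")  # hypothetical line format
    return path
```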
| 163 | +--- |
| 164 | + |
| 165 | +# ✔ Recommended Standard of Memory for Codex/LLM Agents |
| 166 | + |
| 167 | +Optionally, we can: |
| 168 | + |
| 169 | +- generate templates (memory template generator), |
| 170 | +- provide JSON Schema for frontmatter validation, |
| 171 | +- generate an example `/ai/memory` for your project (pg_fusion / picodata), |
| 172 | +- suggest an index refresh strategy (pre‑commit hook + partial refresh). |