Decisions Are Not Memories. Most Agent Memory Systems Confuse Them.
“I learned we use Postgres” and “we decided Postgres over MySQL because of JSON support” are different data. One is a fact. The other is a contract. Mixing them is why agent memory drifts.
When I started building persistent memory for Claude Code, I thought memory was one kind of thing. A blob of stuff the agent remembers about the project. A database table with id, content, tags.
That worked for two weeks, and then it didn’t.
Here’s the moment it broke. I had a memory entry that said “We use Zod for validation.” Later in the same project, I had another entry that said “We decided to use Zod instead of Joi after evaluating both, because Zod composes with TypeScript inferred types and Joi doesn’t, 2025-09-12.” These were different sessions, and the audit agent had captured both.
Both entries were technically memories. Both were about Zod. Both were true. But they were different kinds of truth, and treating them as the same kind broke the system.
The difference between a fact and a contract
A memory is an observation. It answers the question “what is true about this project?”
- “We use Postgres.”
- “The main entry point is `server/index.ts`.”
- “User prefers `async def` over `def` in handlers that call external services.”
A decision is a contract. It answers the question “what did we commit to, and why?”
- “We chose Postgres over MongoDB on 2025-06-03, because we need relational joins for the billing logic, and we already run Postgres for auth.”
- “We decided Zod over Joi on 2025-09-12, because Zod composes with TypeScript inferred types.”
- “We chose to NOT version the API before beta on 2025-10-18, because we don’t have enough external consumers yet to justify the cost.”
Facts and contracts look the same on the surface. They are both prose. They both describe the state of the project. A naive memory system stores them in the same table. But they have different lifecycles, different expiry rules, and different enforcement semantics. Confusing them leads to specific, predictable bugs.
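The distinction is easiest to see as two data types rather than one. Here is a minimal sketch in Python; the field names follow the decision schema described later in this post, but the class and variable names are illustrative, not the actual AXME Code implementation:

```python
from dataclasses import dataclass
from datetime import date
from typing import Literal, Optional

@dataclass
class Memory:
    """An observation: informational, weighed during retrieval."""
    id: str
    content: str
    tags: list[str]

@dataclass
class Decision:
    """A contract: a commitment with reasoning and enforcement semantics."""
    id: str
    title: str
    decision: str
    reasoning: str
    decided_on: date
    enforce: Literal["required", "advisory"]
    source: str                       # session that created this decision
    supersedes: Optional[str] = None  # id of the decision this one replaces

# A Memory answers "what is true?"; a Decision answers "what did we
# commit to, and why?" The extra fields are the whole difference.
m = Memory(id="memory-034", content="We use Zod for validation.", tags=["validation"])
d = Decision(
    id="D-002",
    title="Zod over Joi",
    decision="Use Zod for all runtime validation.",
    reasoning="Zod composes with TypeScript inferred types; Joi does not.",
    decided_on=date(2025, 9, 12),
    enforce="advisory",
    source="session-example",
)
```

Everything after this point falls out of those extra fields: `enforce` gives you enforcement semantics, `supersedes` gives you history, and `reasoning` gives you the answer to “why did we pick this?”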
Bug #1: contradictions without history
In the Zod example, my memory store had two entries:
memory-034: We use Zod for validation.
memory-097: We decided to use Zod instead of Joi because...
These were not contradictions. But consider what happened three months later when I briefly considered switching to a different validation library. During that session, I said “should we look at Valibot?” The agent searched memory, found memory-034 (“We use Zod”), and responded “you currently use Zod, but Valibot has smaller bundle size.”
What I actually needed to know was: why did we pick Zod in the first place? That context was in memory-097, which the agent had retrieved but weighed equally with memory-034. It didn’t know that memory-097 was the one I actually needed, because the memory store had no notion that one of them was a contract I had committed to.
If I had treated decisions as a separate data type, the response would have been: “You currently use Zod. This was an active decision: [memory-097]. Want to review the decision before considering alternatives?”
The information was the same. The framing was different. The framing difference is the whole value.
Bug #2: no enforcement semantics
Here’s another shape. I had a memory: “never use git push --force.” Cool. That’s a safety preference. I captured it as a memory.
But which “never” did I mean? Was this:
- (a) A reminder to the agent to be careful. If the agent has a good reason, force push is still possible.
- (b) A hard rule. Force push is forbidden regardless of reason. The agent cannot run this command.
Memories don’t carry that distinction. A memory entry is informational. If you want it to be enforceable, you have to layer enforcement on top, and now you need metadata on the memory to say “this one is enforceable at level X.”
At which point you have reinvented decisions. Decisions are memories with enforcement metadata: an enforce level (required or advisory), a source, a rationale, a date, and optionally a supersede link.
I ended up with a schema that looked like this for decisions:
```yaml
id: D-021
title: Never force push to protected branches
enforce: required
decision: >
  Agent must never run git push --force, git push --force-with-lease,
  or any variant with explicit or implicit force semantics on main,
  develop, or release branches.
reasoning: >
  Force push has lost us work twice, both times under pressure from
  real or perceived urgency. A pre-tool-use hook blocks the command
  at enforcement time. The hook is the only layer that matters;
  prompts alone have been overridden.
date: 2025-10-03
source: session-0f8c4a
supersedes: null
```
And then, critically, a pre-tool-use hook reads the required decisions and enforces them. If the agent tries to run a blocked command, the hook rejects it. The decision is not a suggestion. It is a fence.
None of that works if you call it a memory. “Memory” implies “informational, weight this in your response.” “Decision” implies “enforce this.”
Bug #3: supersede becomes impossible
The third bug is about time.
Memories are append-only, but they have no notion of “this used to be true but isn’t anymore.” If I wrote “we use SQLite” in session 3 and “we migrated to Postgres” in session 47, both entries exist. A naive memory lookup can return either, and the newer one wins only if the results are sorted by date.
But what if I want to answer the question “when did we migrate?” or “why did we migrate?” Neither memory entry holds that. I have just two observations with no link between them.
Decisions, by contrast, have a supersede chain. D-007: use SQLite gets superseded by D-038: migrate to Postgres, supersedes D-007. Now I can see the history: SQLite was deliberate, then we changed our minds, here’s when and why.
This is not theoretical. In a long-running project, “why did we make this choice” is the most valuable question the memory store can answer. It is the question that saves an hour of archaeology when a junior developer asks “why don’t we just use SQLite, it would be simpler.” You can point at the decision chain. You can show the reasoning. You can show what changed.
Memories can’t do this. Decisions can.
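Answering “why don’t we just use SQLite” is then a walk up the supersede chain. A sketch, with decisions represented as plain dicts for brevity (the function name and data shape are illustrative):

```python
def decision_history(decisions: dict[str, dict], leaf_id: str) -> list[dict]:
    """Walk supersedes links from the newest decision back to the original."""
    chain = []
    current = decisions.get(leaf_id)
    while current is not None:
        chain.append(current)
        current = decisions.get(current.get("supersedes") or "")
    return chain

decisions = {
    "D-007": {"id": "D-007", "title": "Use SQLite", "supersedes": None},
    "D-038": {"id": "D-038", "title": "Migrate to Postgres", "supersedes": "D-007"},
}

# Walking back from D-038 recovers the whole story: SQLite was
# deliberate, then we changed our minds, and each link says when and why.
history = decision_history(decisions, "D-038")
```

Two unlinked memory entries (“we use SQLite”, “we migrated to Postgres”) can never produce this chain, because the link between them was never recorded.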
How the separation actually looks
In AXME Code, the separation is enforced at the data layer. There are two stores in .axme-code/:
```
.axme-code/
├── memory/
│   ├── user_preferences.md
│   ├── patterns/
│   ├── gotchas/
│   └── ...
└── decisions/
    ├── D-001-use-postgres.md
    ├── D-002-zod-over-joi.md
    ├── D-003-never-force-push.md
    └── ...
```
The audit agent that runs at session close reads the transcript and decides: is this new information a memory (observation, preference, fact) or a decision (commitment with reasoning, affects future work in an enforceable way)? It writes to the right store.
For decisions specifically, the agent also extracts:
- enforce level: required or advisory, based on language (“must never” vs “prefer”)
- rationale: the sentence that explains why
- source session: which session created this
And optionally, at session start, if a new decision contradicts an older one, the audit agent adds a supersedes link.
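The enforce-level call is a language judgment the audit agent makes from the transcript. As a rough approximation of the heuristic (the real classification is done by the agent, not a keyword list; the marker lists here are purely illustrative):

```python
# Strong modal language signals a hard rule; soft language signals a preference.
REQUIRED_MARKERS = ("must never", "must not", "never", "always", "forbidden")
ADVISORY_MARKERS = ("prefer", "lean toward", "ideally", "when possible")

def classify_enforce(text: str) -> str:
    """Heuristic: map decision language to an enforce level."""
    lowered = text.lower()
    if any(marker in lowered for marker in REQUIRED_MARKERS):
        return "required"
    if any(marker in lowered for marker in ADVISORY_MARKERS):
        return "advisory"
    # Default to the weaker level: wrongly marking something "required"
    # means the hook blocks commands the user never meant to forbid.
    return "advisory"
```

“Agent must never run git push --force” classifies as `required`; “Prefer Zod for new validation code” classifies as `advisory`. Defaulting to `advisory` on ambiguity matters, because `required` decisions get wired into the enforcement hook.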
The one-line takeaway
If your memory system can’t distinguish “I observed this” from “we committed to this with reasoning,” it will drift. Decisions are not memories. Give them their own store, their own metadata, their own enforcement. The investment is small and the payoff grows with project age.
I did not start with this separation. I started with one table called memory. I refactored into two stores after six weeks, and my memory system immediately became less confusing to use. Learn from my six weeks and start with two stores.