Version: MVP

Memory Diffs

4 min read · For memory operators · Updated 2026-05-19

What you'll learn

  • What a memory diff captures across a task.
  • The shape of craik.memory_diff and how it ties to handoffs.
  • How to use diffs in review and audit.

Memory diff

A structured summary of what changed in memory during a single task — proposals created, proposals approved or rejected, facts written, write failures, and facts read into the task's working context.

Why diffs?

A task's effect on memory is rarely "one fact got written." A typical run proposes several facts, some of which get approved, some of which get rejected, some of which collide with existing facts. The diff is the durable record of all of that — what was attempted, what landed, what was read along the way.

This matters because:

  • Reviewers need a single object to inspect when deciding whether a run's memory effect was acceptable.
  • Handoffs carry diff references so the next agent can see what memory looked like at the run's boundary.
  • Audit depends on being able to answer "What did this task change?" without piecing the answer together from raw fact tables.

Show the memory diff for one task:

    craik memory diff task_review_docs

The diff records:

Proposals created

Facts the task proposed during execution, with evidence references and intended scope.

Proposals approved

Proposals a reviewer accepted, with reviewer identity and approval timestamp.

Proposals rejected

Proposals declined, with reason and reviewer identity.

Facts written

Facts the task wrote directly, as opposed to facts promoted from proposals. Requires the memory.write grant.

Write failures

Writes the runtime attempted that did not land, for reasons such as a scope violation, backend error, or redaction failure.

Facts read

Facts pulled into the task's case-file context. This is the "what the task was reasoning over" side of the diff.
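
To make the six sections concrete, here is a minimal Python sketch of how a diff could be modeled. The class name `MemoryDiff`, the field names, and the keys inside each entry are assumptions chosen to mirror the section titles above. They are not the published craik.memory_diff schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MemoryDiff:
    """Hypothetical model of one task's memory diff (not the real schema)."""

    task_id: str
    proposals_created: list[dict] = field(default_factory=list)
    proposals_approved: list[dict] = field(default_factory=list)
    proposals_rejected: list[dict] = field(default_factory=list)
    facts_written: list[dict] = field(default_factory=list)
    write_failures: list[dict] = field(default_factory=list)
    facts_read: list[dict] = field(default_factory=list)

    def summary(self) -> dict[str, int]:
        """Count the entries recorded in each section."""
        return {
            "proposals_created": len(self.proposals_created),
            "proposals_approved": len(self.proposals_approved),
            "proposals_rejected": len(self.proposals_rejected),
            "facts_written": len(self.facts_written),
            "write_failures": len(self.write_failures),
            "facts_read": len(self.facts_read),
        }


# Illustrative data only; the entry keys are assumptions.
diff = MemoryDiff(
    task_id="task_review_docs",
    proposals_created=[{"id": "p1", "evidence": ["ev_1"], "scope": "project"}],
    write_failures=[{"reason": "scope violation"}],
)
print(diff.summary())
```

Keeping each section as its own list means a reviewer can see at a glance which kinds of memory activity a task had, before drilling into individual entries.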

Today's coverage

The v0.1.x implementation derives proposal activity from local state. This means proposals created, approved, and rejected during the task are fully recorded.

Direct writes, write failures, and fact reads also come from local state only for now, so those sections can be incomplete. As runner integrations and the Stigmem write path mature, those edges will attach to the same craik.memory_diff contract: the shape won't change, only the completeness.

Note

If you're auditing today, treat the proposal sections as authoritative and the read/write sections as "best-effort, growing." The diff's shape won't change as coverage expands; new entries will simply start appearing.
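
One way to apply this guidance in an audit script is to interpret an empty section differently depending on its coverage. The sketch below is illustrative: the section names and the function `interpret_empty` are hypothetical, and the grouping simply follows the coverage split described above.

```python
# Hypothetical section names; the grouping follows the coverage note above.
AUTHORITATIVE = frozenset(
    {"proposals_created", "proposals_approved", "proposals_rejected"}
)
BEST_EFFORT = frozenset({"facts_written", "write_failures", "facts_read"})


def interpret_empty(section: str) -> str:
    """Explain what an empty section can and cannot tell an auditor."""
    if section in AUTHORITATIVE:
        # Proposal activity is fully recorded, so empty means none happened.
        return "nothing recorded: treat as no activity"
    if section in BEST_EFFORT:
        # Coverage is still growing, so empty does not prove absence.
        return "nothing recorded: coverage is partial, so activity is not ruled out"
    raise ValueError(f"unknown diff section: {section}")


print(interpret_empty("proposals_rejected"))
print(interpret_empty("facts_read"))
```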

Where diffs live

Diffs persist in the local store under $CRAIK_HOME/state/. They're addressable by task id and joinable into handoffs and receipts:

  • Handoffs reference diffs in their memory_proposals and context_debt sections.
  • Receipts for memory.write and memory.propose capabilities reference the same diff in their result metadata.
  • The work graph projects diffs as memory_proposal and fact nodes connected to the task that produced them.
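
As a rough illustration of the work graph projection, the sketch below turns a diff into nodes and edges. Only the node types memory_proposal and fact come from the description above. The `task` node type, the dict layout of the diff, and the function `project_to_graph` are assumptions made for the example.

```python
def project_to_graph(task_id: str, diff: dict) -> tuple[list[dict], list[tuple[str, str]]]:
    """Project a diff into nodes and (task, node) edges. Hypothetical layout."""
    nodes = [{"id": task_id, "type": "task"}]  # "task" node type is assumed
    edges = []

    for proposal in diff.get("proposals_created", []):
        nodes.append({"id": proposal["id"], "type": "memory_proposal"})
        edges.append((task_id, proposal["id"]))

    for fact in diff.get("facts_written", []):
        nodes.append({"id": fact["id"], "type": "fact"})
        edges.append((task_id, fact["id"]))

    return nodes, edges


# Illustrative data only.
nodes, edges = project_to_graph(
    "task_review_docs",
    {
        "proposals_created": [{"id": "p1"}, {"id": "p2"}],
        "facts_written": [{"id": "f1"}],
    },
)
print(len(nodes), len(edges))
```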

Read a diff in review

A practical review pass looks like:

  1. Inspect the diff. Run craik memory diff <task-id> to see the whole story.

  2. Spot-check proposals against evidence. Each proposal should cite evidence references. Anything that looks unsupported is a candidate for rejection.

  3. Look for write failures. A run that tried to write outside the approved scope is a stronger signal than the proposals that did land.

  4. Confirm scope visibility. Memory writes are scoped. Make sure the scope the run wrote into matches the scope the reviewer authorized.
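
Steps 2 through 4 of the review pass can be sketched as a small checker. This is a minimal sketch, assuming the diff has been loaded into a plain dict. The keys (`evidence`, `reason`, `scope`) and the function `review_findings` are hypothetical and not part of the craik CLI or the craik.memory_diff contract.

```python
def review_findings(diff: dict, authorized_scope: str) -> list[str]:
    """Collect items a reviewer should look at. Key names are assumptions."""
    findings = []

    # Step 2: proposals without evidence references are rejection candidates.
    for proposal in diff.get("proposals_created", []):
        if not proposal.get("evidence"):
            findings.append(f"proposal {proposal['id']}: no evidence references")

    # Step 3: every write failure is surfaced for a closer look.
    for failure in diff.get("write_failures", []):
        findings.append(f"write failure: {failure.get('reason', 'unknown')}")

    # Step 4: written facts must sit in the scope the reviewer authorized.
    for fact in diff.get("facts_written", []):
        if fact.get("scope") != authorized_scope:
            findings.append(
                f"fact {fact['id']}: written to scope {fact.get('scope')!r}, "
                f"authorized {authorized_scope!r}"
            )

    return findings


# Illustrative data only.
example_diff = {
    "proposals_created": [
        {"id": "p1", "evidence": ["ev_1"]},
        {"id": "p2", "evidence": []},
    ],
    "write_failures": [{"reason": "scope violation"}],
    "facts_written": [{"id": "f1", "scope": "team"}],
}
findings = review_findings(example_diff, authorized_scope="project")
for line in findings:
    print(line)
```

A checker like this only narrows attention. The decision to approve or reject still rests with the reviewer reading the evidence.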

What's next