Version: MVP

Writing Handoffs

5 min read · For operators & runners · Updated 2026-05-19

What you'll do

Write a handoff that lets the next agent — model or human — pick up your work without re-deriving everything. By the end you'll know the required shape, the patterns that make handoffs actually useful, and the self-audit fields you should never skip.

Before you write

Make sure the task has a current case file:

Refresh the case file if needed
craik case build task_review_docs

The handoff writer derives assumptions and context_debt from the latest case file when one is available. It also attaches the receipts persisted for the task, so anything you produced during the run is referenced automatically.

1 · The minimum viable handoff

A handoff with the basics
craik handoff create task_review_docs \
--summary "Reviewed docs against implementation." \
--agent agent:codex \
--completed-action "Compared README and docs against runtime behavior." \
--test-run pytest \
--next-step "Review memory backend assumptions."

This is a valid handoff — but it's the floor, not the ceiling.
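
If handoffs are created from a script, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown on this page. The helper name `build_handoff_command` is hypothetical, and the sketch assumes (this page does not confirm it) that `--completed-action`, `--test-run`, and `--next-step` may each be passed more than once.

```python
from typing import List, Sequence


def build_handoff_command(
    task_id: str,
    summary: str,
    agent: str,
    completed_actions: Sequence[str] = (),
    test_runs: Sequence[str] = (),
    next_steps: Sequence[str] = (),
) -> List[str]:
    """Return an argv list for `craik handoff create` (hypothetical helper)."""
    argv = ["craik", "handoff", "create", task_id,
            "--summary", summary, "--agent", agent]
    # Assumption: each of these flags can be repeated, one value per flag.
    for action in completed_actions:
        argv += ["--completed-action", action]
    for run in test_runs:
        argv += ["--test-run", run]
    for step in next_steps:
        argv += ["--next-step", step]
    return argv


command = build_handoff_command(
    "task_review_docs",
    summary="Reviewed docs against implementation.",
    agent="agent:codex",
    completed_actions=["Compared README and docs against runtime behavior."],
    test_runs=["pytest"],
    next_steps=["Review memory backend assumptions."],
)
print(command)
```

Keeping the command as a list rather than a single string means it can be handed to `subprocess.run(command, check=True)` without any shell quoting of the summary text.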

2 · Render it back

Structured JSON
craik handoff show task_review_docs

Markdown rendering for review
craik handoff show task_review_docs --markdown

The structured handoff is the durable source of truth. The Markdown is a readable rendering of the same record — useful for code review, release notes, or pasting into a Slack thread.
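
Because the structured record is the source of truth, a script can read it directly instead of scraping the Markdown. The sketch below condenses a record into a few triage lines. The JSON layout is an assumption: the field names `next_steps`, `assumptions`, and `context_debt` appear on this page, but `status` and `summary` as top-level keys, the list shapes, and the `digest` helper are hypothetical and may not match real `craik handoff show` output.

```python
import json

# Hypothetical sample record; real output may be shaped differently.
SAMPLE = """
{
  "status": "completed",
  "summary": "Reviewed docs against implementation.",
  "next_steps": ["Review memory backend assumptions."],
  "assumptions": [],
  "context_debt": []
}
"""


def digest(raw_json: str) -> str:
    """Condense a handoff record into a few lines for quick triage."""
    record = json.loads(raw_json)
    lines = [f"[{record.get('status', 'unknown')}] {record.get('summary', '')}"]
    for step in record.get("next_steps", []):
        lines.append(f"next: {step}")
    # Counts only; the full entries stay in the structured record.
    lines.append(f"open assumptions: {len(record.get('assumptions', []))}")
    lines.append(f"context debt items: {len(record.get('context_debt', []))}")
    return "\n".join(lines)


print(digest(SAMPLE))
```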

3 · Use status to be honest

When the run didn't complete, declare it:

| Status | When to use | What it tells the next agent |
| --- | --- | --- |
| completed | default | Work landed; verification passed. Self-audit fields should reflect this. |
| incomplete | partial | Useful progress, but the objective isn't fully met. next_steps should be concrete. |
| blocked | stop condition hit | External dependency, missing capability grant, or a contradiction needing review. |
| failed | work could not proceed | Runtime error, hard policy denial, or unrecoverable assumption mismatch. |

Pass the status explicitly when relevant:

A blocked handoff
craik handoff create task_review_docs \
--status blocked \
--summary "Stopped at memory.write — no grant in current envelope." \
--next-step "Operator must approve a memory.write grant or convert to a memory proposal."
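
The statuses listed above can be treated as a small decision procedure. The sketch below is one way to pick a status from plain facts about the run. The `HandoffStatus` enum, the `choose_status` helper, and the precedence order (failure outranks a stop condition, which outranks partial progress) are assumptions made for illustration, not documented runtime behavior.

```python
from enum import Enum


class HandoffStatus(str, Enum):
    """The four status values described on this page."""
    COMPLETED = "completed"
    INCOMPLETE = "incomplete"
    BLOCKED = "blocked"
    FAILED = "failed"


def choose_status(
    objective_met: bool,
    verification_passed: bool,
    stop_condition_hit: bool = False,
    could_not_proceed: bool = False,
) -> HandoffStatus:
    """Pick a status from facts about the run (hypothetical helper)."""
    # Assumption: an unrecoverable failure outranks a stop condition.
    if could_not_proceed:
        return HandoffStatus.FAILED
    if stop_condition_hit:
        return HandoffStatus.BLOCKED
    # "completed" requires both the objective and its verification.
    if objective_met and verification_passed:
        return HandoffStatus.COMPLETED
    return HandoffStatus.INCOMPLETE


status = choose_status(
    objective_met=False,
    verification_passed=False,
    stop_condition_hit=True,
)
print(status.value)  # blocked
```

The returned value can be passed straight to `--status`. Note that work that landed without passing verification falls through to `incomplete` in this sketch rather than `completed`.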

What good handoffs include

What changed

Files touched, artifacts produced, side effects performed. Concrete.

What was validated

Tests run, linters passed, policies confirmed. Use --test-run and --completed-action liberally.

Open assumptions

Carry forward the assumptions the case file flagged that the run did not resolve.

Receipt references

The handoff writer auto-attaches receipts produced under the task. Skim them before sealing.

Policy exceptions

Any fail-open paths, denied capabilities, or boundary hits. Disclosure is the rule.

Context debt

Sources omitted, deferred, or unavailable. The next run gets to inherit this.

Memory proposals

Facts the run wants to land. The reviewer's queue.

Concrete next steps

What should happen, who should do it, what to verify first. The most-read field.

Patterns that help and hurt the next agent

Do

  • Use concrete --completed-action entries: "Updated docs/architecture.md to match runtime".
  • Cite receipt ids in the summary when they prove a claim.
  • Carry forward assumptions verbatim — don't paraphrase.
  • Write --next-step in the imperative: "Review memory backend assumptions".

Avoid

  • Vague summaries: "Reviewed and fixed." Fix what? With what evidence?
  • Promoting assumptions to "fact" in the handoff without an evidence reference.
  • Hiding fail-open paths by leaving policy_exceptions empty.
  • Skipping --test-run when validation did happen — silence reads as "didn't run."
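
Some of the patterns above can be checked mechanically before a handoff is sealed. The sketch below is a hypothetical pre-flight lint, not part of craik: the function name, its parameters, and the word-count heuristic for vague summaries are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def lint_draft(
    summary: str,
    test_runs: Sequence[str],
    validation_happened: bool,
    policy_exceptions: Sequence[str],
    fail_open_paths_hit: bool,
) -> List[str]:
    """Return warnings for a draft handoff (hypothetical, heuristic checks)."""
    warnings = []
    # Heuristic assumption: very short summaries rarely say what changed.
    if len(summary.split()) < 4:
        warnings.append("summary is vague: say what changed and cite evidence")
    # Silence reads as "didn't run", so record validation that happened.
    if validation_happened and not test_runs:
        warnings.append("validation ran but no --test-run was recorded")
    # An empty policy_exceptions list must not hide a fail-open path.
    if fail_open_paths_hit and not policy_exceptions:
        warnings.append("fail-open path hit but policy_exceptions is empty")
    return warnings


draft_warnings = lint_draft(
    summary="Reviewed and fixed.",
    test_runs=[],
    validation_happened=True,
    policy_exceptions=[],
    fail_open_paths_hit=False,
)
for warning in draft_warnings:
    print(warning)
```

A lint like this only catches omissions. Whether an assumption was promoted to fact without evidence still needs a human or agent reviewer.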

The self-audit checklist

Every handoff includes a self-audit. The next agent (and the next operator) read it first. Don't lie to the audit:

  1. Schema validated. The runtime checked records against declared schemas — leave this true.

  2. Redaction reviewed. Confirm receipts and handoffs are free of unredacted secrets. If you didn't review, mark it false.

  3. Receipts reviewed. Were the produced receipts inspected and joined back to the actions they describe?

  4. Assumptions reviewed. Were open assumptions promoted, refuted, or explicitly carried forward?

  5. Validation recorded. Are the commands run and outcomes captured in commands_and_validation?

  6. Policy exceptions disclosed. Are all fail-open paths and boundary hits called out?

Incomplete handoffs are valid. Dishonest handoffs are not — they contaminate the next run's context.
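
The six checks above map naturally onto a record of booleans. The sketch below shows one possible shape and a helper that lists the checks left unconfirmed. The `SelfAudit` class, its field names, and the conservative defaults are assumptions for illustration; the actual handoff schema may name or structure these fields differently.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SelfAudit:
    """Hypothetical shape for the six self-audit checks."""
    schema_validated: bool = True  # set by the runtime, per the checklist
    # Assumption: reviewer-owned checks default to False until confirmed.
    redaction_reviewed: bool = False
    receipts_reviewed: bool = False
    assumptions_reviewed: bool = False
    validation_recorded: bool = False
    policy_exceptions_disclosed: bool = False


def unconfirmed(audit: SelfAudit) -> List[str]:
    """Return the names of checks still marked False, in checklist order."""
    return [f.name for f in fields(audit) if not getattr(audit, f.name)]


audit = SelfAudit(redaction_reviewed=True, receipts_reviewed=True)
print(unconfirmed(audit))
```

Defaulting the reviewer-owned checks to `False` means a skipped review shows up as an honest gap instead of a silent pass.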

What's next