Version: MVP

Quality scores

2 min read · Reference · Updated 2026-05-19

What you'll find here

The two derived review records that help agents decide whether a handoff or evidence set is ready to rely on — craik.handoff_quality_score and craik.evidence_coverage_score.

Not proof of truth.

Quality scores help agents decide whether to rely on a handoff or evidence set. They are not permission to skip policy review.

Handoff quality

craik.handoff_quality_score summarizes whether a handoff is useful for continuation.

Inputs:

Summary & completed actions

Validation records

Linked receipts

Evidence-bearing artifacts

Adjudications & receipts

Context debt

Unresolved risks & disagreements

Next steps

Self-audit checklist

Score bands (normalized 0.0 to 1.0):

Band        Range
excellent   0.85 to 1.0
adequate    0.60 to less than 0.85
poor        less than 0.60
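
The band boundaries can be read as a simple threshold lookup. The following is a minimal sketch in Python, not the craik implementation: the function name band_for is hypothetical, and rejecting out-of-range scores is an assumption about how a caller might want to behave.

```python
def band_for(score: float) -> str:
    """Map a normalized score (0.0 to 1.0) to a band name.

    Hypothetical helper for illustration; the thresholds come from the
    band table, everything else is an assumption.
    """
    if not 0.0 <= score <= 1.0:
        # Assumption: a score outside the normalized range is an error.
        raise ValueError(f"score must be within 0.0 to 1.0, got {score}")
    if score >= 0.85:
        return "excellent"
    if score >= 0.60:
        return "adequate"
    return "poor"
```

Both lower bounds are inclusive, so a score of exactly 0.85 is excellent and a score of exactly 0.60 is adequate.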

Poor scores name the work.

Poor handoff scores must include blocking_reasons so a resuming agent knows what to repair.
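
One way to enforce this rule is to reject a poor score that arrives without reasons. The sketch below is illustrative only: the class name HandoffQualityScore and the score field are assumptions, and only blocking_reasons and the 0.60 threshold come from this page.

```python
from dataclasses import dataclass, field

# Scores below this value fall in the "poor" band.
POOR_THRESHOLD = 0.60


@dataclass
class HandoffQualityScore:
    """Hypothetical record shape; the real craik schema may differ."""

    score: float
    blocking_reasons: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0.0 <= self.score <= 1.0:
            raise ValueError("score must be within 0.0 to 1.0")
        # A poor score without reasons gives a resuming agent nothing to repair.
        if self.score < POOR_THRESHOLD and not self.blocking_reasons:
            raise ValueError("poor handoff scores must include blocking_reasons")
```

Constructing HandoffQualityScore(score=0.4) raises ValueError, while HandoffQualityScore(score=0.4, blocking_reasons=["validation records missing"]) is accepted.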

Evidence coverage

craik.evidence_coverage_score compares evidence ids present in a handoff or output with evidence ids required by the caller. Missing ids and weak claims are preserved explicitly.
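
The comparison can be sketched as a set difference between required and present ids. This page does not state how the score itself is computed, so the ratio below is an assumption, as are the function name evidence_coverage, the result keys, and the choice to report full coverage when nothing is required.

```python
from typing import Iterable


def evidence_coverage(
    present_ids: Iterable[str],
    required_ids: Iterable[str],
    weak_claims: Iterable[str] = (),
) -> dict:
    """Compare present evidence ids against the ids the caller requires.

    Hypothetical sketch: field names and the ratio are assumptions.
    """
    required = set(required_ids)
    present = set(present_ids)
    # Keep missing ids explicit rather than folding them into the number.
    missing = sorted(required - present)
    found = len(required) - len(missing)
    # Assumption: with no required ids, coverage is reported as complete.
    coverage = found / len(required) if required else 1.0
    return {
        "coverage": coverage,
        "missing_ids": missing,
        "weak_claims": list(weak_claims),  # passed through unchanged
    }


result = evidence_coverage(
    present_ids=["ev-1", "ev-3"],
    required_ids=["ev-1", "ev-2"],
    weak_claims=["claim-7"],
)
```

Here result["missing_ids"] is ["ev-2"] and result["coverage"] is 0.5. Extra ids such as ev-3 do not raise coverage, because only the caller's required ids count.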

Coverage ≠ certainty.

A complete set of links means the expected references are present — not that the underlying claims are true.
