Quality scores
What you'll find here
The two derived review records that help agents decide whether a
handoff or evidence set is ready to rely on:
craik.handoff_quality_score and craik.evidence_coverage_score.
Not proof of truth.
Quality scores help agents decide whether to rely on a handoff or evidence set. They are not permission to skip policy review.
Handoff quality
craik.handoff_quality_score summarizes whether a handoff is useful
for continuation.
Inputs:
Summary & completed actions
Validation records
Linked receipts
Evidence-bearing artifacts
Adjudications and receipts
Context debt
Unresolved risks & disagreements
Next steps
Self-audit checklist
Score bands (normalized 0.0 – 1.0):
| Band | Range |
|---|---|
| excellent | 0.85 to 1.0 |
| adequate | 0.60 to less than 0.85 |
| poor | less than 0.60 |
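The band boundaries above can be sketched as a small lookup. This is a minimal illustration, not the craik implementation: the function name `score_band` is hypothetical, and rejecting out-of-range values with an error is an assumption about how invalid scores should be handled.

```python
def score_band(score: float) -> str:
    """Map a normalized score in [0.0, 1.0] to its band name.

    Hypothetical helper; the thresholds come from the score band table.
    """
    if not 0.0 <= score <= 1.0:
        # Assumption: scores outside the normalized range are rejected.
        raise ValueError(f"score must be within 0.0 and 1.0, got {score!r}")
    if score >= 0.85:
        return "excellent"
    if score >= 0.60:
        return "adequate"
    return "poor"
```

Note that the lower edge of each band is inclusive, so 0.85 is excellent and 0.60 is adequate.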
Poor scores name the work.
Poor handoff scores must include blocking_reasons so a
resuming agent knows what to repair.
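A consumer could enforce this rule before trusting a record. The sketch below assumes the record is a plain dict with a numeric `score` field; that field name and the function `validate_handoff_score` are hypothetical, and only `blocking_reasons` is taken from the text above.

```python
POOR_THRESHOLD = 0.60  # scores below this fall in the "poor" band


def validate_handoff_score(record: dict) -> list[str]:
    """Return a list of problems; an empty list means this check passed.

    Hypothetical validator: the "score" key is an assumed field name.
    """
    score = record.get("score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        return ["score must be a number between 0.0 and 1.0"]

    problems = []
    reasons = record.get("blocking_reasons") or []
    if score < POOR_THRESHOLD and not reasons:
        # A poor score without reasons gives a resuming agent nothing to repair.
        problems.append("poor score is missing blocking_reasons")
    return problems
```

For example, `validate_handoff_score({"score": 0.4})` reports the missing reasons, while the same score with `"blocking_reasons": ["no validation records"]` passes.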
Evidence coverage
craik.evidence_coverage_score compares evidence ids present in a
handoff or output with evidence ids required by the caller. Missing
ids and weak claims are preserved explicitly.
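The comparison can be pictured as a set difference over evidence ids. The sketch below is illustrative only: the function name, the returned keys, and the ratio formula are assumptions, since the text does not say how craik derives the numeric score. Weak claims are not modeled here.

```python
def evidence_coverage(required_ids, present_ids) -> dict:
    """Compare required evidence ids against those present.

    Hypothetical helper; returns covered ids, missing ids, and a ratio.
    """
    required = set(required_ids)
    present = set(present_ids)

    covered = sorted(required & present)
    missing = sorted(required - present)  # kept explicit, never dropped

    # Assumption: with nothing required, coverage is treated as complete.
    ratio = len(covered) / len(required) if required else 1.0
    return {"covered": covered, "missing": missing, "coverage": ratio}


result = evidence_coverage(["ev-1", "ev-2", "ev-3"], ["ev-1", "ev-3", "ev-9"])
```

In this example `result["missing"]` is `["ev-2"]`, and the extra id `ev-9` is ignored because the caller did not require it.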
Coverage ≠ certainty.
A complete set of links means the expected references are present, not that the underlying claims are true.