Version: MVP

Skill replay

2 min read · Reference · Updated 2026-05-19

What you'll find here

The three records that compose skill replay — fixture, observation, result — and the boundary that keeps replays reproducible without raw payloads.

Failures block promotion.

Failed replay results block skill promotion until reviewed.
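As a rough illustration of that gate, the sketch below treats promotion as allowed only when every failed result has been reviewed. The `ReplayResultSummary` type, its `reviewed` flag, and `can_promote` are hypothetical names introduced here; the source does not say how review is recorded or how promotion is invoked.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReplayResultSummary:
    # Hypothetical, minimal view of a replay result; field names are assumptions.
    fixture_id: str
    passed: bool
    reviewed: bool = False  # assumed marker that a human reviewed a failure


def can_promote(results: Iterable[ReplayResultSummary]) -> bool:
    """Return True only if no failed result is still awaiting review."""
    # Note: an empty result set passes this check; a real gate may also
    # require at least one replay result before promotion (assumption).
    return all(r.passed or r.reviewed for r in results)
```

For example, `can_promote([ReplayResultSummary("fx-1", passed=False)])` evaluates to `False` until that result is marked reviewed.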

Records

| Record | Captures | Fields |
|---|---|---|
| SkillReplayFixture | fixture | Fixture id · skill package id · fixture name · input references · expected outcome · expected output references · evidence ids · redaction status · redacted metadata. |
| SkillReplayObservation | observed | Current behavior for a fixture: outcome · output references · validation signal ids · telemetry id · receipt ids · redacted metadata. |
| SkillReplayResult | result | From replay_skill_fixture: pass/fail status · reason · expected and observed outcome · missing output refs · unexpected output refs · telemetry id · receipt ids · timestamp. |
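The relationship between the three records can be sketched as a comparison of expected and observed references. The classes below are trimmed, hypothetical stand-ins that carry only a subset of the listed fields; the attribute names, the `replay_skill_fixture` signature, and the reason strings are assumptions, not the actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SkillReplayFixture:
    # Trimmed stand-in: only the fields needed for the comparison.
    fixture_id: str
    expected_outcome: str
    expected_output_refs: frozenset[str]


@dataclass(frozen=True)
class SkillReplayObservation:
    outcome: str
    output_refs: frozenset[str]
    telemetry_id: str
    receipt_ids: tuple[str, ...]


@dataclass(frozen=True)
class SkillReplayResult:
    passed: bool
    reason: str
    expected_outcome: str
    observed_outcome: str
    missing_output_refs: tuple[str, ...]
    unexpected_output_refs: tuple[str, ...]
    telemetry_id: str
    receipt_ids: tuple[str, ...]
    timestamp: datetime


def replay_skill_fixture(
    fixture: SkillReplayFixture, observation: SkillReplayObservation
) -> SkillReplayResult:
    """Compare an observation to a fixture (assumed signature and logic)."""
    # Set differences give the two reference lists reported in the result.
    missing = tuple(sorted(fixture.expected_output_refs - observation.output_refs))
    unexpected = tuple(sorted(observation.output_refs - fixture.expected_output_refs))

    if observation.outcome != fixture.expected_outcome:
        reason = "outcome mismatch"
    elif missing or unexpected:
        reason = "output reference mismatch"
    else:
        reason = "matched"

    return SkillReplayResult(
        passed=(reason == "matched"),
        reason=reason,
        expected_outcome=fixture.expected_outcome,
        observed_outcome=observation.outcome,
        missing_output_refs=missing,
        unexpected_output_refs=unexpected,
        telemetry_id=observation.telemetry_id,
        receipt_ids=observation.receipt_ids,
        timestamp=datetime.now(timezone.utc),
    )
```

Everything compared here is an identifier or an outcome label, so the result can be stored and reviewed without any raw output attached.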

Fixture expectations

Redacted and evidence-backed.

Replay fixtures reference case files, worker results, telemetry, receipts, and evidence by id — never raw prompts, outputs, traces, stdout, stderr, payloads, responses, or credentials.
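One way to guard this boundary is a name-based check on a fixture's serialized form before it is stored. The sketch below is a heuristic under stated assumptions: `find_raw_fields`, the key names, and the dictionary shape are hypothetical, and a key-name check cannot prove that the values themselves are redacted.

```python
# Key names that suggest raw content, taken from the list of excluded material.
# Matching by exact key name is an assumption of this sketch.
FORBIDDEN_RAW_KEYS = frozenset({
    "prompt", "prompts",
    "output", "outputs",
    "trace", "traces",
    "stdout", "stderr",
    "payload", "payloads",
    "response", "responses",
    "credential", "credentials",
})


def find_raw_fields(record: dict) -> list[str]:
    """Return the sorted top-level keys whose names indicate raw content."""
    return sorted(key for key in record if key.lower() in FORBIDDEN_RAW_KEYS)


# A fixture that references everything by id yields no findings.
fixture = {
    "fixture_id": "fx-1",
    "input_refs": ["case-12"],
    "expected_output_refs": ["out-1"],
    "evidence_ids": ["ev-7"],
}
violations = find_raw_fields(fixture)
```

A non-empty return value would be a reason to reject the fixture or send it back for redaction.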

What's next