Version: MVP

Release Readiness Validation

5 min read · For maintainers · Updated 2026-05-26

What you'll find here

The repository-owned readiness record for Craik releases. The current pre-release gate is 0.12.9; historical sign-offs remain below for audit continuity.

v0.12.9 TUI Cutover

Gateway backend cleanup follow-up is in progress.

The current post-cutover goal adds a local Gateway session boundary for audited prompt execution, a JSONL stdio transport for TUI clients, CLI mirrors for raw prompt execution, structured model profiles, and regression tests for Claude Code marker routing and progress events. This work starts from checkpoint commit c6cd81d ("checkpoint: pre-backend-cleanup").

| Area | Status | Evidence |
| --- | --- | --- |
| Gateway session | ready after this PR | runtime/backend/session.py owns raw prompt execution for provider and Claude Code marker paths and emits normalized lifecycle, working-state, progress, receipt, output, completion, and error events. |
| Event history | ready after this PR | Each audited prompt run persists normalized Gateway events as a redacted craik.run_output artifact, giving clients replayable model, lifecycle, receipt, output, and completion evidence. |
| Textual Gateway client | ready after this PR | Textual raw prompt submission now uses a local Gateway client and updates the activity panel from normalized Gateway events, while final transcript text preserves the model output for copy/export. |
| JSONL protocol | ready after this PR | craik tui-backend --jsonl accepts session.status, prompt.submit, slash.submit, model.set, approval.decide, run.interrupt, and close messages over stdio for Textual/Rust/frontend evaluation. |
| Slash/CLI mirrors | ready after this PR | /run <prompt> and craik run prompt <prompt> share the audited Gateway path; backend-affecting slash mirrors are covered by regression tests. |
| Model profiles | ready after this PR | craik model set keeps legacy selectors while persisting provider/model profile metadata, display labels, backend preference, common provider options, and provider-specific passthrough knobs. Gateway prompt runs now pass the active profile options into provider runtime requests for OpenAI, Anthropic, Chat Completions, and Gemini payloads. |
| TUI evaluation fixtures | ready after this PR | Gateway JSONL replay fixtures and summary helpers provide a shared evaluation contract for Textual and Rust clients. The Rust ratatui replay prototype under crates/craik-tui-rs parses the same fixture and verifies lifecycle, working-state, run, task, receipt, and progress rendering. |
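
The JSONL transport can be pictured as one JSON object per line over stdio. The sketch below is illustrative only: the message type names come from the JSONL protocol entry above, but the `type` and `payload` field names and the `encode_message` / `decode_message` helpers are assumptions, not the documented wire format.

```python
import json
from typing import Optional

# Message types listed for `craik tui-backend --jsonl` in this record.
KNOWN_TYPES = {
    "session.status", "prompt.submit", "slash.submit", "model.set",
    "approval.decide", "run.interrupt", "close",
}


def encode_message(msg_type: str, payload: Optional[dict] = None) -> str:
    """Serialize one message as a single JSONL line (hypothetical envelope)."""
    if msg_type not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    # json.dumps escapes embedded newlines, so the result stays on one line.
    return json.dumps({"type": msg_type, "payload": payload or {}}) + "\n"


def decode_message(line: str) -> dict:
    """Parse one JSONL line and reject types outside the known set."""
    message = json.loads(line)
    if message.get("type") not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {message.get('type')!r}")
    return message
```

Rejecting unknown types on both sides keeps a Textual or Rust client from silently ignoring a message the backend does not understand.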

Gateway Cleanup Validation Commands

uv run pytest tests/test_backend_gateway_session.py tests/test_backend_jsonl.py tests/test_slash_cli_mirrors.py
uv run pytest tests/test_gateway_replay.py
cargo test --manifest-path crates/craik-tui-rs/Cargo.toml
uv run pytest tests/test_v010_agent_shell.py tests/test_v011_tui.py tests/test_v0122_slash_inline_execution.py tests/test_v0122_textual_app.py tests/test_v0123_multiline_input_methods.py tests/test_v0125_tui_polish.py tests/test_v0127_anthropic_claude_cli.py tests/test_provider_runner.py tests/test_provider_runtime.py tests/test_cli.py
uv run python scripts/generate_cli_reference.py --check

v0.12.9 completes the runtime consumption path for the CLI/TUI command contract.

0.12.9 wires the three TUI entry points to the contract dispatcher, routes Textual slash output through the new renderer pipeline, replaces the legacy dispatcher implementation with a compatibility shim, replaces the hand-maintained slash metadata tuple with the live AutoSlashRegistry, routes prompt-backed commands through canonical modals, and adds runtime-consumption CI guards so future contract infrastructure cannot ship without live TUI use.

v0.12.9 Acceptance Status

| Area | Status | Evidence |
| --- | --- | --- |
| Runtime dispatch | ready | textual_app.py, tui.py, and agent_shell.py dispatch slash commands through craik.runtime.contract.dispatch. |
| Registry consumption | ready | The TUI runtime consumes a cached AutoSlashRegistry from the live Typer app plus shell-only slash built-ins. |
| Renderer consumption | ready | Textual slash output and inline actions render through format_command_result(..., kind="tui"). |
| Canonical modals | ready | /auth login, /auth logout, and /receipts detail use canonical-composed modal screens under runtime/shell/modals/; the legacy textual_modals.py file is removed. |
| Prompt runtime | ready | interactive_prompts metadata now drives runtime behavior by intercepting typer.confirm and typer.prompt during contract-dispatch callback invocation. |
| Regression guards | ready | check_contract_dispatch_consumed.py, check_no_legacy_modal_pushes.py, check_no_slash_command_specs_consumption.py, and check_interactive_prompts_runtime_consumed.py verify runtime imports, block legacy cutover APIs, and assert contract-dispatch invocation across TUI entry points. |
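
One way to picture the prompt-runtime interception: swap the prompt functions for modal-backed handlers while a callback runs, then restore them. This is a sketch under stated assumptions, not the Craik implementation. `fake_typer` is a stand-in namespace so the example runs without the real `typer` package, and `intercept_prompts`, `logout_callback`, and `run_with_modal_answers` are hypothetical names.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from unittest import mock


def _no_tty(*args, **kwargs):
    raise RuntimeError("no terminal available")


# Stand-in for the `typer` module (assumption: only confirm/prompt matter here).
fake_typer = SimpleNamespace(confirm=_no_tty, prompt=_no_tty)


@contextmanager
def intercept_prompts(module, confirm_handler, prompt_handler):
    """Route confirm/prompt calls to handlers for the duration of a callback."""
    with mock.patch.object(module, "confirm", confirm_handler), \
         mock.patch.object(module, "prompt", prompt_handler):
        yield  # originals are restored when the block exits


def logout_callback():
    """Hypothetical command callback that asks before acting."""
    if fake_typer.confirm("Remove cached credentials?"):
        return "logged out"
    return "cancelled"


def run_with_modal_answers():
    # A TUI would open a modal here; the sketch answers "yes" directly.
    with intercept_prompts(fake_typer, lambda text, **kw: True, lambda text, **kw: ""):
        return logout_callback()
```

Because the patch is scoped to the context manager, a callback invoked outside contract dispatch still reaches the original prompt functions.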

v0.12.9 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/check_contract_dispatch_consumed.py
uv run python scripts/check_no_legacy_modal_pushes.py
uv run python scripts/check_no_slash_command_specs_consumption.py
uv run python scripts/check_interactive_prompts_runtime_consumed.py
uv run python scripts/check_modal_screen_mappings.py
uv run python scripts/check_modal_screen_security.py
uv run python scripts/check_payload_shape_validity.py
uv run python scripts/check_next_actions_validity.py
uv run python scripts/check_format_flag_coverage.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.8 CLI/TUI Contract

v0.12.8 makes the shared command contract release-ready across CLI, slash, JSON, and TUI surfaces.

0.12.8 keeps migrated commands on the CommandResult contract, prevents strictly migrated shared callbacks from writing directly to stdout during slash dispatch, adds canonical modal metadata for prompt-backed commands, expands slash snapshots across the standard width matrix, and adds CI gates so future migrations stay aligned with the canonical renderer flow.

v0.12.8 Acceptance Status

| Area | Status | Evidence |
| --- | --- | --- |
| Command contract | ready | Migrated CLI/TUI callbacks declare @craik_command, return CommandResult, and use shared renderers. |
| Interactive prompts | ready | Current prompt-backed commands declare interactive_prompts metadata and modal mapping guards reject drift. |
| Snapshot coverage | ready | Table, card, and card-list slash command snapshots cover widths 60, 80, 100, 120, 160, and 200. |
| Structural guards | ready | Payload shape, NextAction target, format coverage, direct stdout, modal, return annotation, CLI/TUI metadata, and snapshot guards run in CI. |
| Legacy command posture | ready | Commands that cannot safely return CommandResult are documented with explicit legacy markers. |
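
The width-matrix requirement lends itself to a mechanical check. A minimal sketch, assuming snapshots are identified by hypothetical `(name, width)` pairs; the real guard is scripts/check_snapshot_coverage.py, whose internals are not shown in this record.

```python
STANDARD_WIDTHS = (60, 80, 100, 120, 160, 200)  # widths listed above


def missing_snapshots(names, existing):
    """Return the (name, width) pairs that have no snapshot yet."""
    return [
        (name, width)
        for name in names
        for width in STANDARD_WIDTHS
        if (name, width) not in existing
    ]
```

A guard built this way fails with the exact missing pairs, which tells a maintainer which snapshot to regenerate.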

v0.12.8 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/check_modal_screen_mappings.py
uv run python scripts/check_modal_screen_security.py
uv run python scripts/check_payload_shape_validity.py
uv run python scripts/check_next_actions_validity.py
uv run python scripts/check_format_flag_coverage.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.7 Provider OAuth Suite

v0.12.7 completes the provider-specific OAuth adoption path that is usable without private client registrations.

0.12.7 adds OAuth auth-profile contracts, one-shot loopback PKCE helpers, callback-safety CI coverage, OpenAI browser PKCE OAuth, Anthropic Claude CLI delegation, Gemini/Vertex ADC and service-account login through google-auth, provider-specific header handling, OAuth status metadata, billing-surface status metadata, and explicit craik auth login <provider> --mode=api-key|oauth|claude-cli selection.
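
For readers unfamiliar with PKCE, the verifier and challenge pair defined in RFC 7636 can be derived as follows. This is a generic sketch of the S256 method, not Craik's helper code: the function names are hypothetical, and this record does not state which challenge method the Craik helpers use.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Return a random URL-safe verifier (43 characters from 32 random bytes)."""
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """Derive the S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the verifier with the token request, so a party that intercepts only the loopback redirect cannot redeem the code.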

v0.12.7 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.6 Inherited Surface Sweep

v0.12.6 closes the inherited-surface sweep for channel ingress, slash-command payload shape, and TUI text selection.

0.12.6 sanitizes normalized messaging-channel text before runtime boundaries, extends the slash-command registry guard to execute structured payload smoke commands, and keeps TUI text-selection behavior documented, styled, and covered by a release-readiness guard.
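
As an illustration of ingress sanitization, the sketch below normalizes text and drops control and format characters before it crosses a runtime boundary. The exact rules Craik applies are not described in this record; `sanitize_channel_text`, the character categories, and the 4000-character cap are assumptions chosen for the example.

```python
import unicodedata


def sanitize_channel_text(text: str, max_length: int = 4000) -> str:
    """Normalize text, drop control/format characters, and cap the length."""
    normalized = unicodedata.normalize("NFC", text)
    kept = [
        ch for ch in normalized
        # Keep newlines and tabs; drop other control (Cc) and format (Cf) characters.
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cc", "Cf")
    ]
    return "".join(kept)[:max_length]
```

Dropping the Cf category removes bidirectional overrides but also zero-width joiners, so a production rule set would likely need to be more selective.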

v0.12.6 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.5 TUI Polish and Release Guards

v0.12.5 completes the terminal UI polish, receipt verification, and release-guard hardening pass for the v0.12 train.

0.12.5 ships bottom-stack TUI ordering, text-selection support, inline subcommand listings, TUI-shaped next-action guidance, standalone receipt verification, coverage publishing, positioning documentation, AST-bound dock-bottom snapshot coverage, all-theme TCSS dock scanning, and an 80% coverage floor for release coverage publication.

v0.12.5 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.4 Command Surface Ergonomics

v0.12.4 completes the slash-command ergonomics and TUI hardening pass for the v0.12 train.

0.12.4 ships centralized slash-command schema metadata, structured inline result rendering, argument-aware help and validation, current-session transcript search, receipt detail modals with integrity status, bounded toast notifications, destructive-action confirmations, inline action-key dispatch, and release guards for command metadata and TUI brand hygiene.

v0.12.4 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run ruff check
uv run mypy src/craik
uv run pytest
(cd docs && npm run build)

v0.12.3 Interactive Shell Refinement

v0.12.3 completes the interactive shell refinement pass for the v0.12 train.

0.12.3 ships terminal status-bar usage and quota indicators, reverse history search, multi-line input alternatives, audited ! shell invocations, session naming, theme controls, MCP discovery, CLI mistype recovery, and TUI aesthetic polish.

v0.12.3 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
python3 scripts/check_codebase_brand_hygiene.py
uv run ruff check
uv run mypy
uv run pytest
(cd docs && npm run build)

v0.12.2 Canonical TUI Release

v0.12.2 is the canonical interactive-runtime release for the v0.12 train.

0.12.2 ships the Textual-based TUI, slash-command inline execution, history and completion support, modal auth and approval flows, privacy documentation, and the codebase brand-hygiene release guard.

v0.12.2 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
python3 scripts/check_codebase_brand_hygiene.py
uv run ruff check
uv run mypy
uv run pytest
(cd docs && npm run build)

v0.12.1 Patch Release

v0.12.1 is a patch-readiness release for the v0.12 train.

0.12.1 incorporates the post-release remediation PRs for README current-state accuracy, provider auth and health-check UX, gateway and doctor readiness, MCP validation, migration apply reporting, localization polish, and CodeQL cleanup.

v0.12.1 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run ruff check
uv run mypy src
uv run pytest -q

v0.12.0 Goal Workflow

Structural CI guard generalization starts the v0.12.0 release train.

0.12.0 begins with G0 before functional migration, internationalization, and compatibility work. G0 strengthens the release-readiness writer-coverage guard with qualified call resolution, explicit dynamic-dispatch allowlisting, and a complementary dead-code scan.

| Area | Status | Goal issue |
| --- | --- | --- |
| G0 structural CI guard generalization | ready after this PR | #819 · release-readiness writer coverage resolves calls through qualified function paths and import maps, registry-dispatched callables use a capped documented allowlist, and scripts/check_dead_code.py runs vulture at confidence 80 as a complementary dead-code gate |
| G1 adjacent runtime migration CLI | ready after this PR | #827 · craik migrate inspect, plan, and dry-run import inspect adjacent agent-runtime JSON exports, map source records to proposed Craik target schemas, preserve source files, support text/JSON output, and report skipped secret-like fields without copying values |
| G2 migration maps | ready after this PR | #828 · object-level migration maps classify agents, profiles/personas, provider/model config, aliases, fallback chains, channels, skills, memory, sessions, schedules, sandbox, gateway, and approval/security posture as importable, partial, manual, unsupported, or skipped-secret with explicit target Craik objects and operator actions |
| G3 migration reports | ready after this PR | #826 · craik migrate report emits deterministic safe-to-share review artifacts with summary counts, importable objects, manual actions, skipped secrets, security posture changes, unsupported capabilities, recommended next commands, validation checklist items, and source-to-target links without raw secret values |
| G4 secret migration | ready after this PR | #825 · secret inventory detects nested secret-like fields without logging values, dry-run receipts write nothing by default, optional keyring import requires operator confirmation and a secure backend, file fallback blocks import, and SECURITY.md documents the trust boundary |
| G5 compatibility fixture suite | ready after this PR | #829 · public-safe adjacent-runtime fixtures exercise provider config, fallback chains, profiles/personas, channels, sessions, memory, skills, schedules, sandbox, gateway, approval posture, invalid JSON warnings, and redaction assertions without real operator secrets |
| G6 MCP server and client compatibility | ready after this PR | #831 · MCP server compatibility mode exposes safe read tools first, gated write tools require policy and receipts, JSON-RPC smoke handling returns structured denials, and MCP client import/export redacts external config and secret-like environment values |
| G7 session export/import compatibility | ready after this PR | #830 · portable session exports preserve source identity, redact session events, import adjacent transcript shapes as stopped imported sessions, and report unsupported tool calls as evidence instead of executable authority |
| G8 agent/client protocol bridge | ready after this PR | #832 · bridge decisions block missing operator auth, policy envelopes, capability grants, receipts, redaction, instruction elevation, and unbounded tools; the first local bridge adapter emits redacted receipts for allowed calls and no receipts for denials |
| G9 i18n and localized operator surfaces | ready after this PR | #833 · stable message ids back localized operator text, CRAIK_LOCALE and --locale configure text output, missing translations fall back predictably, slash help and migration report headings localize, and translation contribution rules preserve machine-readable semantics |
| G10 capture-and-cache auth UX | ready after this PR | #834 · craik auth login captures provider keys through hidden prompt, writes keyring-ref profiles, reports backend-aware health, removes cached credentials on logout, migrates env-var profiles with consent, and shares auth status across slash commands, TUI, dashboard, and readiness state |
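
The G4 idea of inventorying nested secret-like fields without touching their values can be sketched as a recursive walk that records only key paths. The hint list and the `find_secret_paths` helper are assumptions for illustration; the detection rules Craik actually uses are not given in this record.

```python
SECRET_HINTS = ("secret", "token", "password", "api_key", "apikey")  # assumed hints


def find_secret_paths(data, prefix=""):
    """Return dotted paths of secret-like keys without reading their values."""
    paths = []
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if any(hint in str(key).lower() for hint in SECRET_HINTS):
                paths.append(path)  # record the location only, never the value
            else:
                paths.extend(find_secret_paths(value, path))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            paths.extend(find_secret_paths(item, f"{prefix}[{index}]"))
    return paths
```

Since the function never formats or returns a value under a matching key, its output can go into a report or log without a separate redaction pass.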

v0.12.0 Validation Commands

Run the structural gate from a clean checkout before starting functional v0.12.0 implementation goals:

uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run pytest tests/test_release_readiness_guards.py tests/test_dead_code_check.py tests/test_auth_capture_and_cache.py -v

v0.11.0 Goal Workflow

Product surfaces and channel operations are ready for release prep.

0.11.0 moves Craik beyond command-only operation with a terminal UI, authenticated dashboard, desktop companion, service-managed gateway, real channel adapters, approval UX, diagnostics/update workflows, and multimodal companion contracts. Each goal lands through its own issue and PR before release prep; all v0.11.0 implementation goal issues are closed.

| Area | Status | Goal issue |
| --- | --- | --- |
| Release signing-key asset rules | ready | #787 · signed annotated tag rules, GitHub Release public signing-key asset, and fingerprint verification gate |
| Release signing-key and URL-scheme remediation | ready | #812 · local signing-key exports are ignored, public signing keys remain GitHub Release assets, and future craik:// handlers are documented as review-only with no direct mutating actions |
| Terminal UI | ready | #788 · craik --tui, craik tui, shared slash commands, multiline composer, status/model/session/approval/artifact/gateway/skill panels, autocomplete metadata, redacted approval modal fixture, tests, and guide docs |
| Authenticated local dashboard | ready | #789 · craik dashboard, local-only default bind, token or active operator-session auth, status/provider/session/run/handoff/receipt/approval/gateway/skill/model pages, shared slash-command action route, route tests, and security docs |
| Dashboard action remediation | ready | #808 · dashboard action POSTs enforce local Origin checks for browser requests and reject mutating slash-command families from the generic read-only action endpoint |
| Dashboard session binding remediation | ready | #809 · tokenless dashboard mode requires X-Craik-Operator-Session to match the active operator session token, and preview output warns about the required header |
| Follow-up dashboard, service, and Discord remediation | ready after this PR | #821 · dashboard session binding uses a random per-session token instead of JWT jti, systemd gateway units tolerate executable paths with spaces, and Discord webhook diagnostics distinguish verifier unavailability from invalid signatures |
| Desktop companion MVP | ready | #790 · craik desktop status, menu, action, approval notification, and update-check surfaces for local dashboard launch, gateway command actions, provider/auth health, doctor, and redacted notification deep links |
| Gateway service lifecycle | ready | #791 · craik gateway install, uninstall, status, logs, doctor, stop, and restart with launchd/systemd generation, Windows plan, stale pid recovery, and log discovery |
| Gateway service executable remediation | ready | #810 · generated launchd, systemd, and Windows service-plan output uses the absolute craik executable path resolved at install time |
| Real channel adapters | ready | #792 · WebChat, Telegram, Discord, and Slack adapter contracts, setup and doctor commands, secret-reference plans, inbound normalization, pairing/allowlist policy gates, redacted outbound delivery receipts, and channel security docs |
| Channel persistence remediation | ready | #807 · channel setup writes adapter contracts, pairings, allowlists, and policy envelopes through production CLI paths; webhook ingress receipts use shared persistence; readiness checks writer reachability through wrappers |
| Channel webhook signature remediation | ready | #811 · webhook ingress verifies WebChat/Craik HMAC, Slack signatures and replay timestamps, Telegram secret-token headers, and fail-closed Discord native signature support warnings |
| Approval UX | ready | #793 · /approvals, craik approvals list, show, approve, and deny, dashboard queue payloads, TUI approval modal/counts, desktop notifications, decision receipts, retry-path linkage, and lifecycle tests |
| Doctor, fix, and update | ready | #794 · craik doctor, craik doctor --fix, craik update --check, expanded operator/provider/model/gateway/channel/security diagnostics, explicit dry-run fix plans, unsafe fix confirmation, JSON output, and fixture tests |
| Multimodal and companion contracts | ready | #795 · voice posture, speech-to-text and text-to-speech adapter contracts, multimodal artifact references, mobile/desktop/visual companion decisions, work-graph visual bridge, accessibility requirements, transcript/media metadata redaction, and fixture tests |
| Release prep | ready after this PR | #823 · version declarations, changelog promotion, roadmap/readiness documentation, package-lock metadata, and release validation gate before maintainer signing |
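
The Slack signature and replay-timestamp check named under #811 follows Slack's published v0 signing scheme: HMAC-SHA256 over `v0:<timestamp>:<body>`, compared against the `X-Slack-Signature` header, with stale timestamps rejected. The sketch below shows that scheme in isolation; `verify_slack_signature` and its parameters are hypothetical names and this is not Craik's ingress code.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 60 * 5  # replay window suggested in Slack's documentation


def verify_slack_signature(secret, timestamp, body, signature, now=None):
    """Return True only for a fresh request with a matching v0 signature."""
    now = time.time() if now is None else now
    if not isinstance(signature, str):
        return False  # fail closed when the header is missing
    try:
        sent_at = int(timestamp)
    except (TypeError, ValueError):
        return False  # fail closed on malformed timestamps
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False  # too old or too far ahead: possible replay
    base = b"v0:" + str(timestamp).encode() + b":" + body
    expected = "v0=" + hmac.new(secret.encode(), base, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The raw request body must be used for `body`; re-serializing parsed JSON can change bytes and break the comparison.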

v0.11.0 Release Readiness

Status: ready for maintainer signing after release-prep PR lands.

The v0.11.0 milestone contains the TUI, authenticated dashboard, desktop companion, gateway service lifecycle, real channel adapters, approval UX, doctor/update workflow, multimodal companion contracts, and all follow-up remediation. Tagging remains gated on the final release-prep validation and maintainer-managed GPG signing.

| Gate | Status | Evidence |
| --- | --- | --- |
| Implementation | ready | Operator-facing TUI, dashboard, desktop, gateway, channel, approval, diagnostic, update, and multimodal surfaces landed through issue-linked PRs and follow-up remediation. |
| Security | ready | Dashboard session binding uses a random per-session token, dashboard actions reject mutating slash commands and enforce local Origin checks, gateway service units use absolute executable paths, channel signatures verify platform-specific boundaries, and release key asset rules are documented. |
| Structural guards | ready | Release readiness resolves writer reachability through qualified import-aware call graphs and the complementary dead-code check runs vulture at confidence 80. |
| Known blockers | none known | No open v0.11.0 milestone issue remains. Tagging is gated on the release-prep PR, signed annotated tag creation, GitHub Release publication, and signing-key asset verification. |

v0.11.0 Validation Commands

Run the full release gate from a clean checkout before tagging 0.11.0:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_max_file_lines.py
uv run python scripts/quickstart_smoke.py
uv run pytest

v0.10.0 Goal Workflow

Agent shell and setup UX gate is ready for release prep.

0.10.0 turns the root CLI into a usable agent shell, adds progressive setup guidance before auth is configured, introduces browser-assisted provider login, and exposes model, session, profile, usage, and learning-loop controls through operator-facing commands. The implementation goal landed through a green PR tied to the v0.10.0 milestone before release prep begins.

| Area | Status | Goal issue |
| --- | --- | --- |
| Agent shell and setup UX | ready | #779 · root shell launch, craik chat, slash commands, readiness states, provider login, model/session/profile controls, and learning-loop command surfaces |
| Release prep | ready after this PR | #781 · version declarations, changelog, roadmap, release-readiness docs, generated references, and validation gate |

v0.10.0 Release Readiness

Status: ready for release checks after release-prep PR lands.

The v0.10.0 milestone contains the agent shell, progressive setup states, browser-assisted provider login, secure credential-storage posture reporting, model/session/profile UX, usage summaries, and learning-loop controls. Release prep owns the final version bump, changelog promotion, signed tag, package publication, docs publication, and post-release verification.

| Gate | Status | Evidence |
| --- | --- | --- |
| Implementation | ready | The root craik shell can launch before auth, one-shot chat works through craik chat and craik --one-shot, readiness states explain setup gaps, and slash commands route users toward setup, auth, provider, model, session, approval, and doctor actions. |
| Auth and credentials | ready | craik auth login provides browser-assisted provider setup for hosted providers and guided fallback for local models while keeping output redacted and surfacing credential-storage posture. |
| Runtime UX | ready | Model aliases, fallbacks, status, probes, session list/show/resume/rename/export/prune/delete, local profiles, usage summaries, and insight summaries are exposed through command surfaces and generated CLI reference docs. |
| Learning controls | ready | Skill telemetry, proposals, eval, promote, rollback, and history commands preserve the policy boundary: agents can propose improvement paths, but promotion remains an operator-governed action. |
| Tests | ready | Focused v0.10 shell, readiness, slash-command, provider-login, model/session/profile, learning-loop, release-readiness guard, and runtime-layout tests landed with the implementation PR. Full local validation passed with loopback access enabled for HTTP-server tests. |
| Docs | ready | Agent shell, model/session/profile UX, readiness states, slash commands, authentication, quickstart, learning-loop, roadmap, generated CLI reference, and release-readiness docs are updated. |
| Known blockers | none known | No unresolved v0.10.0 implementation blocker remains. Tagging is gated on release-prep validation and maintainer signing. |

v0.10.0 Validation Commands

Run the full release gate from a clean checkout before tagging 0.10.0:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_max_file_lines.py
uv run python scripts/quickstart_smoke.py
uv run pytest

Release Signing-Key Asset Gate

Every release must be cut from a signed annotated tag and must publish the ASCII-armored public release signing key on the matching GitHub Release as craik-release-signing-key.asc. This asset is the public key operators can import or inspect; it is not a detached tag signature. The embedded Git tag signature remains the integrity check for the release commit.

git tag -v vX.Y.Z
git ls-remote --tags origin vX.Y.Z
gpg --show-keys --with-fingerprint craik-release-signing-key.asc
gh release view vX.Y.Z --repo eidetic-labs/craik --json assets --jq '.assets[].name'

Release prep is not complete until the fingerprint shown for craik-release-signing-key.asc matches the key reported by git tag -v vX.Y.Z.
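
The fingerprint comparison is easy to get wrong by eye because tools print fingerprints with different spacing and case. A minimal sketch of the comparison step, assuming both fingerprints have already been copied out of the command output above and are 40-hex-digit OpenPGP v4 fingerprints; the helper names are hypothetical and parsing the tool output itself is out of scope here.

```python
import string


def normalize_fingerprint(raw: str) -> str:
    """Drop whitespace and uppercase so spaced and compact forms compare equal."""
    compact = "".join(raw.split()).upper()
    if len(compact) != 40 or any(ch not in string.hexdigits for ch in compact):
        raise ValueError("expected a 40-hex-digit OpenPGP v4 fingerprint")
    return compact


def fingerprints_match(asset_fingerprint: str, tag_fingerprint: str) -> bool:
    """True when the release asset key and the tag signing key are the same."""
    return normalize_fingerprint(asset_fingerprint) == normalize_fingerprint(tag_fingerprint)
```

Rejecting anything that is not a full fingerprint keeps a short key ID from being accepted as a match.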

v0.9.0 Goal Workflow

Persistent agent runtime gate is ready for release prep.

0.9.0 adds provider-backed persistent sessions, guided provider setup, Gemini and local model routes, provider certification, explicit failure recovery, and a deterministic persistent-agent launch demo. All implementation goals landed through green PRs tied to the v0.9.0 milestone before release prep begins.

| Area | Status | Goal issue |
| --- | --- | --- |
| Provider setup | ready | #737 · guided setup for OpenAI, Anthropic, Gemini, and local models |
| Gemini runtime | ready | #738 · Gemini generateContent adapter, docs, and runner matrix coverage |
| Local model presets | ready | #740 · Ollama, LM Studio, vLLM, and generic OpenAI-compatible presets |
| Persistent prompt loop | ready | #741 · persistent session prompt execution with events, receipts, and handoff links |
| Provider certification matrix | ready | #742 · generated matrix for hosted, local, and fixture provider routes |
| Failure recovery | ready | #743 · stale pid/endpoint, auth, provider, sandbox, reconnect, and resume states |
| Persistent launch demo | ready | #744 · deterministic launch, prompt, receipts, handoff, and status demo |
| Security and sandbox boundaries | ready | #745 · persistent-agent state boundaries, redacted inspection surfaces, environment receipt links, and denied side-effect receipts |
| Readiness docs reconciliation | ready after this PR | #756 · roadmap and release-readiness docs reflect the closed v0.9.0 goal workflow |
| CLI auth coverage remediation | ready after this PR | #759 · release-readiness auth guard now scans every cli_*.py module and stateful CLI commands require the canonical operator session check, with explicit bootstrap/demo/CI policy-test exemptions |

v0.9.0 Release Readiness

Status: ready for pre-release checks after docs reconciliation lands.

The v0.9.0 milestone now contains provider setup, Gemini runtime support, local model presets, provider-backed persistent sessions, provider certification, explicit failure recovery, a launch demo, persistent-agent security boundaries, MCP routing decisions, sandbox backend tracking, browser/tool boundary tracking, sandbox policy validation, and environment capability receipts. Release prep remains responsible for the version bump, final changelog section, signed tag, publish, and post-publish verification.

Gate
Status
Evidence
Implementation
ready
Persistent sessions can launch, prompt providers, persist events, carry receipt and handoff links, recover from stale process, auth, provider, and sandbox states, and route through OpenAI, Anthropic, Gemini, local, and fixture providers.
Tests
ready
Focused coverage exists for provider setup, Gemini transport, local presets, persistent prompt execution, provider certification, recovery, launch demo behavior, environment receipt linkage, and denied side-effect receipts.
Docs
ready
Persistent-agent runtime, authentication, local model setup, provider routing, provider certification, persistent-agent security, execution-environment security, and environment receipt docs are updated.
Security
ready
Session inspection uses redacted views, persistent agents retain references instead of secrets, environment receipts link provider and sandbox boundaries to sessions, and missing side-effect grants produce denial receipts.
Milestone provenance
ready
Backfill issues #765 through #773 link MCP, sandbox, browser boundary, sandbox policy, and environment receipt tiles to implementation evidence.
Known blockers
none known
No unresolved v0.9.0 implementation blocker remains. Tagging is gated on release-prep validation.

v0.9.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.9.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_changed_file_strictness.py
uv run pytest tests/test_provider_cli.py tests/test_provider_runtime.py tests/test_local_model_presets.py tests/test_agent_sessions.py tests/test_agent_session_integrity.py tests/test_provider_certification.py tests/test_cli_agents.py tests/test_cli_operator_auth.py tests/test_release_readiness_guards.py tests/test_demos.py tests/test_environment_receipts.py tests/test_sandbox_policy_boundaries.py -q
uv run pytest

v0.8.0 Goal Workflow

Operator integrations and always-on gateway gate.

0.8.0 ships a foreground gateway daemon, setup/diagnostics/update operator commands, channel contracts, messaging fixture ingress, identity pairing, allowlists, channel policy envelopes, webhook validation, scheduled automations, and gateway receipts. Each remediation issue shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before release prep.

Area
Status
Goal issue
Runnable gateway daemon and runtime state
ready
#713 · craik gateway start, pid-file lock, /health, and persisted lifecycle transitions
Gateway/channel contract persistence
ready
#723 · typed local-store helpers for adapter contracts, pairings, allowlists, schedules, automations, policies, and gateway receipts
Webhook hardening
ready
#715 · body cap, JSON depth cap, timestamp freshness, normalized signature headers, and persistent replay detection
Operator auth for diagnostics/reconfiguration
ready
#719 · active operator session required before reading local diagnostics or reconfiguring an existing setup
Schedule throttle
ready
#718 · cron-like schedules reject runs more frequent than every five minutes
Pairing token expiry
ready
#721 · channel identity pairings require and enforce expiry
Public-bind TLS warning and override
ready
#722 · public gateway binds require policy plus explicit insecure-public acknowledgement
Release-readiness structural guards
ready
#720 · writer and operator-auth guard coverage widened for v0.8.0 surfaces
End-to-end gateway pipeline test
ready
#717 · integrated daemon, webhook, channel policy, receipt, schedule, and store coverage
Readiness, security, and changelog reconciliation
ready after this PR
#714 · docs now reflect shipped behavior and residual limitations
Release prep
pending
#716 · version bump, final changelog section, tag, publish, and verification remain the final gate
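
The webhook hardening row (#715) names five checks: a body cap, a JSON depth cap, timestamp freshness, normalized signatures, and replay detection. The sketch below shows one way those checks can be ordered. It is illustrative only: the limits, the HMAC-SHA256 signing scheme, the reason strings, and the in-memory `seen` set (standing in for persistent replay storage) are all assumptions, not the shipped implementation.

```python
import hashlib
import hmac
import json
import time

MAX_BODY_BYTES = 64 * 1024  # hypothetical body cap
MAX_JSON_DEPTH = 16         # hypothetical nesting cap
MAX_SKEW_SECONDS = 300      # hypothetical freshness window


def json_depth(value):
    """Nesting depth of a decoded JSON value; scalars are depth 0 (iterative)."""
    deepest = 0
    stack = [(value, 0)]
    while stack:
        item, depth = stack.pop()
        if isinstance(item, dict):
            children = list(item.values())
        elif isinstance(item, list):
            children = item
        else:
            continue
        deepest = max(deepest, depth + 1)
        stack.extend((child, depth + 1) for child in children)
    return deepest


def validate_webhook(body, timestamp, signature, secret, seen, now=None):
    """Return a rejection reason, or None when the delivery is accepted."""
    now = time.time() if now is None else now
    if len(body) > MAX_BODY_BYTES:
        return "body_too_large"
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return "stale_timestamp"
    expected = hmac.new(secret, f"{timestamp}.".encode() + body, hashlib.sha256).hexdigest()
    normalized = signature.strip().lower()  # normalize header casing and padding
    if not hmac.compare_digest(expected.encode(), normalized.encode()):
        return "bad_signature"
    if expected in seen:
        return "replayed"
    try:
        payload = json.loads(body)
    except (ValueError, RecursionError):
        return "invalid_json"
    if json_depth(payload) > MAX_JSON_DEPTH:
        return "json_too_deep"
    seen.add(expected)  # record only deliveries that passed every check
    return None
```

Cheap checks (size, freshness) run before the signature comparison, and the body is parsed only after the signature verifies, so unauthenticated input never reaches the JSON decoder.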

v0.8.0 Release Readiness

Status: ready for release prep after docs reconciliation lands.

All implementation remediation goals are closed. The shipped daemon is a foreground local service with health checks and persisted lifecycle state; hosted public operation, production dispatch loops, and broad third-party channel adapters remain future surfaces.

Gate
Status
Evidence
Implementation
ready
Gateway setup, diagnostics, foreground daemon, webhook validation, messaging fixture ingress, identity pairing, allowlists, policy envelopes, schedules, automations, and gateway receipts are implemented and persisted.
Tests
ready
Focused unit coverage exists for each surface plus tests/test_v0_8_0_gateway_pipeline_e2e.py for the integrated flow.
Security
ready
Webhook payload limits, timestamp/replay checks, operator session gates, pairing expiry, schedule throttling, and public-bind policy/TLS acknowledgement are covered.
Known blockers
none known
No unresolved v0.8.0 implementation blocker remains. Tagging is gated on #716 release-prep validation.
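
The schedule throttle listed under Security rejects cron-like schedules that would fire more often than every five minutes. A simplified sketch of that rule follows. It inspects only the minute field and supports `*`, `*/n`, and comma-separated minutes; the function names and this reduced grammar are assumptions for illustration, not the shipped parser.

```python
MIN_INTERVAL_MINUTES = 5  # threshold stated in this readiness record


def expand_minute_field(field: str) -> list[int]:
    """Expand a cron minute field into the sorted minutes at which it fires."""
    minutes: set[int] = set()
    for part in field.split(","):
        if part == "*":
            minutes.update(range(60))
        elif part.startswith("*/"):
            step = int(part[2:])
            if step < 1:
                raise ValueError(f"invalid step: {part!r}")
            minutes.update(range(0, 60, step))
        else:
            minute = int(part)
            if not 0 <= minute <= 59:
                raise ValueError(f"minute out of range: {part!r}")
            minutes.add(minute)
    return sorted(minutes)


def smallest_gap(minutes: list[int]) -> int:
    """Smallest gap between firings, including the wrap into the next hour."""
    wrapped = minutes + [minutes[0] + 60]
    return min(later - earlier for earlier, later in zip(wrapped, wrapped[1:]))


def is_schedule_allowed(minute_field: str) -> bool:
    """Allow a schedule only if it never fires twice within the minimum interval."""
    return smallest_gap(expand_minute_field(minute_field)) >= MIN_INTERVAL_MINUTES
```

Including the wrap-around gap matters: `*/7` fires at minute 56 and again at minute 0, four minutes apart, so this sketch rejects it even though the step itself exceeds five.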

v0.8.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.8.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_changed_file_strictness.py
uv run pytest tests/test_v0_8_0_gateway_pipeline_e2e.py tests/test_gateway.py tests/test_webhook_ingress.py tests/test_channel_identity.py tests/test_channel_allowlist.py tests/test_channel_policy.py tests/test_scheduled_automations.py tests/test_schedules.py tests/test_gateway_receipts.py tests/test_store.py -q
uv run pytest

v0.7.0 Goal Workflow

Operator experience gate.

0.7.0 ships a read-only operator surface for inspecting project state without reading raw logs. Each goal issue must ship implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before the next goal begins.

Area
Status
Goal issue
Dashboard / TUI decision
ready
#685 · CLI-first craik operator overview · read-only snapshot and formatter contract
Work graph explorer
ready
#686 · craik operator work-graph · terminal view and JSON export
Handoff viewer
ready
#687 · craik operator handoff · durable summary, risks, receipts, and next steps
Receipt viewer
ready
#688 · craik operator receipt · capability and plugin receipt inspection
Contradiction inbox
ready
#689 · craik operator contradictions · contradiction and operator-attention queue
Evidence and assumption views
ready
#690 · craik operator evidence · evidence and assumptions kept separate
Delegation queue
ready
#691 · craik operator delegations · human delegation inspection
Budget / quota view
ready
#692 · craik operator budget · explicit missing budget and quota data
Instruction distillation view
ready
#693 · craik operator instructions · sources, snapshots, provenance, proposals, and reviews
Quality gate view
ready
#694 · craik operator quality · handoff, evidence, critic, and red-team signals
Memory impact preview
ready
#695 · craik operator memory-impact · previewed durable-memory effects
Known traps view
ready
#696 · craik operator traps · known traps and negative knowledge
Run delta view
ready
#697 · craik operator run-delta · recovery and continuity deltas
Release readiness and docs assessment
ready
#698 · final 0.7.0 readiness record, changelog, and docs assessment

v0.7.0 Release Readiness

Status: ready for release prep.

All v0.7.0 operator-experience goals have shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning. No known release blockers remain as of 2026-05-21.

Gate
Status
Evidence
Implementation
ready
Read-only craik operator commands cover overview, work graph, handoffs, receipts, contradictions, evidence, delegations, budget, instructions, quality, memory impact, traps, and run deltas.
Tests
ready
Focused CLI and formatter coverage exists for each operator surface view, with full-suite validation required before release prep.
Docs
ready
Reference docs, generated CLI docs, roadmap, changelog, and this readiness record are updated to current docs quality.
Known blockers
none known
No unresolved v0.7.0 blocker is recorded after the goal workflow assessment.

v0.7.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.7.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy src
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
find src -name '*.py' -print0 | xargs -0 uv run python scripts/check_max_file_lines.py
uv run pytest tests/test_cli_operator_auth.py tests/test_cli_operator_scoping.py tests/test_operator_view_sanitization.py tests/test_release_readiness_guards.py -q
uv run pytest

v0.6.0 Goal Workflow

Skills, plugins, and ecosystem foundations gate.

0.6.0 ships reusable skill contracts and governed plugin ecosystem contracts without weakening Craik's no-ambient-authority runtime model. Each goal issue shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before the next goal began.

Area
Status
Goal issue
Skill package format
ready
#659 · craik.skill_package · semantic package versions and no runtime authority
Project-scoped and global skills
ready
#660 · craik.skill_registry · active entry and precedence invariants
Context contracts for skills
ready
#661 · craik.skill_invocation_context · package context requirements and redacted invocation records
Plugin descriptor format
ready
#662 · craik.plugin_descriptor · trust boundary, capabilities, docs, security notes, and compatibility
Probationary plugins
ready
#663 · craik.plugin_probation · evidence-backed criteria and decisions before durable trust
Plugin capability grants
ready
#664 · craik.plugin_capability_grant · explicit operations, scoped targets, approvals, expiry, and authorization helper
Plugin receipts
ready
#665 · craik.plugin_receipt · redacted descriptor, grant, probation, evidence, and handoff links
Adapter packages
ready
#666 · craik.adapter_package · semantic versions, runner modes, Python/platform compatibility, docs, and provenance
Reference integrations
ready
#667 · craik.reference_integration · safe reproducible skill, plugin, and adapter examples
Community skills docs
ready
#668 · package, context, registry, review, and security guidance
Community plugins docs
ready
#669 · descriptors, probation, grants, receipts, adapters, references, and security guidance
Release readiness and docs assessment
ready
#670 · this release record, roadmap, changelog, and release automation hygiene
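
The plugin capability grant row (#664) describes explicit operations, scoped targets, approvals, and expiry behind an authorization helper. A minimal sketch of such a helper is shown here. The `PluginGrant` shape, the status strings, and the exact-match target scoping are hypothetical and do not reflect the real craik.plugin_capability_grant contract.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PluginGrant:
    """Hypothetical grant record; field names are illustrative only."""
    operations: frozenset[str]
    targets: frozenset[str]
    expires_at: datetime
    status: str  # "approved", "denied", or "approval_required"


def authorizes(grant: PluginGrant, operation: str, target: str, now: datetime) -> bool:
    """Default-deny: only an approved, unexpired, in-scope grant authorizes."""
    if grant.status != "approved":
        return False  # denied and approval-required grants never authorize
    if now >= grant.expires_at:
        return False  # expired grants never authorize
    return operation in grant.operations and target in grant.targets


example = PluginGrant(
    operations=frozenset({"read"}),
    targets=frozenset({"docs"}),
    expires_at=datetime(2026, 6, 1, tzinfo=timezone.utc),
    status="approved",
)
```

Every branch that is not an explicit match returns False, which mirrors the no-ambient-authority model: a descriptor or grant that merely exists authorizes nothing.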

v0.6.0 Release Readiness

Ready for release prep.

The v0.6.0 skills, plugins, and ecosystem foundations surface is implemented in typed contracts, local-store persistence, validation helpers, operator-visible receipt formatting, reference documentation, and community guides. Release prep still owns the version bump, generated docs lockfile, release heading promotion, full release validation, tag creation, PyPI publish, docs deployment, and GitHub Release verification.

Area
Status
Validation
Runtime contracts
ready
skill_package · skill_registry · skill_invocation_context · plugin_descriptor · plugin_probation · plugin_capability_grant · plugin_receipt · adapter_package · reference_integration.
Docs
ready
Reference pages and guides cover skill packages, skill registries, skill contexts, plugin descriptors, plugin probation, plugin grants, plugin receipts, adapter packages, reference integrations, community skills, and community plugins. Sidebars and index navigation include the community guides.
Validation commands
passed
Focused slices passed across goal PRs. Final readiness validation passed uv run python scripts/check_version_consistency.py, uv run ruff check, uv run mypy src, uv run python scripts/check_doc_links.py, uv run python scripts/check_public_docs_hygiene.py, uv run python scripts/check_release_readiness.py, and full uv run pytest with loopback access enabled for local HTTP-server tests.
Security notes
reviewed
Skills remain instruction packages with no runtime authority. Plugin descriptors declare needs but grant nothing. Probation blocks durable trust until evidence-backed criteria and decisions pass. Plugin grants require explicit operations, scoped targets, expiry, and approval metadata. Denied, expired, and approval-required grants do not authorize execution. Plugin receipts must be redacted and cannot mark result metadata as unredacted.
Release automation hygiene
addressed
The publish workflow changelog extractor now accepts release headings with an em dash, en dash, or ASCII hyphen between the version and date, avoiding the v0.5.0 GitHub Release note extraction failure.
Release blocker state
none known
The v0.6.0 milestone has no open implementation goals other than this readiness issue. No critical release blocker is known before release-prep validation.
Release actions
pending release prep
Release prep must bump version declarations to 0.6.0, regenerate generated docs artifacts if required, promote the changelog heading to 0.6.0 — 2026-05-21, run the full release validation suite, create the signed tag, publish to PyPI, verify the docs site, and verify GitHub Release creation.
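
The release automation hygiene row says the changelog extractor accepts an em dash, en dash, or ASCII hyphen between the version and the date. A pattern with that tolerance could look like the sketch below. The optional Markdown `#` prefix and the function name are assumptions; the publish workflow's actual extractor is not reproduced here.

```python
import re
from typing import Optional

# Version, then an em dash (U+2014), en dash (U+2013), or ASCII hyphen, then an ISO date.
RELEASE_HEADING = re.compile(
    r"^(?:#+\s*)?(?P<version>\d+\.\d+\.\d+)\s*[\u2014\u2013-]\s*(?P<date>\d{4}-\d{2}-\d{2})\s*$"
)


def parse_release_heading(line: str) -> Optional[tuple[str, str]]:
    """Return (version, date) for a release heading, or None for any other line."""
    match = RELEASE_HEADING.match(line)
    if match is None:
        return None
    return match.group("version"), match.group("date")
```

Accepting all three separators in one character class means a heading typed with any of them resolves to the same version and date pair.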

v0.5.0 Goal Workflow

Quality, continuity, and recovery gate.

0.5.0 starts with one goal issue for each roadmap capability plus a release-readiness issue. Each issue must ship implementation, tests, docs, and requirement validation before the milestone closes.

Area
Status
Goal issue
Recovery mode
ready
#636 · craik run recover · craik.recovery_session
Runtime critic
ready
#637 · craik.runtime_critic_finding
Red team mode
ready
#638 · craik.red_team_finding
Evidence coverage score
ready
#639 · craik.evidence_coverage_score
Handoff quality score
ready
#640 · craik.handoff_quality_score
Context debt tracking
ready
#641 · craik.context_debt_record
Evidence expiration rules
ready
#642 · attestation and freshness expiry checks
Tool result attestation
ready
#643 · craik.tool_result_attestation
Knowledge freshness probes
ready
#644 · craik.knowledge_freshness_probe
Scratchpad with expiry
ready
#645 · craik.scratchpad_record
Known traps
ready
#646 · craik.known_trap
Negative knowledge
ready
#647 · craik.negative_knowledge
First-class unknowns
ready
#648 · craik.unknown_record
Release readiness and docs assessment
ready
#649 · this release record
Structured context requests
ready
#650 · craik.context_request
Agent exit discipline
ready
#651 · craik.exit_discipline_check
"What changed since last time" deltas
ready
#652 · craik.run_delta

v0.5.0 Release Readiness

Remediated and release-ready.

The v0.5.0 quality, continuity, and recovery surface is implemented in typed contracts, local-store persistence, runtime helpers, operator views, capture CLI surfaces, recovery/delta operator gates, and reference documentation. The post-readiness remediation closed the gap between contract definitions and production capture paths. Release prep covers the version declarations, changelog heading, generated docs lockfile, package verification, tag checks, PyPI publish, docs deployment verification, and GitHub Release verification.

Area
Status
Validation
Runtime contracts
ready
recovery_session · run_delta · runtime_critic_finding · red_team_finding · handoff_quality_score · evidence_coverage_score · context_debt_record · tool_result_attestation · knowledge_freshness_probe · scratchpad_record · known_trap · negative_knowledge · unknown_record · context_request · exit_discipline_check.
Operator surfaces
ready
Quality gate, known traps, negative knowledge, run delta, recovery, scratchpad, unknowns, context requests, context debt, critic findings, red-team findings, and exit-discipline states are formatted or captured without granting policy authority. Knowledge-resolution views distinguish unresolved records, verified receipt links, and missing or tampered receipt links.
CLI exercise path
ready
craik knowledge captures scratchpad, unknown, context-request, known-trap, and negative-knowledge records, and resolves unknown, context-request, and context-debt records with operator receipt linkage; craik review captures critic and red-team findings; craik run recover and craik run delta expose recovery and changed-since-last-time summaries from local durable state behind an active operator session.
Validation commands
passed
uv run pytest tests/test_v0_5_0_pipeline_e2e.py plus the focused readiness slice: uv run pytest tests/test_recovery.py tests/test_critics.py tests/test_quality_scores.py tests/test_context_debt.py tests/test_freshness.py tests/test_known_traps.py tests/test_scratchpad.py tests/test_exit_discipline.py tests/test_operator_views.py tests/test_store.py.
Security notes
reviewed
Critic and red-team findings are non-authoritative by default; freshness and evidence-expiry checks warn or block silent reliance but do not prove truth; scratchpad content expires instead of becoming project memory; negative knowledge requires evidence and scope; resolved unknowns, fulfilled context requests, and resolved context debt require operator receipt links; tool attestations and recovery sessions carry local HMAC integrity metadata; existing recovery/delta state requires an active operator session.
Release blocker state
none known
No critical v0.5.0 implementation blocker is known after remediation of the capture-layer readiness findings. Tagging is gated on the release-prep PR checks and version/tag validation.
Release actions
pending tag
0.5.0 version declarations, release notes, and docs lockfile are prepared in the release-prep branch. The release tag, PyPI package, docs deployment, and GitHub Release are verified after the release-prep PR lands.

v0.4.0 Release Readiness

Runtime instruction distillation gate.

0.4.0 lands the declared-instruction pipeline that turns project instruction files into typed, provenance-linked, reviewable constraints. Sources are registered explicitly, snapshots drive stale invalidation, extracted statements carry line/range provenance, categories and contradiction reports keep review queues explainable, approval receipts are required before a constraint becomes governing, and active constraints flow into case files and compiled prompts.

Area
Status
Release notes
Package version
shipped
pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.4.0.
Instruction source registry
ready
Projects can register declared instruction sources with typed source metadata, canonical paths, owner identity, path confinement, registry receipts, and project-scoped active source lists.
Source ingestion
ready
Markdown, Cursor rules, Codex, Copilot, and policy document sources parse into candidate statements without treating arbitrary repository Markdown as authority.
Source snapshots and stale invalidation
ready
Registered sources are hashed with normalized newlines and tracked as new, unchanged, changed, or missing; changed, missing, newly observed, or omitted sources defer derived proposals.
Line/range provenance
ready
Extracted statements persist deterministic provenance records with source ID, snapshot ID, path, line and column ranges, summaries, and excerpt hashes.
Instruction categorization
ready
Provenanced statements become reviewable proposals with deterministic category traceability across policy, security, boundary, command, instruction, handoff, memory, preference, and stale-risk classes.
Inter-source contradictions
ready
Normalized policy, boundary, command, instruction, and security-rule proposals open contradiction reports for cross-source conflicts while skipping same-source and stale deferred items.
Approval flow and receipts
ready
Proposals become governing only through explicit operator approval receipts; re-approval is idempotent, rejections are receipted, stale or contradicted approvals require override rationale, and active consumers exclude constraints whose approval receipt HMAC is missing or invalid.
Case-file and prompt-compilation integration
ready
Case files include deterministic governing distillation evidence, and compiled prompts render exactly one Active instruction constraints section with ordered items, provenance annotations, empty-state behavior, and stale-exclusion warnings.
Distillation CLI
ready
craik instructions register, ingest, list, approve, reject, and show expose source registration, pipeline execution, proposal review, approval decisions, rejection decisions, and provenance-aware item inspection through the active operator session.
Reference documentation
ready
docs/reference/instruction-sources.md, docs/reference/distilled-instructions.md, docs/reference/instruction-approval.md, and docs/guides/managing-instructions.md document the shipped v0.4.0 operator surface and link through the sidebars.
Release actions
complete
v0.4.0 is tagged, published to PyPI, and represented by the GitHub Release. The GitHub milestone is closed with zero open issues.
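
The snapshot row describes hashing registered sources with normalized newlines and tracking each as new, unchanged, changed, or missing. The sketch below illustrates that classification. SHA-256 and the function names are assumptions; the record above does not name the hash algorithm.

```python
import hashlib
from typing import Optional


def snapshot_hash(text: str) -> str:
    """Hash source text after normalizing CRLF and bare CR line endings to LF."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify_source(previous_hash: Optional[str], current_text: Optional[str]) -> str:
    """Classify a registered source against its last recorded snapshot.

    previous_hash is None when no snapshot exists yet; current_text is None
    when the file can no longer be read.
    """
    if current_text is None:
        return "missing"
    if previous_hash is None:
        return "new"
    if snapshot_hash(current_text) == previous_hash:
        return "unchanged"
    return "changed"
```

Normalizing line endings before hashing keeps a checkout on a different platform from registering as a content change and needlessly deferring derived proposals.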

v0.4.0 Verification Commands

Run these before release prep and again before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run pytest tests/test_instruction_sources.py tests/test_instruction_ingestion.py tests/test_instruction_provenance.py tests/test_instruction_distillation.py tests/test_instruction_invalidation.py tests/test_instruction_contradictions.py tests/test_instruction_promotion.py tests/test_instruction_runtime_context.py tests/test_instruction_workflow_docs.py tests/test_instruction_pipeline_e2e.py tests/test_case_files.py tests/test_prompts.py tests/test_contracts.py -q

v0.4.0 Security Notes

The v0.4.0 trust boundary is also documented in SECURITY.md.

  • Instruction sources must be registered explicitly and remain confined to the registered project root before ingestion.
  • Raw source files and distilled proposals are evidence, not authority; only governing constraints backed by approval receipts enter case files and compiled prompts.
  • Stale or contradicted approvals require an explicit override and rationale, and review receipts record whether stale or contradiction guards were bypassed.
  • Approval receipt HMACs, backed by an owner-only local secret, are verified before governing constraints are rendered into case files, onboarding context, handoffs, or compiled prompts.
  • Release workflows pin GitHub Actions to immutable SHAs and attest package provenance before PyPI publish.
  • Stale governing items are excluded from compiled prompt context and surfaced as distillation warnings instead of silent authority.
  • Contradiction detection opens reviewable reports for cross-source policy, boundary, command, instruction, and security-rule conflicts.
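
The HMAC check on approval receipts can be pictured as follows. This is a sketch under stated assumptions: HMAC-SHA256 over a canonical JSON encoding, with hypothetical function names and receipt fields. The real receipt format and secret handling are not shown in this record.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_receipt(receipt: dict, secret: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of the receipt."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, canonical, hashlib.sha256).hexdigest()


def receipt_is_valid(receipt: dict, tag: Optional[str], secret: bytes) -> bool:
    """Treat a missing or mismatched tag as invalid, so the constraint is excluded."""
    if not tag:
        return False
    expected = sign_receipt(receipt, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

Sorting keys before signing makes the tag independent of dictionary ordering, so a receipt reloaded from disk verifies the same way it was signed.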

v0.3.0 Release Readiness

Multi-agent review and coordination gate.

0.3.0 lands the governed multi-agent surface: authenticated mailbox messages, intent-lock coordination across simultaneous runs, structured debate with adjudication, cross-agent review, human delegation pause and resume, scope-change decisions, and live work-graph coordination. The same release tightens the security boundary: identity-isolated handoff consumption, role-allowlist dispatch, operator-bound delegation resolution, and authenticated mailbox sends.

Area
Status
Release notes
Package version
ready
pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.3.0.
Multi-agent messaging
ready
Receipt-backed craik agent-message and local-store helpers send and receive authenticated typed messages linked to tasks, runs, handoffs, and roles. Senders are authenticated against the run's role state, message bodies are bounded, and same-subject repeats get unique IDs instead of overwriting.
Intent-lock coordination
ready
Overlapping active scopes on the same project block before new loop phases or tool dispatch and persist a denial receipt, so simultaneous runs cannot race the same intent lock.
Structured debate
ready
The debate runtime helper creates role-linked debate turns, summarizes agreement or disagreement, and resolves by adjudication receipt or human-delegation receipt.
Cross-agent review
ready
The review protocol helper creates receipted review requests for worker results, handoffs, or debate summaries and completes them with typed findings linked back to the reviewed artifacts.
Human delegation
ready
Runs can be interrupted with receipted delegation requests, resolved or cancelled by CLI, and resumed from the recorded response. Resolution requires resolver operator identity and rejects attempts to resume a paused run opened by another operator.
Scope-change protocol
ready
Discovered work outside the current intent lock interrupts the run, records a scope-change request receipt, and exposes craik scope-change decide for explicit expand, sibling-task, handoff, or denial decisions before continuing.
Live work graph
ready
Mailbox messages, reviews, debates, delegations, and scope-change artifacts persist work-graph events that can be queried as active coordination state.
Identity isolation
ready
Consuming a handoff records an explicit consumer credential and operator assignment, rejects producer identity reuse by default, and requires an explicit continuation flag plus rationale when reuse is intentional.
Handoff consumption
ready
craik task resume --from-handoff creates a follow-up task, case file, and pending run that record source handoff provenance while requiring an explicit consumer credential and operator identity.
Role-based dispatch
ready
craik run execute --role records a policy-checked specialist role assignment, dispatch receipt, and run-level role metadata. Role dispatch requires explicit role allowlists and gates runner overrides behind the role.runner.override policy capability.
Release actions
pending
Create immutable tag v0.3.0, run the protected publish workflow, then verify PyPI and docs after publication.

v0.3.0 Verification Commands

Run these before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_release_tag.py --tag v0.3.0 --expected-version 0.3.0
uv run pytest tests/test_agent_mailbox.py tests/test_intent_lock_coordination.py tests/test_role_dispatch.py tests/test_scope_changes.py tests/integration/test_multi_agent_v030_flow.py -q

v0.3.0 Security Notes

  • Delegation resolution requires resolver operator identity and rejects attempts to resume a paused run opened by another operator.
  • Mailbox sends authenticate from_agent against the sender run's role state before storing the message or receipt.
  • Role dispatch requires explicit role allowlists and gates runner overrides behind the role.runner.override policy capability.
  • Mailbox message bodies are bounded and repeated same-subject messages receive unique IDs instead of overwriting the latest message.
  • Handoff consumption records an explicit consumer credential and operator assignment, rejects producer identity reuse by default, and requires an explicit continuation flag plus rationale when reuse is intentional.
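
The identity-isolation rule for handoff consumption reduces to a small decision: a distinct consumer is accepted, and producer identity reuse is accepted only with both a continuation flag and a rationale. A minimal sketch, with a hypothetical function name and parameters:

```python
from typing import Optional


def may_consume_handoff(
    producer_id: str,
    consumer_id: str,
    allow_continuation: bool = False,
    rationale: Optional[str] = None,
) -> bool:
    """Reject producer identity reuse unless it is explicitly acknowledged."""
    if not consumer_id:
        return False  # an explicit consumer credential is always required
    if consumer_id != producer_id:
        return True  # a distinct consumer identity is the default, isolated path
    # Reuse needs the flag and a non-blank rationale together.
    return allow_continuation and bool(rationale and rationale.strip())
```

Requiring the flag and the rationale together means intentional reuse always leaves a recorded reason rather than happening by omission.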

v0.2.0 Release Readiness

Durable execution continuity gate.

0.2.0 hardens the provider-backed loop into durable execution: resumable phase boundaries, wall-clock and provider-token budgets, sandboxed shell tool dispatch, run recovery commands, tool-result attestations, and local-store migrations.

Area
Status
Release notes
Package version
ready
pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.2.0.
Resumable execution
ready
Interrupted runs reopen from persisted phase outputs, stable idempotency keys prevent duplicate phase output capture, and craik run resume continues unfinished provider-backed runs.
Budgets
ready
Per-run wall-clock budgets, provider token ledgers, and pre-dispatch time checks interrupt before additional provider calls or side effects when exhausted.
Sandboxed tool execution
ready
Configured shell tool calls execute through the local-process sandbox backend, propagate cancellation to in-flight commands, and record hashed tool-result attestations linked to side-effect receipts.
Recovery and observability
ready
craik run show, craik run cancel, craik run delta, and persisted exit-discipline checks expose continuity state and handoff readiness.
Storage migrations
ready
Local-store migrations now run through a registered, forward-only framework with compatibility fixtures, ordering tests, and migration failure guidance.
Release actions
pending
Create immutable tag v0.2.0, run the protected publish workflow, then verify PyPI and docs after publication.

v0.2.0 Verification Commands

Run these before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_release_tag.py --tag v0.2.0 --expected-version 0.2.0
uv run pytest tests/test_loop.py tests/test_local_process_backend.py tests/test_loop_tool_dispatch.py tests/test_store.py tests/test_cli.py tests/test_handoffs.py tests/test_provider_runner.py -q

v0.2.0 Security Notes

  • Shell execution remains policy-gated and only registered command references are routed to the local-process sandbox backend.
  • The local-process backend avoids shell expansion and propagates cancellation to in-flight subprocesses.
  • Budget checks happen before provider calls and immediately before tool dispatch, preventing exhausted runs from producing new side effects.
  • Tool-result attestations hash redacted replay payloads and link each dispatched result to its side-effect receipt.
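
The budget note above places the check immediately before tool dispatch, so an exhausted run produces no new side effect. The sketch below illustrates that ordering. `RunBudget`, `BudgetExhausted`, and `dispatch_tool` are hypothetical names, not the Craik loop API.

```python
import time
from dataclasses import dataclass, field


class BudgetExhausted(RuntimeError):
    """Raised instead of dispatching when the run has no budget left."""


@dataclass
class RunBudget:
    """Hypothetical per-run budget: a wall-clock limit plus a token ledger."""
    wall_clock_seconds: float
    token_limit: int
    started_at: float = field(default_factory=time.monotonic)
    tokens_used: int = 0

    def exhausted(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        out_of_time = (now - self.started_at) >= self.wall_clock_seconds
        out_of_tokens = self.tokens_used >= self.token_limit
        return out_of_time or out_of_tokens


def dispatch_tool(budget: RunBudget, tool, now=None):
    """Check the budget first; the tool callable runs only if budget remains."""
    if budget.exhausted(now):
        raise BudgetExhausted("budget exhausted before tool dispatch")
    return tool()
```

Because the check precedes the call, the exception path guarantees the tool was never invoked, which is the property the security note relies on.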

v0.1.0 Release Readiness

In-repo green.

All in-repo readiness gates are passing. The remaining work is the maintainer-driven v0.1.0 tag and the protected publication workflow.

Snapshot

| Area | Status | Notes |
| --- | --- | --- |
| Code health | green | CI, CodeQL, version checks, file-size budget, build, doctor all pass. |
| Test coverage | green | HTTP transport, credentials, OIDC, governance, redaction, handoffs. |
| Security hygiene | green | No leaked secret patterns. Operator and credential stores are file-locked, atomic, and owner-only. |
| Documentation | green | Roadmap, README, changelog, limitations, mvp docs, Docusaurus build. |
| Operational state | green | Milestones present. 22 issues closed for v0.1.0. No blockers open. Dependabot clear. |
| External release actions | pending | Tag and publish remain maintainer actions. |

Code health

CI on main

The latest ci.yml run on main completed successfully: run 26010629626.

CodeQL

The latest codeql.yml run on main completed successfully: run 26010629612.

Code scanning

Zero open alerts, confirmed via gh api repos/eidetic-labs/craik/code-scanning/alerts.

Version consistency

Passes when run as uv run python scripts/check_release_version.py.

File-size budget

Passes when run as find src -name "*.py" -print0 | xargs -0 uv run python scripts/check_max_file_lines.py.
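
A file-size budget of this kind amounts to counting lines per file and reporting any file over a limit. The sketch below assumes a hypothetical files_over_budget helper and a caller-supplied max_lines; the limit and behavior of scripts/check_max_file_lines.py itself are not shown in this record.

```python
from typing import Iterable, List, Tuple


def files_over_budget(paths: Iterable[str], max_lines: int) -> List[Tuple[str, int]]:
    """Return (path, line_count) for every file longer than max_lines."""
    over = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            count = sum(1 for _ in handle)
        if count > max_lines:
            over.append((str(path), count))
    return over
```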

craik --version

Prints 0.1.0 via uv run craik --version.

craik doctor

Runs to completion against a fresh CRAIK_HOME. An entirely empty home correctly reports missing local state.

Package artifacts

uv build produced dist/craik-0.1.0.tar.gz and dist/craik-0.1.0-py3-none-any.whl.

Test coverage

| Area | Coverage | Test files |
| --- | --- | --- |
| HTTP transport | integration | tests/integration/test_http_transport_round_trip.py |
| Credential sources | unit | API keys · local-CLI OAuth · CLI bridge · secret references · Stigmem references · marker / no-credential behavior · credential pools. Files: test_auth_api_key_source.py, test_auth_local_cli_oauth.py, test_auth_cli_bridge.py, test_auth_secret_ref.py, test_auth_profiles.py, test_auth_credential_pool.py, test_provider_runtime.py. |
| OIDC & workload identity | unit | Operator auth · session storage · GitHub Actions · Kubernetes · generic file/env tokens · RFC 8693 exchange. Files: test_oidc_operator.py, test_operator_session_store.py, test_workload_identity.py, test_oidc_exchange_secret_manager.py. |
| JWT hardening | unit | Rejects alg=none, unknown kid, tampered payloads, asymmetric/symmetric confusion (test_oidc_operator.py). |
| Governance behavior | unit | Credential-scoped receipts · operator-scoped receipts · policy-bound credentials · policy-bound operators · approval gates · expiry-as-risk · per-credential redaction · handoff identity isolation. Files: test_provider_runtime.py, test_policy.py, test_loop.py, test_case_files.py, test_redaction.py, test_handoffs.py. |

Focused readiness set: the combined readiness subset passed when run with:

uv run pytest tests/integration/test_http_transport_round_trip.py tests/test_auth_api_key_source.py tests/test_auth_local_cli_oauth.py tests/test_auth_cli_bridge.py tests/test_auth_secret_ref.py tests/test_auth_profiles.py tests/test_auth_credential_pool.py tests/test_oidc_operator.py tests/test_operator_session_store.py tests/test_workload_identity.py tests/test_oidc_exchange_secret_manager.py tests/test_provider_runtime.py tests/test_policy.py tests/test_case_files.py tests/test_handoffs.py -q
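
The JWT hardening cases listed in the coverage table (alg=none, unknown kid, algorithm confusion) are all decisions that can be made from the token header before any signature work. A minimal sketch of that header gate follows. The names check_header, TokenRejected, and trusted_keys are hypothetical, and this is not Craik's verifier: signature verification, which is what catches tampered payloads, is deliberately left out.

```python
import base64
import json


class TokenRejected(ValueError):
    """Raised when a token header fails a hardening rule."""


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def check_header(token: str, trusted_keys: dict) -> str:
    """Return the key id if the header is acceptable.

    trusted_keys maps each known kid to the single algorithm that key may use,
    which blocks asymmetric/symmetric confusion.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise TokenRejected("malformed token")
    header = json.loads(_b64url_decode(parts[0]))
    alg = header.get("alg")
    kid = header.get("kid")
    if not isinstance(alg, str) or alg.lower() == "none":
        raise TokenRejected("unsigned tokens are not accepted")
    if not isinstance(kid, str) or kid not in trusted_keys:
        raise TokenRejected("unknown key id")
    if trusted_keys[kid] != alg:
        raise TokenRejected("algorithm does not match the key")
    return kid
```

Pinning one algorithm per key, rather than trusting the header's alg value, is the design choice that prevents an RSA public key from being reused as an HMAC secret.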

Security hygiene

Secret-pattern grep

No raw secret patterns in tests or scripts: grep -rE "sk-[a-zA-Z0-9]{20,}|xoxb-|ghp_|ANTHROPIC.{0,5}=.{20,}" tests/ scripts/.
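
The same scan can be approximated in Python, which is convenient when it needs to run inside a test. The sketch reuses the pattern from the grep command above; find_secret_lines is a hypothetical helper, and it reports only the path and line number so the scan output never repeats a matched value.

```python
import re
from pathlib import Path

# Same alternation as the grep -E pattern above.
SECRET_PATTERN = re.compile(
    r"sk-[a-zA-Z0-9]{20,}|xoxb-|ghp_|ANTHROPIC.{0,5}=.{20,}"
)


def find_secret_lines(roots):
    """Return (path, line_number) for each line that matches a secret pattern."""
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if SECRET_PATTERN.search(line):
                    hits.append((str(path), number))
    return hits
```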

Operator session file

Owner-only 0o600 writes in src/craik/runtime/auth/operator/store.py.
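
An owner-only write on POSIX can be sketched as below. write_owner_only is a hypothetical name and this is not the code in store.py; it only illustrates the 0o600 technique named above.

```python
import os


def write_owner_only(path: str, data: bytes) -> None:
    """Write data so that only the owning user can read or write the file."""
    # The mode passed to os.open applies only when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        # Tighten permissions explicitly in case the file already existed.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(data)
```

Calling fchmod on the open descriptor, rather than chmod on the path, avoids acting on a different file if the path is swapped between the two calls.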

Auth profiles store

auth-profiles.json writes are file-locked and atomic via fcntl.flock + tempfile + os.replace in src/craik/runtime/auth/store.py.
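
The fcntl.flock + tempfile + os.replace combination named above follows a common pattern: take an exclusive lock, write the new content to a temporary file in the same directory, then rename it over the target. A minimal sketch, assuming a hypothetical write_json_atomic helper and a sibling ".lock" file (the actual lock-file layout in store.py is not shown in this record):

```python
import fcntl
import json
import os
import tempfile


def write_json_atomic(path: str, payload: dict) -> None:
    """Replace path with payload so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Hypothetical lock file next to the target; the lock is released when it closes.
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        # Same directory as the target so os.replace stays on one filesystem.
        fd, temp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(payload, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(temp_path, path)
        except BaseException:
            # Clean up the temporary file if anything failed before the rename.
            if os.path.exists(temp_path):
                os.unlink(temp_path)
            raise
```

tempfile.mkstemp creates the temporary file readable and writable only by the creating user, so the replaced file also ends up owner-only.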

Credential pool store

Pool writes are file-locked and atomic in src/craik/runtime/auth/pool.py.

Resolver errors

Resolver errors use reference-level wording, such as "secret reference could not be resolved", and never include raw values.
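
One way to keep raw values out of resolver errors is to build the message from the reference alone. The sketch below uses hypothetical names (SecretResolutionError, resolve) and a plain mapping as a stand-in backend. Including the reference string in the message is an assumption, not something this record confirms.

```python
from typing import Mapping


class SecretResolutionError(Exception):
    """Carries the reference that failed, never a secret value."""

    def __init__(self, reference: str):
        super().__init__(f"secret reference could not be resolved: {reference}")
        self.reference = reference


def resolve(reference: str, backend: Mapping[str, str]) -> str:
    """Look up a secret by reference; errors mention only the reference."""
    try:
        return backend[reference]
    except KeyError:
        # "from None" drops the chained KeyError so tracebacks stay reference-level.
        raise SecretResolutionError(reference) from None
```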

Documentation

Roadmap gates

Exactly 12 release gates, v0.1.0 through v0.12.0, with no gaps.

Roadmap auth scope

docs/roadmap.md states v0.1.0 includes OIDC, pluggable credentials, operator + credential identity on receipts, policy-bound auth, approval-gated first use, expiry risk, per-credential redaction, handoff identity bookkeeping.

Changelog

The CHANGELOG.md entry "## 0.1.0 - 2026-05-17" narrates Phase A and Phase B.

README

"What Works Today" names OIDC and typed credential profiles.

Auth on-ramp

docs/guides/authentication.md exists and is linked from docs/index.md. docs/guides/quickstart.md covers it.

Limitations honesty

docs/limitations.md no longer treats shipped auth capabilities as future work.

MVP docs

docs/mvp.md and docs/mvp-roadmap.md reflect the expanded v0.1.0 scope including OIDC and credential profiles.

Docs build

npm run build from docs/ succeeds.

Operational state

Milestones

Milestones v0.1.0 through v0.12.0 exist with titles matching the roadmap.

v0.1.0 milestone

22 closed issues · 0 open issues.

Blockers

No open PRs or open issues currently blocking the release.

Dependabot

Alert #1 fixed.

Tag posture

Tag v0.1.0 does not exist locally. Tagging is a maintainer release action and should happen only after this report is accepted.

External release actions

| Action | Status | Notes |
| --- | --- | --- |
| Create and push tag v0.1.0 | pending | Maintainer action. |
| Run protected package publication workflow | pending | Maintainer action. |
| Optional live-provider smoke tests | pending | Require real provider credentials and an operator IdP. Fixture, cassette, and in-process socket paths are already validated in-repo. |

What's next