Version: MVP

Release Readiness Validation

5 min read · For maintainers · Updated 2026-05-26

What you'll find here

The repository-owned readiness record for Craik releases. The current pre-release gate is 0.12.9; historical sign-offs remain below for audit continuity.

v0.12.9 TUI Cutover

Gateway backend cleanup follow-up is in progress.

The current post-cutover goal creates a local Gateway session boundary for audited prompt execution, a JSONL stdio transport for TUI clients, CLI mirrors for raw prompt execution, structured model profiles, and regression tests for Claude Code marker routing and progress events. This work starts from checkpoint commit c6cd81d ("checkpoint: pre-backend-cleanup").

Gateway session

ready after this PR

runtime/backend/session.py owns raw prompt execution for provider and Claude Code marker paths and emits normalized lifecycle, working-state, progress, receipt, output, completion, and error events.

Event history

ready after this PR

Each audited prompt run persists normalized Gateway events as a redacted craik.run_output artifact, giving clients replayable model, lifecycle, receipt, output, and completion evidence.

Textual Gateway client

ready after this PR

Textual raw prompt submission now uses a local Gateway client and updates the activity panel from normalized Gateway events, while final transcript text preserves the model output for copy/export.

JSONL protocol

ready after this PR

craik tui-backend --jsonl accepts session.status, prompt.submit, slash.submit, model.set, approval.decide, run.interrupt, and close messages over stdio for Textual/Rust/frontend evaluation.

Slash/CLI mirrors

ready after this PR

/run <prompt> and craik run prompt <prompt> share the audited Gateway path; backend-affecting slash mirrors are covered by regression tests.

Model profiles

ready after this PR

craik model set keeps legacy selectors while persisting provider/model profile metadata, display labels, backend preference, common provider options, and provider-specific passthrough knobs. Gateway prompt runs now pass the active profile options into provider runtime requests for OpenAI, Anthropic, Chat Completions, and Gemini payloads.

TUI evaluation fixtures

ready after this PR

Gateway JSONL replay fixtures and summary helpers provide a shared evaluation contract for Textual and Rust clients. The Rust ratatui replay prototype under crates/craik-tui-rs parses the same fixture and verifies lifecycle, working-state, run, task, receipt, and progress rendering.
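
As a rough illustration of the JSONL stdio contract and replay summaries described above, the sketch below frames client messages as one JSON object per line and counts a replayed stream by message type. The message names (session.status, prompt.submit, close) come from this page; the "type" and "payload" keys and both helper names are assumptions made for illustration, not the actual Gateway schema.

```python
import json
from collections import Counter


def encode_message(msg_type: str, **payload: object) -> str:
    """Frame one client message as a single JSONL line (hypothetical shape)."""
    # sort_keys keeps the output deterministic, which helps replay fixtures.
    return json.dumps({"type": msg_type, "payload": payload}, sort_keys=True) + "\n"


def summarize_events(jsonl_text: str) -> Counter:
    """Count records by their assumed "type" key, skipping blank lines."""
    counts: Counter = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        counts[record.get("type", "unknown")] += 1
    return counts


# Example stream a client might write to the backend's stdin.
stream = (
    encode_message("session.status")
    + encode_message("prompt.submit", prompt="hello")
    + encode_message("close")
)
summary = summarize_events(stream)
```

Because each record occupies exactly one line, a Textual or Rust client can replay the same fixture by reading line by line, without a streaming JSON parser.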

Gateway Cleanup Validation Commands

uv run pytest tests/test_backend_gateway_session.py tests/test_backend_jsonl.py tests/test_slash_cli_mirrors.py
uv run pytest tests/test_gateway_replay.py
cargo test --manifest-path crates/craik-tui-rs/Cargo.toml
uv run pytest tests/test_v010_agent_shell.py tests/test_v011_tui.py tests/test_v0122_slash_inline_execution.py tests/test_v0122_textual_app.py tests/test_v0123_multiline_input_methods.py tests/test_v0125_tui_polish.py tests/test_v0127_anthropic_claude_cli.py tests/test_provider_runner.py tests/test_provider_runtime.py tests/test_cli.py
uv run python scripts/generate_cli_reference.py --check

v0.12.9 completes the runtime consumption path for the CLI/TUI command contract.

0.12.9 wires the three TUI entry points to the contract dispatcher, routes Textual slash output through the new renderer pipeline, replaces the legacy dispatcher implementation with a compatibility shim, replaces the hand-maintained slash metadata tuple with the live AutoSlashRegistry, routes prompt-backed commands through canonical modals, and adds runtime-consumption CI guards so future contract infrastructure cannot ship without live TUI use.

v0.12.9 Acceptance Status

Runtime dispatch

ready

textual_app.py, tui.py, and agent_shell.py dispatch slash commands through craik.runtime.contract.dispatch.

Registry consumption

ready

The TUI runtime consumes a cached AutoSlashRegistry from the live Typer app plus shell-only slash built-ins.

Renderer consumption

ready

Textual slash output and inline actions render through format_command_result(..., kind="tui").

Canonical modals

ready

/auth login, /auth logout, and /receipts detail use canonical-composed modal screens under runtime/shell/modals/; the legacy textual_modals.py file is removed.

Prompt runtime

ready

interactive_prompts metadata now drives runtime behavior by intercepting typer.confirm and typer.prompt during contract-dispatch callback invocation.

Regression guards

ready

check_contract_dispatch_consumed.py, check_no_legacy_modal_pushes.py, check_no_slash_command_specs_consumption.py, and check_interactive_prompts_runtime_consumed.py verify runtime imports, block legacy cutover APIs, and assert contract-dispatch invocation across TUI entry points.
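
The prompt interception mentioned under "Prompt runtime" can be pictured as a context manager that swaps the confirm and prompt functions for the duration of a callback, then restores them. This is a minimal sketch under stated assumptions: fake_typer is a stand-in namespace so the example runs without Typer installed, and intercept_prompts and logout_callback are hypothetical names, not Craik's implementation.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from typing import Callable, Iterator


@contextmanager
def intercept_prompts(
    module: object,
    on_confirm: Callable[[str], bool],
    on_prompt: Callable[[str], str],
) -> Iterator[None]:
    """Temporarily replace module.confirm and module.prompt, then restore them."""
    original_confirm = getattr(module, "confirm")
    original_prompt = getattr(module, "prompt")
    # Accept and ignore extra arguments so callbacks written against the real
    # functions (default=..., hide_input=..., and so on) keep working.
    setattr(module, "confirm", lambda text, *a, **kw: on_confirm(text))
    setattr(module, "prompt", lambda text, *a, **kw: on_prompt(text))
    try:
        yield
    finally:
        setattr(module, "confirm", original_confirm)
        setattr(module, "prompt", original_prompt)


# Stand-in for the typer module (assumption: only these two attributes matter).
fake_typer = SimpleNamespace(
    confirm=lambda text, **kw: False,
    prompt=lambda text, **kw: "",
)


def logout_callback() -> str:
    """Hypothetical command callback that asks for confirmation."""
    if fake_typer.confirm("Remove cached credentials?"):
        return "logged out"
    return "cancelled"
```

In a TUI, on_confirm and on_prompt would hand the question to a modal instead of the terminal; the restore in the finally clause keeps plain CLI behavior intact once dispatch returns.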

v0.12.9 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/check_contract_dispatch_consumed.py
uv run python scripts/check_no_legacy_modal_pushes.py
uv run python scripts/check_no_slash_command_specs_consumption.py
uv run python scripts/check_interactive_prompts_runtime_consumed.py
uv run python scripts/check_modal_screen_mappings.py
uv run python scripts/check_modal_screen_security.py
uv run python scripts/check_payload_shape_validity.py
uv run python scripts/check_next_actions_validity.py
uv run python scripts/check_format_flag_coverage.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)
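
For maintainers who prefer a single entry point, the gate above could be wrapped in a small fail-fast runner. The sketch below is an assumption, not a script that ships in the repository: run_gate and GATE_COMMANDS are hypothetical names, the list is a short excerpt of the commands above, and the docs build step is left out because it needs a shell for the cd.

```python
import shlex
import subprocess
import sys
from typing import Callable, List, Sequence

# Excerpt of the gate; extend with the remaining commands listed above.
GATE_COMMANDS = [
    "uv run python scripts/check_version_consistency.py",
    "uv run python scripts/check_release_version.py",
    "uv run python scripts/check_release_readiness.py",
    "uv run ruff check",
    "uv run mypy",
    "uv run pytest",
]


def _run(argv: List[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(argv).returncode


def run_gate(commands: Sequence[str], runner: Callable[[List[str]], int] = _run) -> int:
    """Run commands in order; return the first non-zero exit code, else 0."""
    for command in commands:
        print(f"$ {command}", flush=True)
        code = runner(shlex.split(command))
        if code != 0:
            # Stop at the first failure so later output cannot bury it.
            print(f"gate failed: {command} (exit {code})", file=sys.stderr)
            return code
    return 0
```

Calling sys.exit(run_gate(GATE_COMMANDS)) from a clean checkout would then stop at the first failing check. The injectable runner exists only so the control flow can be tested without invoking uv.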

v0.12.8 CLI/TUI Contract

v0.12.8 makes the shared command contract release-ready across CLI, slash, JSON, and TUI surfaces.

0.12.8 keeps migrated commands on the CommandResult contract, prevents strictly migrated shared callbacks from writing directly to stdout during slash dispatch, adds canonical modal metadata for prompt-backed commands, expands slash snapshots across the standard width matrix, and adds CI gates so future migrations stay aligned with the canonical renderer flow.

v0.12.8 Acceptance Status

Command contract

ready

Migrated CLI/TUI callbacks declare @craik_command, return CommandResult, and use shared renderers.

Interactive prompts

ready

Current prompt-backed commands declare interactive_prompts metadata, and modal-mapping guards reject drift.

Snapshot coverage

ready

Table, card, and card-list slash command snapshots cover widths 60, 80, 100, 120, 160, and 200.

Structural guards

ready

Payload shape, NextAction target, format coverage, direct stdout, modal, return annotation, CLI/TUI metadata, and snapshot guards run in CI.

Legacy command posture

ready

Commands that cannot safely return CommandResult are documented with explicit legacy markers.
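
The width matrix under "Snapshot coverage" can be expanded into one check invocation per width. The sketch below only builds command strings; the script path and flags are taken from the validation commands on this page, while the helper name and the choice to reuse one --name across widths are assumptions that may not match how the snapshot script names its files.

```python
import shlex
from typing import List, Sequence

# Widths listed under "Snapshot coverage".
STANDARD_WIDTHS = (60, 80, 100, 120, 160, 200)


def snapshot_check_commands(
    slash_command: str, name: str, widths: Sequence[int] = STANDARD_WIDTHS
) -> List[str]:
    """Build one --check invocation per width for a single slash command."""
    return [
        shlex.join([
            "uv", "run", "python", "scripts/generate_snapshots.py",
            slash_command, "--name", name, "--width", str(width), "--check",
        ])
        for width in widths
    ]


commands = snapshot_check_commands("/status", "status")
```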

v0.12.8 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/check_modal_screen_mappings.py
uv run python scripts/check_modal_screen_security.py
uv run python scripts/check_payload_shape_validity.py
uv run python scripts/check_next_actions_validity.py
uv run python scripts/check_format_flag_coverage.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.7 Provider OAuth Suite

v0.12.7 completes the provider-specific OAuth adoption path that is usable without private client registrations.

0.12.7 adds OAuth auth-profile contracts, one-shot loopback PKCE helpers, callback-safety CI coverage, OpenAI browser PKCE OAuth, Anthropic Claude CLI delegation, Gemini/Vertex ADC and service-account login through google-auth, provider-specific header handling, OAuth status metadata, billing-surface status metadata, and explicit craik auth login <provider> --mode=api-key|oauth|claude-cli selection.
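
For readers unfamiliar with PKCE, the sketch below shows the S256 verifier and challenge derivation defined in RFC 7636, using only the Python standard library. It is an illustration of the standard, not Craik's helper; the function names are hypothetical, and the loopback callback server is not shown.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a random URL-safe verifier (43 characters for 32 bytes)."""
    return _b64url(secrets.token_bytes(num_bytes))


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(ASCII(verifier)))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code is useless without the verifier that never left the local process.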

v0.12.7 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_cli_tui_contract.py
uv run python scripts/check_command_result_return.py
uv run python scripts/check_no_direct_stdout.py
uv run python scripts/generate_snapshots.py /status --name status --width 80 --check
uv run python scripts/check_snapshot_coverage.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_oauth_callback_safety.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run python scripts/generate_cli_reference.py --check
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.6 Inherited Surface Sweep

v0.12.6 closes the inherited-surface sweep for channel ingress, slash-command payload shape, and TUI text selection.

0.12.6 sanitizes normalized messaging-channel text before runtime boundaries, extends the slash-command registry guard to execute structured payload smoke commands, and keeps TUI text-selection behavior documented, styled, and covered by a release-readiness guard.

v0.12.6 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
uv run python scripts/check_text_selection_wiring.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.5 TUI Polish and Release Guards

v0.12.5 completes the terminal UI polish, receipt verification, and release-guard hardening pass for the v0.12 train.

0.12.5 ships bottom-stack TUI ordering, text-selection support, inline subcommand listings, TUI-shaped next-action guidance, standalone receipt verification, coverage publishing, positioning documentation, AST-bound dock-bottom snapshot coverage, all-theme TCSS dock scanning, and an 80% coverage floor for release coverage publication.

v0.12.5 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_doc_links.py
uv run ruff check
uv run mypy
uv run pytest
uv run pytest --cov=craik --cov-report=term --cov-fail-under=80
(cd docs && npm run build)

v0.12.4 Command Surface Ergonomics

v0.12.4 completes the slash-command ergonomics and TUI hardening pass for the v0.12 train.

0.12.4 ships centralized slash-command schema metadata, structured inline result rendering, argument-aware help and validation, current-session transcript search, receipt detail modals with integrity status, bounded toast notifications, destructive-action confirmations, inline action-key dispatch, and release guards for command metadata and TUI brand hygiene.

v0.12.4 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_dock_bottom_snapshot_coverage.py
python3 scripts/check_codebase_brand_hygiene.py
uv run python scripts/check_slash_command_registry.py
uv run python scripts/check_changed_file_strictness.py
uv run ruff check
uv run mypy src/craik
uv run pytest
(cd docs && npm run build)

v0.12.3 completes the interactive shell refinement pass for the v0.12 train.

0.12.3 ships terminal status-bar usage and quota indicators, reverse history search, multi-line input alternatives, audited ! shell invocations, session naming, theme controls, MCP discovery, CLI mistype recovery, and TUI aesthetic polish.

v0.12.3 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
python3 scripts/check_codebase_brand_hygiene.py
uv run ruff check
uv run mypy
uv run pytest
(cd docs && npm run build)

v0.12.2 Canonical TUI Release

v0.12.2 is the canonical interactive-runtime release for the v0.12 train.

0.12.2 ships the Textual-based TUI, slash-command inline execution, history and completion support, modal auth and approval flows, privacy documentation, and the codebase brand-hygiene release guard.

v0.12.2 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
python3 scripts/check_codebase_brand_hygiene.py
uv run ruff check
uv run mypy
uv run pytest
(cd docs && npm run build)

v0.12.1 Patch Release

v0.12.1 is a patch-readiness release for the v0.12 train.

0.12.1 incorporates the post-release remediation PRs for README current-state accuracy, provider auth and health-check UX, gateway and doctor readiness, MCP validation, migration apply reporting, localization polish, and CodeQL cleanup.

v0.12.1 Validation Commands

Run the standard release gate from a clean checkout before signed tag creation:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run ruff check
uv run mypy src
uv run pytest -q

v0.12.0 Goal Workflow

Structural CI guard generalization starts the v0.12.0 release train.

0.12.0 begins with G0 before functional migration, internationalization, and compatibility work. G0 strengthens the release-readiness writer-coverage guard with qualified call resolution, explicit dynamic-dispatch allowlisting, and a complementary dead-code scan.

Area

Status

Goal issue

G0 structural CI guard generalization

ready after this PR

#819 · release-readiness writer coverage resolves calls through qualified function paths and import maps, registry-dispatched callables use a capped documented allowlist, and scripts/check_dead_code.py runs vulture at confidence 80 as a complementary dead-code gate

G1 adjacent runtime migration CLI

ready after this PR

#827 · the craik migrate inspect, plan, and dry-run import commands read adjacent agent-runtime JSON exports, map source records to proposed Craik target schemas, preserve source files, support text/JSON output, and report skipped secret-like fields without copying values

G2 migration maps

ready after this PR

#828 · object-level migration maps classify agents, profiles/personas, provider/model config, aliases, fallback chains, channels, skills, memory, sessions, schedules, sandbox, gateway, and approval/security posture as importable, partial, manual, unsupported, or skipped-secret with explicit target Craik objects and operator actions

G3 migration reports

ready after this PR

#826 · craik migrate report emits deterministic safe-to-share review artifacts with summary counts, importable objects, manual actions, skipped secrets, security posture changes, unsupported capabilities, recommended next commands, validation checklist items, and source-to-target links without raw secret values

G4 secret migration

ready after this PR

#825 · secret inventory detects nested secret-like fields without logging values, dry-run receipts write nothing by default, optional keyring import requires operator confirmation and a secure backend, file fallback blocks import, and SECURITY.md documents the trust boundary

G5 compatibility fixture suite

ready after this PR

#829 · public-safe adjacent-runtime fixtures exercise provider config, fallback chains, profiles/personas, channels, sessions, memory, skills, schedules, sandbox, gateway, approval posture, invalid JSON warnings, and redaction assertions without real operator secrets

G6 MCP server and client compatibility

ready after this PR

#831 · MCP server compatibility mode exposes safe read tools first, gated write tools require policy and receipts, JSON-RPC smoke handling returns structured denials, and MCP client import/export redacts external config and secret-like environment values

G7 session export/import compatibility

ready after this PR

#830 · portable session exports preserve source identity, redact session events, import adjacent transcript shapes as stopped imported sessions, and report unsupported tool calls as evidence instead of executable authority

G8 agent/client protocol bridge

ready after this PR

#832 · bridge decisions block calls that are missing operator auth, policy envelopes, capability grants, receipts, or redaction, and also block instruction elevation and unbounded tools; the first local bridge adapter emits redacted receipts for allowed calls and no receipts for denials

G9 i18n and localized operator surfaces

ready after this PR

#833 · stable message ids back localized operator text, CRAIK_LOCALE and --locale configure text output, missing translations fall back predictably, slash help and migration report headings localize, and translation contribution rules preserve machine-readable semantics

G10 capture-and-cache auth UX

ready after this PR

#834 · craik auth login captures provider keys through hidden prompt, writes keyring-ref profiles, reports backend-aware health, removes cached credentials on logout, migrates env-var profiles with consent, and shares auth status across slash commands, TUI, dashboard, and readiness state
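
The G4 behavior in the table above, detecting nested secret-like fields without logging values, can be sketched as a recursive walk that reports only key paths. The name pattern and the function name are assumptions for illustration; the real inventory rules are not described on this page.

```python
import re
from typing import Any, Iterator

# Hypothetical name pattern; a real inventory would likely use richer rules.
SECRET_KEY_PATTERN = re.compile(
    r"(secret|token|password|api[_-]?key|credential)", re.IGNORECASE
)


def find_secret_paths(node: Any, prefix: str = "") -> Iterator[str]:
    """Yield paths of secret-like keys without ever reading their values."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if SECRET_KEY_PATTERN.search(str(key)):
                yield path  # report the location only, never the value
            else:
                yield from find_secret_paths(value, path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_secret_paths(item, f"{prefix}[{index}]")


# Public-safe sample data; the values are placeholders, not real credentials.
sample_export = {
    "providers": [{"name": "openai", "api_key": "placeholder"}],
    "gateway": {"auth": {"token": "placeholder"}},
}
skipped = list(find_secret_paths(sample_export))
```

Because the walker stops descending as soon as a key matches, the value under a secret-like key is never touched, which keeps it out of logs and reports by construction.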

v0.12.0 Validation Commands

Run the structural gate from a clean checkout before starting functional v0.12.0 implementation goals:

uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run pytest tests/test_release_readiness_guards.py tests/test_dead_code_check.py tests/test_auth_capture_and_cache.py -v

v0.11.0 Goal Workflow

Product surfaces and channel operations are ready for release prep.

0.11.0 moves Craik beyond command-only operation with a terminal UI, authenticated dashboard, desktop companion, service-managed gateway, real channel adapters, approval UX, diagnostics/update workflows, and multimodal companion contracts. Each goal lands through its own issue and PR before release prep; all v0.11.0 implementation goal issues are closed.

Area

Status

Goal issue

Release signing-key asset rules

ready

#787 · signed annotated tag rules, GitHub Release public signing-key asset, and fingerprint verification gate

Release signing-key and URL-scheme remediation

ready

#812 · local signing-key exports are ignored, public signing keys remain GitHub Release assets, and future craik:// handlers are documented as review-only with no direct mutating actions

Terminal UI

ready

#788 · craik --tui, craik tui, shared slash commands, multiline composer, status/model/session/approval/artifact/gateway/skill panels, autocomplete metadata, redacted approval modal fixture, tests, and guide docs

Authenticated local dashboard

ready

#789 · craik dashboard, local-only default bind, token or active operator-session auth, status/provider/session/run/handoff/receipt/approval/gateway/skill/model pages, shared slash-command action route, route tests, and security docs

Dashboard action remediation

ready

#808 · dashboard action POSTs enforce local Origin checks for browser requests and reject mutating slash-command families from the generic read-only action endpoint

Dashboard session binding remediation

ready

#809 · tokenless dashboard mode requires X-Craik-Operator-Session to match the active operator session token, and preview output warns about the required header

Follow-up dashboard, service, and Discord remediation

ready after this PR

#821 · dashboard session binding uses a random per-session token instead of JWT jti, systemd gateway units tolerate executable paths with spaces, and Discord webhook diagnostics distinguish verifier unavailability from invalid signatures

Desktop companion MVP

ready

#790 · craik desktop status, menu, action, approval notification, and update-check surfaces for local dashboard launch, gateway command actions, provider/auth health, doctor, and redacted notification deep links

Gateway service lifecycle

ready

#791 · craik gateway install, uninstall, status, logs, doctor, stop, and restart with launchd/systemd generation, Windows plan, stale pid recovery, and log discovery

Gateway service executable remediation

ready

#810 · generated launchd, systemd, and Windows service-plan output uses the absolute craik executable path resolved at install time

Real channel adapters

ready

#792 · WebChat, Telegram, Discord, and Slack adapter contracts, setup and doctor commands, secret-reference plans, inbound normalization, pairing/allowlist policy gates, redacted outbound delivery receipts, and channel security docs

Channel persistence remediation

ready

#807 · channel setup writes adapter contracts, pairings, allowlists, and policy envelopes through production CLI paths; webhook ingress receipts use shared persistence; readiness checks writer reachability through wrappers

Channel webhook signature remediation

ready

#811 · webhook ingress verifies WebChat/Craik HMAC signatures, Slack signatures and replay timestamps, and Telegram secret-token headers, and fails closed with a warning for Discord native signature support

Approval UX

ready

#793 · /approvals, craik approvals list, show, approve, and deny, dashboard queue payloads, TUI approval modal/counts, desktop notifications, decision receipts, retry-path linkage, and lifecycle tests

Doctor, fix, and update

ready

#794 · craik doctor, craik doctor --fix, craik update --check, expanded operator/provider/model/gateway/channel/security diagnostics, explicit dry-run fix plans, unsafe fix confirmation, JSON output, and fixture tests

Multimodal and companion contracts

ready

#795 · voice posture, speech-to-text and text-to-speech adapter contracts, multimodal artifact references, mobile/desktop/visual companion decisions, work-graph visual bridge, accessibility requirements, transcript/media metadata redaction, and fixture tests

Release prep

ready after this PR

#823 · version declarations, changelog promotion, roadmap/readiness documentation, package-lock metadata, and release validation gate before maintainer signing
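
To make the Slack row in the table above concrete, the sketch below follows Slack's published v0 request-signing scheme: an HMAC-SHA256 over "v0:timestamp:body" plus a replay window on the timestamp. It is not Craik's verifier; the function name, the five-minute default window, and the injectable clock are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_slack_signature(
    signing_secret: str,
    timestamp: str,
    body: bytes,
    signature: str,
    now: Optional[float] = None,
    max_skew_seconds: int = 300,
) -> bool:
    """Return True only for a fresh request with a matching v0 signature."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False  # fail closed on a malformed timestamp
    current = time.time() if now is None else now
    if abs(current - sent_at) > max_skew_seconds:
        return False  # outside the replay window
    base = b"v0:" + timestamp.encode("utf-8") + b":" + body
    digest = hmac.new(signing_secret.encode("utf-8"), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The timestamp check runs before the HMAC comparison, so a captured request replayed later is rejected even though its signature is still valid.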

v0.11.0 Release Readiness

Status: ready for maintainer signing after release-prep PR lands.

The v0.11.0 milestone contains the TUI, authenticated dashboard, desktop companion, gateway service lifecycle, real channel adapters, approval UX, doctor/update workflow, multimodal companion contracts, and all follow-up remediation. Tagging remains gated on the final release-prep validation and maintainer-managed GPG signing.

Gate

Status

Evidence

Implementation

ready

Operator-facing TUI, dashboard, desktop, gateway, channel, approval, diagnostic, update, and multimodal surfaces landed through issue-linked PRs and follow-up remediation.

Security

ready

Dashboard session binding uses a random per-session token, dashboard actions reject mutating slash commands and enforce local Origin checks, gateway service units use absolute executable paths, channel signatures verify platform-specific boundaries, and release key asset rules are documented.

Structural guards

ready

Release readiness resolves writer reachability through qualified import-aware call graphs and the complementary dead-code check runs vulture at confidence 80.

Known blockers

none known

No open v0.11.0 milestone issue remains. Tagging is gated on the release-prep PR, signed annotated tag creation, GitHub Release publication, and signing-key asset verification.
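
The session-binding evidence in the Security row can be illustrated with a random per-session token and a constant-time header comparison. The X-Craik-Operator-Session header name comes from this page; the function names and the 32-byte token size are assumptions, and the sketch treats the headers mapping as case-sensitive for simplicity.

```python
import hmac
import secrets
from typing import Mapping


def new_session_token() -> str:
    """Create an unguessable per-session token from 32 random bytes."""
    return secrets.token_urlsafe(32)


def header_matches_session(headers: Mapping[str, str], active_token: str) -> bool:
    """Check X-Craik-Operator-Session against the active token."""
    presented = headers.get("X-Craik-Operator-Session", "")
    if not presented or not active_token:
        return False  # fail closed when either side is missing
    # Constant-time comparison so response timing does not reveal a prefix match.
    return hmac.compare_digest(presented.encode("utf-8"), active_token.encode("utf-8"))
```

A token generated this way carries no structure to predict or replay across sessions, unlike an identifier derived from other session metadata.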

v0.11.0 Validation Commands

Run the full release gate from a clean checkout before tagging 0.11.0:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_max_file_lines.py
uv run python scripts/quickstart_smoke.py
uv run pytest

v0.10.0 Goal Workflow

Agent shell and setup UX gate is ready for release prep.

0.10.0 turns the root CLI into a usable agent shell, adds progressive setup guidance before auth is configured, introduces browser-assisted provider login, and exposes model, session, profile, usage, and learning-loop controls through operator-facing commands. The implementation goal landed through a green PR tied to the v0.10.0 milestone before release prep began.

Area

Status

Goal issue

Agent shell and setup UX

ready

#779 · root shell launch, craik chat, slash commands, readiness states, provider login, model/session/profile controls, and learning-loop command surfaces

Release prep

ready after this PR

#781 · version declarations, changelog, roadmap, release-readiness docs, generated references, and validation gate

v0.10.0 Release Readiness

Status: ready for release checks after release-prep PR lands.

The v0.10.0 milestone contains the agent shell, progressive setup states, browser-assisted provider login, secure credential-storage posture reporting, model/session/profile UX, usage summaries, and learning-loop controls. Release prep owns the final version bump, changelog promotion, signed tag, package publication, docs publication, and post-release verification.

Gate

Status

Evidence

Implementation

ready

The root craik shell can launch before auth, one-shot chat works through craik chat and craik --one-shot, readiness states explain setup gaps, and slash commands route users toward setup, auth, provider, model, session, approval, and doctor actions.

Auth and credentials

ready

craik auth login provides browser-assisted provider setup for hosted providers and guided fallback for local models while keeping output redacted and surfacing credential-storage posture.

Runtime UX

ready

Model aliases, fallbacks, status, probes, session list/show/resume/rename/export/prune/delete, local profiles, usage summaries, and insight summaries are exposed through command surfaces and generated CLI reference docs.

Learning controls

ready

Skill telemetry, proposals, eval, promote, rollback, and history commands preserve the policy boundary: agents can propose improvement paths, but promotion remains an operator-governed action.

Tests

ready

Focused v0.10 shell, readiness, slash-command, provider-login, model/session/profile, learning-loop, release-readiness guard, and runtime-layout tests landed with the implementation PR. Full local validation passed with loopback access enabled for HTTP-server tests.

Docs

ready

Agent shell, model/session/profile UX, readiness states, slash commands, authentication, quickstart, learning-loop, roadmap, generated CLI reference, and release-readiness docs are updated.

Known blockers

none known

No unresolved v0.10.0 implementation blocker remains. Tagging is gated on release-prep validation and maintainer signing.

v0.10.0 Validation Commands

Run the full release gate from a clean checkout before tagging 0.10.0:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_dead_code.py
uv run python scripts/check_changed_file_strictness.py
uv run python scripts/check_max_file_lines.py
uv run python scripts/quickstart_smoke.py
uv run pytest

Release Signing-Key Asset Gate

Every release must be cut from a signed annotated tag and must publish the ASCII-armored public release signing key on the matching GitHub Release as craik-release-signing-key.asc. This asset is the public key operators can import or inspect; it is not a detached tag signature. The embedded Git tag signature remains the integrity check for the release commit.

git tag -v vX.Y.Z
git ls-remote --tags origin vX.Y.Z
gpg --show-keys --with-fingerprint craik-release-signing-key.asc
gh release view vX.Y.Z --repo eidetic-labs/craik --json assets --jq '.assets[].name'

Release prep is not complete until the fingerprint shown for craik-release-signing-key.asc matches the key reported by git tag -v vX.Y.Z.
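
The fingerprint comparison is easy to get wrong by eye because tools may print the same fingerprint with different spacing and letter case. The helper below is a hypothetical convenience, not part of the release scripts: it normalizes two fingerprints copied from the command output above and compares them, assuming 40-hex-digit OpenPGP v4 fingerprints. It does not parse gpg or git output.

```python
HEX_DIGITS = set("0123456789ABCDEF")


def normalize_fingerprint(text: str) -> str:
    """Drop whitespace and colons and uppercase the rest."""
    return "".join(ch for ch in text if not ch.isspace() and ch != ":").upper()


def fingerprints_match(asset_fingerprint: str, tag_fingerprint: str) -> bool:
    """Return True only when both inputs are the same full v4 fingerprint."""
    asset = normalize_fingerprint(asset_fingerprint)
    tag = normalize_fingerprint(tag_fingerprint)
    # Refuse short key IDs: only a full 40-digit fingerprint counts as a match.
    if len(asset) != 40 or not set(asset) <= HEX_DIGITS:
        return False
    return asset == tag
```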

v0.9.0 Goal Workflow

Persistent agent runtime gate is ready for release prep.

0.9.0 adds provider-backed persistent sessions, guided provider setup, Gemini and local model routes, provider certification, explicit failure recovery, and a deterministic persistent-agent launch demo. All implementation goals landed through green PRs tied to the v0.9.0 milestone before release prep began.

Area

Status

Goal issue

Provider setup

ready

#737 · guided setup for OpenAI, Anthropic, Gemini, and local models

Gemini runtime

ready

#738 · Gemini generateContent adapter, docs, and runner matrix coverage

Local model presets

ready

#740 · Ollama, LM Studio, vLLM, and generic OpenAI-compatible presets

Persistent prompt loop

ready

#741 · persistent session prompt execution with events, receipts, and handoff links

Provider certification matrix

ready

#742 · generated matrix for hosted, local, and fixture provider routes

Failure recovery

ready

#743 · stale pid/endpoint, auth, provider, sandbox, reconnect, and resume states

Persistent launch demo

ready

#744 · deterministic launch, prompt, receipts, handoff, and status demo

Security and sandbox boundaries

ready

#745 · persistent-agent state boundaries, redacted inspection surfaces, environment receipt links, and denied side-effect receipts

Readiness docs reconciliation

ready after this PR

#756 · roadmap and release-readiness docs reflect the closed v0.9.0 goal workflow

CLI auth coverage remediation

ready after this PR

#759 · the release-readiness auth guard now scans every cli_*.py module, and stateful CLI commands require the canonical operator session check, with explicit bootstrap/demo/CI policy-test exemptions

v0.9.0 Release Readiness

Status: ready for pre-release checks after docs reconciliation lands.

The v0.9.0 milestone now contains provider setup, Gemini runtime support, local model presets, provider-backed persistent sessions, provider certification, explicit failure recovery, a launch demo, persistent-agent security boundaries, MCP routing decisions, sandbox backend tracking, browser/tool boundary tracking, sandbox policy validation, and environment capability receipts. Release prep remains responsible for the version bump, final changelog section, signed tag, publish, and post-publish verification.

Gate

Status

Evidence

Implementation

ready

Persistent sessions can launch, prompt providers, persist events, carry receipt and handoff links, recover from stale process, auth, provider, and sandbox states, and route through OpenAI, Anthropic, Gemini, local, and fixture providers.

Tests

ready

Focused coverage exists for provider setup, Gemini transport, local presets, persistent prompt execution, provider certification, recovery, launch demo behavior, environment receipt linkage, and denied side-effect receipts.

Docs

ready

Persistent-agent runtime, authentication, local model setup, provider routing, provider certification, persistent-agent security, execution-environment security, and environment receipt docs are updated.

Security

ready

Session inspection uses redacted views, persistent agents retain references instead of secrets, environment receipts link provider and sandbox boundaries to sessions, and missing side-effect grants produce denial receipts.

Milestone provenance

ready

Backfill issues #765 through #773 link MCP, sandbox, browser boundary, sandbox policy, and environment receipt tiles to implementation evidence.

Known blockers

none known

No unresolved v0.9.0 implementation blocker remains. Tagging is gated on release-prep validation.

v0.9.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.9.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_changed_file_strictness.py
uv run pytest tests/test_provider_cli.py tests/test_provider_runtime.py tests/test_local_model_presets.py tests/test_agent_sessions.py tests/test_agent_session_integrity.py tests/test_provider_certification.py tests/test_cli_agents.py tests/test_cli_operator_auth.py tests/test_release_readiness_guards.py tests/test_demos.py tests/test_environment_receipts.py tests/test_sandbox_policy_boundaries.py -q
uv run pytest

v0.8.0 Goal Workflow

Operator integrations and always-on gateway gate.

0.8.0 ships a foreground gateway daemon, setup/diagnostics/update operator commands, channel contracts, messaging fixture ingress, identity pairing, allowlists, channel policy envelopes, webhook validation, scheduled automations, and gateway receipts. Each remediation issue shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before release prep.

Area

Status

Goal issue

Runnable gateway daemon and runtime state

ready

#713 · craik gateway start, pid-file lock, /health, and persisted lifecycle transitions

Gateway/channel contract persistence

ready

#723 · typed local-store helpers for adapter contracts, pairings, allowlists, schedules, automations, policies, and gateway receipts

Webhook hardening

ready

#715 · body cap, JSON depth cap, timestamp freshness, normalized signature headers, and persistent replay detection

Operator auth for diagnostics/reconfiguration

ready

#719 · active operator session required before reading local diagnostics or reconfiguring an existing setup

Schedule throttle

ready

#718 · cron-like schedules reject runs more frequent than every five minutes

Pairing token expiry

ready

#721 · channel identity pairings require and enforce expiry

Public-bind TLS warning and override

ready

#722 · public gateway binds require policy plus explicit insecure-public acknowledgement

Release-readiness structural guards

ready

#720 · writer and operator-auth guard coverage widened for v0.8.0 surfaces

End-to-end gateway pipeline test

ready

#717 · integrated daemon, webhook, channel policy, receipt, schedule, and store coverage

Readiness, security, and changelog reconciliation

ready after this PR

#714 · docs now reflect shipped behavior and residual limitations

Release prep

pending

#716 · version bump, final changelog section, tag, publish, and verification remain the final gate
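
The schedule throttle row (#718) states a five-minute floor for cron-like schedules. A minimal sketch of such a check follows; it is an illustration only, not Craik's implementation. It assumes five-field cron expressions, inspects only the minute field, and conservatively treats the schedule as firing every hour. All names are hypothetical.

```python
MIN_INTERVAL_MINUTES = 5  # floor stated in the readiness record


def expand_minute_field(field: str) -> list:
    """Expand a cron minute field (`*`, `*/n`, `a-b`, `a-b/n`, `a,b`) into minutes."""
    minutes = set()
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step < 1:
            raise ValueError(f"invalid step in {part!r}")
        if base == "*":
            start, end = 0, 59
        elif "-" in base:
            low, _, high = base.partition("-")
            start, end = int(low), int(high)
        else:
            start = end = int(base)
        if not (0 <= start <= end <= 59):
            raise ValueError(f"minute out of range in {part!r}")
        minutes.update(range(start, end + 1, step))
    return sorted(minutes)


def smallest_gap_minutes(minutes: list) -> int:
    """Smallest gap between consecutive firings, wrapping across the hour."""
    if len(minutes) == 1:
        return 60
    gaps = [later - earlier for earlier, later in zip(minutes, minutes[1:])]
    gaps.append(minutes[0] + 60 - minutes[-1])
    return min(gaps)


def is_schedule_allowed(cron_expression: str) -> bool:
    """Reject schedules whose minute field fires more often than the floor."""
    minute_field = cron_expression.split()[0]
    return smallest_gap_minutes(expand_minute_field(minute_field)) >= MIN_INTERVAL_MINUTES
```

Checking the wrap-around gap matters: a field such as */7 looks like a seven-minute interval but fires at minute 56 and again at minute 0, four minutes apart.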

v0.8.0 Release Readiness

Status: ready for release prep after docs reconciliation lands.

All implementation remediation goals are closed. The shipped daemon is a foreground local service with health checks and persisted lifecycle state; hosted public operation, production dispatch loops, and broad third-party channel adapters remain future surfaces.

Gate

Status

Evidence

Implementation

ready

Gateway setup, diagnostics, foreground daemon, webhook validation, messaging fixture ingress, identity pairing, allowlists, policy envelopes, schedules, automations, and gateway receipts are implemented and persisted.

Tests

ready

Focused unit coverage exists for each surface plus tests/test_v0_8_0_gateway_pipeline_e2e.py for the integrated flow.

Security

ready

Webhook payload limits, timestamp/replay checks, operator session gates, pairing expiry, schedule throttling, and public-bind policy/TLS acknowledgement are covered.

Known blockers

none known

No unresolved v0.8.0 implementation blocker remains. Tagging is gated on #716 release-prep validation.
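
The webhook checks listed in the Security gate (payload limits, timestamp freshness, signature validation, replay detection) can be composed in a single validation pass. The sketch below is an assumption-laden illustration: the limits, the signed-string layout, the rejection labels, and every name are hypothetical, and an in-memory set stands in for the persistent replay store.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

MAX_BODY_BYTES = 64 * 1024        # assumed cap, not Craik's actual limit
MAX_JSON_DEPTH = 16               # assumed cap
MAX_TIMESTAMP_SKEW_SECONDS = 300  # assumed freshness window


def json_depth(value) -> int:
    """Nesting depth counting containers only: 0 for scalars, 1 for {} or []."""
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, list):
        children = value
    else:
        return 0
    return 1 + max((json_depth(child) for child in children), default=0)


def validate_webhook(body: bytes, signature_hex: str, timestamp: float,
                     secret: bytes, seen_signatures: set,
                     now: Optional[float] = None) -> str:
    """Return "accepted" or a rejection reason for one webhook delivery."""
    now = time.time() if now is None else now
    if len(body) > MAX_BODY_BYTES:
        return "body_too_large"
    if abs(now - timestamp) > MAX_TIMESTAMP_SKEW_SECONDS:
        return "stale_timestamp"
    # Sign the timestamp together with the body so neither can be swapped.
    signed = str(int(timestamp)).encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Normalize the header value, then compare in constant time.
    presented = signature_hex.strip().lower().encode()
    if not hmac.compare_digest(expected.encode(), presented):
        return "bad_signature"
    if expected in seen_signatures:
        return "replay"
    try:
        payload = json.loads(body)
    except (ValueError, RecursionError):
        return "invalid_json"
    if json_depth(payload) > MAX_JSON_DEPTH:
        return "json_too_deep"
    seen_signatures.add(expected)
    return "accepted"
```

The cheap checks (size, freshness) run before the HMAC, and the signature is verified before any JSON parsing, so unauthenticated payloads never reach the parser.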

v0.8.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.8.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_changed_file_strictness.py
uv run pytest tests/test_v0_8_0_gateway_pipeline_e2e.py tests/test_gateway.py tests/test_webhook_ingress.py tests/test_channel_identity.py tests/test_channel_allowlist.py tests/test_channel_policy.py tests/test_scheduled_automations.py tests/test_schedules.py tests/test_gateway_receipts.py tests/test_store.py -q
uv run pytest

v0.7.0 Goal Workflow

Operator experience gate.

0.7.0 ships a read-only operator surface for inspecting project state without reading raw logs. Each goal issue must ship implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before the next goal begins.

Area

Status

Goal issue

Dashboard / TUI decision

ready

#685 · CLI-first craik operator overview · read-only snapshot and formatter contract

Work graph explorer

ready

#686 · craik operator work-graph · terminal view and JSON export

Handoff viewer

ready

#687 · craik operator handoff · durable summary, risks, receipts, and next steps

Receipt viewer

ready

#688 · craik operator receipt · capability and plugin receipt inspection

Contradiction inbox

ready

#689 · craik operator contradictions · contradiction and operator-attention queue

Evidence and assumption views

ready

#690 · craik operator evidence · evidence and assumptions kept separate

Delegation queue

ready

#691 · craik operator delegations · human delegation inspection

Budget / quota view

ready

#692 · craik operator budget · explicit missing budget and quota data

Instruction distillation view

ready

#693 · craik operator instructions · sources, snapshots, provenance, proposals, and reviews

Quality gate view

ready

#694 · craik operator quality · handoff, evidence, critic, and red-team signals

Memory impact preview

ready

#695 · craik operator memory-impact · previewed durable-memory effects

Known traps view

ready

#696 · craik operator traps · known traps and negative knowledge

Run delta view

ready

#697 · craik operator run-delta · recovery and continuity deltas

Release readiness and docs assessment

ready

#698 · final 0.7.0 readiness record, changelog, and docs assessment

v0.7.0 Release Readiness

Status: ready for release prep.

All v0.7.0 operator-experience goals have shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning. No known release blockers remain as of 2026-05-21.

Gate

Status

Evidence

Implementation

ready

Read-only craik operator commands cover overview, work graph, handoffs, receipts, contradictions, evidence, delegations, budget, instructions, quality, memory impact, traps, and run deltas.

Tests

ready

Focused CLI and formatter coverage exists for each operator surface view, with full-suite validation required before release prep.

Docs

ready

Reference docs, generated CLI docs, roadmap, changelog, and this readiness record are updated to current docs quality.

Known blockers

none known

No unresolved v0.7.0 blocker is recorded after the goal workflow assessment.

v0.7.0 Validation Commands

Run the full release gate from a clean checkout before promoting 0.7.0 into release prep:

uv run python scripts/check_version_consistency.py
uv run ruff check
uv run mypy src
uv run python scripts/generate_cli_reference.py --check
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run python scripts/check_release_readiness.py
find src -name '*.py' -print0 | xargs -0 uv run python scripts/check_max_file_lines.py
uv run pytest tests/test_cli_operator_auth.py tests/test_cli_operator_scoping.py tests/test_operator_view_sanitization.py tests/test_release_readiness_guards.py -q
uv run pytest

v0.6.0 Goal Workflow

Skills, plugins, and ecosystem foundations gate.

0.6.0 ships reusable skill contracts and governed plugin ecosystem contracts without weakening Craik's no-ambient-authority runtime model. Each goal issue shipped implementation, tests, docs, validation, a green PR, merge, issue closure, and branch pruning before the next goal began.

Area

Status

Goal issue

Skill package format

ready

#659 · craik.skill_package · semantic package versions and no runtime authority

Project-scoped and global skills

ready

#660 · craik.skill_registry · active entry and precedence invariants

Context contracts for skills

ready

#661 · craik.skill_invocation_context · package context requirements and redacted invocation records

Plugin descriptor format

ready

#662 · craik.plugin_descriptor · trust boundary, capabilities, docs, security notes, and compatibility

Probationary plugins

ready

#663 · craik.plugin_probation · evidence-backed criteria and decisions before durable trust

Plugin capability grants

ready

#664 · craik.plugin_capability_grant · explicit operations, scoped targets, approvals, expiry, and authorization helper

Plugin receipts

ready

#665 · craik.plugin_receipt · redacted descriptor, grant, probation, evidence, and handoff links

Adapter packages

ready

#666 · craik.adapter_package · semantic versions, runner modes, Python/platform compatibility, docs, and provenance

Reference integrations

ready

#667 · craik.reference_integration · safe reproducible skill, plugin, and adapter examples

Community skills docs

ready

#668 · package, context, registry, review, and security guidance

Community plugins docs

ready

#669 · descriptors, probation, grants, receipts, adapters, references, and security guidance

Release readiness and docs assessment

ready

#670 · this release record, roadmap, changelog, and release automation hygiene
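
The plugin capability grant row (#664) names explicit operations, scoped targets, approvals, and expiry. A deny-by-default authorization check over those fields might look like the sketch below. The record shape, status strings, and function name are hypothetical stand-ins, not the craik.plugin_capability_grant contract.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PluginGrant:
    """Hypothetical stand-in for a plugin capability grant record."""
    operations: frozenset
    targets: frozenset
    expires_at: datetime
    status: str  # assumed values: "approved", "denied", "approval_required"


def authorizes(grant: PluginGrant, operation: str, target: str, now: datetime) -> bool:
    """Authorize only approved, unexpired grants that name the operation and target."""
    if grant.status != "approved":
        return False  # denied and approval-required grants never authorize
    if now >= grant.expires_at:
        return False  # expired grants never authorize
    return operation in grant.operations and target in grant.targets
```

Every branch that is not an explicit match returns False, which mirrors the record's statement that denied, expired, and approval-required grants do not authorize execution.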

v0.6.0 Release Readiness

Ready for release prep.

The v0.6.0 skills, plugins, and ecosystem foundations surface is implemented in typed contracts, local-store persistence, validation helpers, operator-visible receipt formatting, reference documentation, and community guides. Release prep still owns the version bump, generated docs lockfile, release heading promotion, full release validation, tag creation, PyPI publish, docs deployment, and GitHub Release verification.

Area

Status

Validation

Runtime contracts

ready

skill_package · skill_registry · skill_invocation_context · plugin_descriptor · plugin_probation · plugin_capability_grant · plugin_receipt · adapter_package · reference_integration.

Docs

ready

Reference pages and guides cover skill packages, skill registries, skill contexts, plugin descriptors, plugin probation, plugin grants, plugin receipts, adapter packages, reference integrations, community skills, and community plugins. Sidebars and index navigation include the community guides.

Validation commands

passed

Focused slices passed across goal PRs. Final readiness validation passed uv run python scripts/check_version_consistency.py, uv run ruff check, uv run mypy src, uv run python scripts/check_doc_links.py, uv run python scripts/check_public_docs_hygiene.py, uv run python scripts/check_release_readiness.py, and full uv run pytest with loopback access enabled for local HTTP-server tests.

Security notes

reviewed

Skills remain instruction packages with no runtime authority. Plugin descriptors declare needs but grant nothing. Probation blocks durable trust until evidence-backed criteria and decisions pass. Plugin grants require explicit operations, scoped targets, expiry, and approval metadata. Denied, expired, and approval-required grants do not authorize execution. Plugin receipts must be redacted and cannot mark result metadata as unredacted.

Release automation hygiene

addressed

The publish workflow changelog extractor now accepts release headings with an em dash, en dash, or ASCII hyphen between the version and date, avoiding the v0.5.0 GitHub Release note extraction failure.

Release blocker state

none known

The v0.6.0 milestone has no open implementation goals other than this readiness issue. No critical release blocker is known before release-prep validation.

Release actions

pending release prep

Release prep must bump version declarations to 0.6.0, regenerate generated docs artifacts if required, promote the changelog heading to 0.6.0 — 2026-05-21, run the full release validation suite, create the signed tag, publish to PyPI, verify the docs site, and verify GitHub Release creation.
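
The release automation hygiene entry says the changelog extractor accepts an em dash, en dash, or ASCII hyphen between the version and the date. A minimal sketch of a heading pattern with that tolerance follows. The "## " heading prefix and optional square brackets are assumptions about the changelog layout, and the names are hypothetical; this is not the publish workflow's actual extractor.

```python
import re

# Separator may be an em dash (U+2014), an en dash (U+2013), or an ASCII hyphen.
RELEASE_HEADING = re.compile(
    r"^##\s+\[?(?P<version>\d+\.\d+\.\d+)\]?"
    r"\s*[\u2014\u2013-]\s*"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s*$"
)


def parse_release_heading(line: str):
    """Return (version, date) for a release heading, or None for other lines."""
    match = RELEASE_HEADING.match(line)
    return (match["version"], match["date"]) if match else None
```

Keeping the three separators in one character class means a heading typed with any of them resolves to the same version and date pair.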

v0.5.0 Goal Workflow

Quality, continuity, and recovery gate.

0.5.0 starts with one goal issue for each roadmap capability plus a release-readiness issue. Each issue must ship implementation, tests, docs, and requirement validation before the milestone closes.

Area

Status

Goal issue

Recovery mode

ready

#636 · craik run recover · craik.recovery_session

Runtime critic

ready

#637 · craik.runtime_critic_finding

Red team mode

ready

#638 · craik.red_team_finding

Evidence coverage score

ready

#639 · craik.evidence_coverage_score

Handoff quality score

ready

#640 · craik.handoff_quality_score

Context debt tracking

ready

#641 · craik.context_debt_record

Evidence expiration rules

ready

#642 · attestation and freshness expiry checks

Tool result attestation

ready

#643 · craik.tool_result_attestation

Knowledge freshness probes

ready

#644 · craik.knowledge_freshness_probe

Scratchpad with expiry

ready

#645 · craik.scratchpad_record

Known traps

ready

#646 · craik.known_trap

Negative knowledge

ready

#647 · craik.negative_knowledge

First-class unknowns

ready

#648 · craik.unknown_record

Release readiness and docs assessment

ready

#649 · this release record

Structured context requests

ready

#650 · craik.context_request

Agent exit discipline

ready

#651 · craik.exit_discipline_check

"What changed since last time" deltas

ready

#652 · craik.run_delta

v0.5.0 Release Readiness

Remediated and release-ready.

The v0.5.0 quality, continuity, and recovery surface is implemented in typed contracts, local-store persistence, runtime helpers, operator views, capture CLI surfaces, recovery/delta operator gates, and reference documentation. The post-readiness remediation closed the gap between contract definitions and production capture paths. Release prep covers the version declarations, changelog heading, generated docs lockfile, package verification, tag checks, PyPI publish, docs deployment verification, and GitHub Release verification.

Area

Status

Validation

Runtime contracts

ready

recovery_session · run_delta · runtime_critic_finding · red_team_finding · handoff_quality_score · evidence_coverage_score · context_debt_record · tool_result_attestation · knowledge_freshness_probe · scratchpad_record · known_trap · negative_knowledge · unknown_record · context_request · exit_discipline_check.

Operator surfaces

ready

Quality gate, known traps, negative knowledge, run delta, recovery, scratchpad, unknowns, context requests, context debt, critic findings, red-team findings, and exit-discipline states are formatted or captured without granting policy authority. Knowledge-resolution views distinguish unresolved records, verified receipt links, and missing or tampered receipt links.

CLI exercise path

ready

craik knowledge captures scratchpad, unknown, context-request, known-trap, and negative-knowledge records, and resolves unknown, context-request, and context-debt records with operator receipt linkage; craik review captures critic and red-team findings; craik run recover and craik run delta expose recovery and changed-since-last-time summaries from local durable state behind an active operator session.

Validation commands

passed

uv run pytest tests/test_v0_5_0_pipeline_e2e.py plus the focused readiness slice:

uv run pytest tests/test_recovery.py tests/test_critics.py tests/test_quality_scores.py tests/test_context_debt.py tests/test_freshness.py tests/test_known_traps.py tests/test_scratchpad.py tests/test_exit_discipline.py tests/test_operator_views.py tests/test_store.py

Security notes

reviewed

Critic and red-team findings are non-authoritative by default; freshness and evidence-expiry checks warn or block silent reliance but do not prove truth; scratchpad content expires instead of becoming project memory; negative knowledge requires evidence and scope; resolved unknowns, fulfilled context requests, and resolved context debt require operator receipt links; tool attestations and recovery sessions carry local HMAC integrity metadata; existing recovery/delta state requires an active operator session.

Release blocker state

none known

No critical v0.5.0 implementation blocker is known after remediation of the capture-layer readiness findings. Tagging is gated on the release-prep PR checks and version/tag validation.

Release actions

pending tag

0.5.0 version declarations, release notes, and docs lockfile are prepared in the release-prep branch. The release tag, PyPI package, docs deployment, and GitHub Release are verified after the release-prep PR lands.

v0.4.0 Release Readiness

Runtime instruction distillation gate.

0.4.0 lands the declared-instruction pipeline that turns project instruction files into typed, provenance-linked, reviewable constraints. Sources are registered explicitly, snapshots drive stale invalidation, extracted statements carry line/range provenance, categories and contradiction reports keep review queues explainable, approval receipts are required before a constraint becomes governing, and active constraints flow into case files and compiled prompts.

Area

Status

Release notes

Package version

shipped

pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.4.0.

Instruction source registry

ready

Projects can register declared instruction sources with typed source metadata, canonical paths, owner identity, path confinement, registry receipts, and project-scoped active source lists.

Source ingestion

ready

Markdown, Cursor rules, Codex, Copilot, and policy document sources parse into candidate statements without treating arbitrary repository Markdown as authority.

Source snapshots and stale invalidation

ready

Registered sources are hashed with normalized newlines and tracked as new, unchanged, changed, or missing; changed, missing, newly observed, or omitted sources defer derived proposals.

Line/range provenance

ready

Extracted statements persist deterministic provenance records with source ID, snapshot ID, path, line and column ranges, summaries, and excerpt hashes.

Instruction categorization

ready

Provenanced statements become reviewable proposals with deterministic category traceability across policy, security, boundary, command, instruction, handoff, memory, preference, and stale-risk classes.

Inter-source contradictions

ready

Normalized policy, boundary, command, instruction, and security-rule proposals open contradiction reports for cross-source conflicts while skipping same-source and stale deferred items.

Approval flow and receipts

ready

Proposals become governing only through explicit operator approval receipts; re-approval is idempotent, rejections are receipted, stale or contradicted approvals require override rationale, and active consumers exclude constraints whose approval receipt HMAC is missing or invalid.

Case-file and prompt-compilation integration

ready

Case files include deterministic governing distillation evidence, and compiled prompts render exactly one Active instruction constraints section with ordered items, provenance annotations, empty-state behavior, and stale-exclusion warnings.

Distillation CLI

ready

craik instructions register, ingest, list, approve, reject, and show expose source registration, pipeline execution, proposal review, approval decisions, rejection decisions, and provenance-aware item inspection through the active operator session.

Reference documentation

ready

docs/reference/instruction-sources.md, docs/reference/distilled-instructions.md, docs/reference/instruction-approval.md, and docs/guides/managing-instructions.md document the shipped v0.4.0 operator surface and link through the sidebars.

Release actions

complete

v0.4.0 is tagged, published to PyPI, and represented by the GitHub Release. The GitHub milestone is closed with zero open issues.
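
The snapshot row above describes hashing registered sources with normalized newlines and tracking them as new, unchanged, changed, or missing. A minimal sketch of that classification follows. SHA-256 and the function names are assumptions for illustration; the record does not specify the hash or the helper API.

```python
import hashlib


def snapshot_hash(text: str) -> str:
    """Hash source text after normalizing CRLF and CR line endings to LF."""
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify_source(previous_hash, current_text):
    """Return "new", "unchanged", "changed", or "missing" for one registered source.

    `previous_hash` is None when no snapshot exists yet; `current_text` is None
    when the registered file can no longer be read.
    """
    if current_text is None:
        return "missing"
    if previous_hash is None:
        return "new"
    return "unchanged" if snapshot_hash(current_text) == previous_hash else "changed"
```

Normalizing newlines before hashing keeps a checkout with different line-ending settings from registering as a content change.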

v0.4.0 Verification Commands

Run these before release prep and again before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_doc_links.py
uv run python scripts/check_public_docs_hygiene.py
uv run pytest tests/test_instruction_sources.py tests/test_instruction_ingestion.py tests/test_instruction_provenance.py tests/test_instruction_distillation.py tests/test_instruction_invalidation.py tests/test_instruction_contradictions.py tests/test_instruction_promotion.py tests/test_instruction_runtime_context.py tests/test_instruction_workflow_docs.py tests/test_instruction_pipeline_e2e.py tests/test_case_files.py tests/test_prompts.py tests/test_contracts.py -q

v0.4.0 Security Notes

The v0.4.0 trust boundary is also documented in SECURITY.md.

Instruction sources must be registered explicitly and remain confined to the registered project root before ingestion.
Raw source files and distilled proposals are evidence, not authority; only governing constraints backed by approval receipts enter case files and compiled prompts.
Stale or contradicted approvals require an explicit override and rationale, and review receipts record whether stale or contradiction guards were bypassed.
Approval receipt HMACs, backed by an owner-only local secret, are verified before governing constraints are rendered into case files, onboarding context, handoffs, or compiled prompts.
Release workflows pin GitHub Actions to immutable SHAs and attest package provenance before PyPI publish.
Stale governing items are excluded from compiled prompt context and surfaced as distillation warnings instead of silent authority.
Contradiction detection opens reviewable reports for cross-source policy, boundary, command, instruction, and security-rule conflicts.
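
The HMAC note above can be illustrated with a small filter that drops any constraint whose approval receipt fails verification. This is a sketch under stated assumptions: the canonical JSON encoding, HMAC-SHA256, the dictionary keys, and the function names are all hypothetical and are not taken from Craik's contracts.

```python
import hashlib
import hmac
import json


def receipt_mac(secret: bytes, receipt: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the receipt fields."""
    canonical = json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, canonical, hashlib.sha256).hexdigest()


def governing_constraints(constraints: list, secret: bytes) -> list:
    """Keep only constraints whose approval receipt carries a valid HMAC."""
    kept = []
    for constraint in constraints:
        receipt = constraint.get("approval_receipt")
        mac = constraint.get("approval_hmac")
        if not isinstance(receipt, dict) or not isinstance(mac, str):
            continue  # missing receipt or HMAC: not governing
        expected = receipt_mac(secret, receipt)
        # Constant-time comparison avoids leaking how many characters matched.
        if hmac.compare_digest(expected.encode(), mac.encode()):
            kept.append(constraint)
    return kept
```

Running the filter at render time, rather than only at approval time, is what lets a tampered or copied receipt be excluded before it reaches a case file or compiled prompt.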

v0.3.0 Release Readiness

Multi-agent review and coordination gate.

0.3.0 lands the governed multi-agent surface: authenticated mailbox messages, intent-lock coordination across simultaneous runs, structured debate with adjudication, cross-agent review, human delegation pause and resume, scope-change decisions, and live work-graph coordination. The same release tightens the security boundary: identity-isolated handoff consumption, role-allowlist dispatch, operator-bound delegation resolution, and authenticated mailbox sends.

Area

Status

Release notes

Package version

ready

pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.3.0.

Multi-agent messaging

ready

Receipt-backed craik agent-message and local-store helpers send and receive authenticated typed messages linked to tasks, runs, handoffs, and roles. Senders are authenticated against the run's role state, message bodies are bounded, and same-subject repeats get unique IDs instead of overwriting.

Intent-lock coordination

ready

Overlapping active scopes on the same project block before new loop phases or tool dispatch and persist a denial receipt, so simultaneous runs cannot race the same intent lock.

Structured debate

ready

The debate runtime helper creates role-linked debate turns, summarizes agreement or disagreement, and resolves by adjudication receipt or human-delegation receipt.

Cross-agent review

ready

The review protocol helper creates receipted review requests for worker results, handoffs, or debate summaries and completes them with typed findings linked back to the reviewed artifacts.

Human delegation

ready

Runs can be interrupted with receipted delegation requests, resolved or cancelled by CLI, and resumed from the recorded response. Resolution requires resolver operator identity and rejects attempts to resume a paused run opened by another operator.

Scope-change protocol

ready

Discovered work outside the current intent lock interrupts the run, records a scope-change request receipt, and exposes craik scope-change decide for explicit expand, sibling-task, handoff, or denial decisions before continuing.

Live work graph

ready

Mailbox messages, reviews, debates, delegations, and scope-change artifacts persist work-graph events that can be queried as active coordination state.

Identity isolation

ready

Consuming a handoff records an explicit consumer credential and operator assignment, rejects producer identity reuse by default, and requires an explicit continuation flag plus rationale when reuse is intentional.

Handoff consumption

ready

craik task resume --from-handoff creates a follow-up task, case file, and pending run that record source handoff provenance while requiring an explicit consumer credential and operator identity.

Role-based dispatch

ready

craik run execute --role records a policy-checked specialist role assignment, dispatch receipt, and run-level role metadata. Role dispatch requires explicit role allowlists and gates runner overrides behind the role.runner.override policy capability.

Release actions

pending

Create immutable tag v0.3.0, run the protected publish workflow, then verify PyPI and docs after publication.
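
The intent-lock row above says overlapping active scopes on the same project block and persist a denial receipt. The sketch below shows one way an overlap check could work, assuming scopes are expressed as relative path prefixes; the record does not state how Craik represents scopes, and the lock table, denial record, and names here are hypothetical.

```python
from pathlib import PurePosixPath


def scopes_overlap(a: str, b: str) -> bool:
    """Two path scopes overlap when one equals or contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def try_acquire(active_locks: dict, run_id: str, project: str, scope: str) -> dict:
    """Grant the lock, or return a denial record naming the conflicting run."""
    for holder, (held_project, held_scope) in active_locks.items():
        if holder == run_id or held_project != project:
            continue
        if scopes_overlap(held_scope, scope):
            return {"granted": False, "run": run_id,
                    "conflicts_with": holder, "scope": scope}
    active_locks[run_id] = (project, scope)
    return {"granted": True, "run": run_id, "scope": scope}
```

Comparing whole path components means src/cr does not collide with src/craik. A real implementation would also need the check and the write to happen atomically across processes, which this in-memory sketch does not attempt.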

v0.3.0 Verification Commands

Run these before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_release_tag.py --tag v0.3.0 --expected-version 0.3.0
uv run pytest tests/test_agent_mailbox.py tests/test_intent_lock_coordination.py tests/test_role_dispatch.py tests/test_scope_changes.py tests/integration/test_multi_agent_v030_flow.py -q

v0.3.0 Security Notes

Delegation resolution requires resolver operator identity and rejects attempts to resume a paused run opened by another operator.
Mailbox sends authenticate from_agent against the sender run's role state before storing the message or receipt.
Role dispatch requires explicit role allowlists and gates runner overrides behind the role.runner.override policy capability.
Mailbox message bodies are bounded and repeated same-subject messages receive unique IDs instead of overwriting the latest message.
Handoff consumption records an explicit consumer credential and operator assignment, rejects producer identity reuse by default, and requires an explicit continuation flag plus rationale when reuse is intentional.

v0.2.0 Release Readiness

Durable execution continuity gate.

0.2.0 hardens the provider-backed loop into durable execution: resumable phase boundaries, wall-clock and provider-token budgets, sandboxed shell tool dispatch, run recovery commands, tool-result attestations, and local-store migrations.

Area

Status

Release notes

Package version

ready

pyproject.toml, src/craik/__init__.py, docs/package.json, and docs/package-lock.json declare 0.2.0.

Resumable execution

ready

Interrupted runs reopen from persisted phase outputs, stable idempotency keys prevent duplicate phase output capture, and craik run resume continues unfinished provider-backed runs.

Budgets

ready

Per-run wall-clock budgets, provider token ledgers, and pre-dispatch time checks interrupt before additional provider calls or side effects when exhausted.

Sandboxed tool execution

ready

Configured shell tool calls execute through the local-process sandbox backend, propagate cancellation to in-flight commands, and record hashed tool-result attestations linked to side-effect receipts.

Recovery and observability

ready

craik run show, craik run cancel, craik run delta, and persisted exit-discipline checks expose continuity state and handoff readiness.

Storage migrations

ready

Local-store migrations now run through a registered, forward-only framework with compatibility fixtures, ordering tests, and migration failure guidance.

Release actions

pending

Create immutable tag v0.2.0, run the protected publish workflow, then verify PyPI and docs after publication.
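
The resumable execution row above relies on stable idempotency keys to prevent duplicate phase output capture. A minimal sketch of the idea follows. The key material (run ID, phase name, phase input), the SHA-256 derivation, and the dictionary-backed store are assumptions for illustration, not Craik's storage format.

```python
import hashlib


def phase_idempotency_key(run_id: str, phase: str, phase_input: str) -> str:
    """Derive a stable key: the same run, phase, and input always hash the same."""
    # A unit-separator byte keeps ("ab", "c") distinct from ("a", "bc").
    material = "\x1f".join((run_id, phase, phase_input)).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def capture_phase_output(store: dict, key: str, output: str) -> bool:
    """Record the output once; return False when the key was already captured."""
    if key in store:
        return False
    store[key] = output
    return True
```

On resume, a phase that already captured output produces the same key, so the second capture is skipped and the persisted output is reused instead of being written twice.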

v0.2.0 Verification Commands

Run these before tagging:

uv run python scripts/check_version_consistency.py
uv run python scripts/check_release_version.py
uv run python scripts/check_release_readiness.py
uv run python scripts/check_release_tag.py --tag v0.2.0 --expected-version 0.2.0
uv run pytest tests/test_loop.py tests/test_local_process_backend.py tests/test_loop_tool_dispatch.py tests/test_store.py tests/test_cli.py tests/test_handoffs.py tests/test_provider_runner.py -q

v0.2.0 Security Notes

Shell execution remains policy-gated and only registered command references are routed to the local-process sandbox backend.
The local-process backend avoids shell expansion and propagates cancellation to in-flight subprocesses.
Budget checks happen before provider calls and immediately before tool dispatch, preventing exhausted runs from producing new side effects.
Tool-result attestations hash redacted replay payloads and link each dispatched result to its side-effect receipt.
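
The budget and attestation notes above can be illustrated together. This is a sketch under stated assumptions rather than the shipped dispatch path: RunBudget, BudgetExhausted, attest, and dispatch_tool are hypothetical names, SHA-256 over canonical JSON is an assumed hashing scheme, and the redact callable stands in for the real redaction step.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import Optional


class BudgetExhausted(RuntimeError):
    """Raised before dispatch when a run has no budget left."""


@dataclass
class RunBudget:
    deadline: float        # monotonic-clock deadline for the whole run
    token_limit: int       # provider-token ceiling for the run
    tokens_used: int = 0

    def check(self, now: Optional[float] = None) -> None:
        """Raise if either the wall-clock or the token budget is spent."""
        now = time.monotonic() if now is None else now
        if now >= self.deadline:
            raise BudgetExhausted("wall-clock budget exhausted")
        if self.tokens_used >= self.token_limit:
            raise BudgetExhausted("provider-token budget exhausted")


def attest(redacted_payload: dict, receipt_id: str) -> dict:
    """Hash the redacted replay payload and link it to a side-effect receipt."""
    canonical = json.dumps(redacted_payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {"receipt_id": receipt_id, "sha256": digest}


def dispatch_tool(budget, tool, redact, receipt_id):
    """Check the budget immediately before the side effect, then attest the result."""
    budget.check()            # an exhausted run stops here, before any side effect
    result = tool()
    return result, attest(redact(result), receipt_id)
```

Placing the check as the first statement of the dispatch function is the design point: the tool callable is never invoked once the budget is spent, so no new side effect or receipt can be produced.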

v0.1.0 Release Readiness

In-repo green.

All in-repo readiness gates are passing. The remaining work is the maintainer-driven v0.1.0 tag and the protected publication workflow.

Snapshot

| Area | Status | Notes |
| --- | --- | --- |
| Code health | green | CI, CodeQL, version checks, file-size budget, build, doctor all pass. |
| Test coverage | green | HTTP transport, credentials, OIDC, governance, redaction, handoffs. |
| Security hygiene | green | No leaked secret patterns. Operator and credential stores are file-locked, atomic, and owner-only. |
| Documentation | green | Roadmap, README, changelog, limitations, mvp docs, Docusaurus build. |
| Operational state | green | Milestones present. 22 issues closed for v0.1.0. No blockers open. Dependabot clear. |
| External release actions | pending | Tag and publish remain maintainer actions. |

Code health

CI on main

Latest ci.yml run on main completed with conclusion success: run 26010629626.

CodeQL

Latest codeql.yml run on main completed with conclusion success: run 26010629612.

Code scanning

Zero open alerts via gh api repos/eidetic-labs/craik/code-scanning/alerts.

Version consistency

uv run python scripts/check_release_version.py.

File-size budget

find src -name "*.py" -print0 | xargs -0 uv run python scripts/check_max_file_lines.py.

`craik --version`

Prints 0.1.0 via uv run craik --version.

`craik doctor`

Runs to completion against a fresh CRAIK_HOME. An entirely empty home correctly reports missing local state.

Package artifacts

uv build produced dist/craik-0.1.0.tar.gz and dist/craik-0.1.0-py3-none-any.whl.

Test coverage

| Area | Coverage | Test files |
| --- | --- | --- |
| HTTP transport | integration | tests/integration/test_http_transport_round_trip.py |
| Credential sources | unit | API keys · local-CLI OAuth · CLI bridge · secret references · Stigmem references · marker / no-credential behavior · credential pools. Files: test_auth_api_key_source.py, test_auth_local_cli_oauth.py, test_auth_cli_bridge.py, test_auth_secret_ref.py, test_auth_profiles.py, test_auth_credential_pool.py, test_provider_runtime.py. |
| OIDC & workload identity | unit | Operator auth · session storage · GitHub Actions · Kubernetes · generic file/env tokens · RFC 8693 exchange. Files: test_oidc_operator.py, test_operator_session_store.py, test_workload_identity.py, test_oidc_exchange_secret_manager.py. |
| JWT hardening | unit | Rejects alg=none, unknown kid, tampered payloads, asymmetric/symmetric confusion (test_oidc_operator.py). |
| Governance behavior | unit | Credential-scoped receipts · operator-scoped receipts · policy-bound credentials · policy-bound operators · approval gates · expiry-as-risk · per-credential redaction · handoff identity isolation. Files: test_provider_runtime.py, test_policy.py, test_loop.py, test_case_files.py, test_redaction.py, test_handoffs.py. |

Focused readiness set: all tests passed when the combined readiness subset was run with uv run pytest tests/integration/test_http_transport_round_trip.py tests/test_auth_api_key_source.py tests/test_auth_local_cli_oauth.py tests/test_auth_cli_bridge.py tests/test_auth_secret_ref.py tests/test_auth_profiles.py tests/test_auth_credential_pool.py tests/test_oidc_operator.py tests/test_operator_session_store.py tests/test_workload_identity.py tests/test_oidc_exchange_secret_manager.py tests/test_provider_runtime.py tests/test_policy.py tests/test_case_files.py tests/test_handoffs.py -q.
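
The JWT hardening entry lists three header-level rejections: alg=none, an unknown kid, and asymmetric/symmetric confusion. A sketch of how such pre-verification checks might look is below. It is illustrative only: select_key, TokenRejected, and the shape of the keys mapping are assumptions, and signature verification (which is what catches tampered payloads) is deliberately left out.

```python
import base64
import json


class TokenRejected(ValueError):
    """Raised when a token header fails a pre-verification check."""


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding, so restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def select_key(token: str, keys: dict) -> dict:
    """Pick the verification key for a JWT, rejecting unsafe headers first.

    `keys` maps kid -> {"alg": ..., "key": ...}; this shape is hypothetical.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise TokenRejected("malformed token")
    try:
        header = json.loads(_b64url_decode(parts[0]))
    except ValueError as exc:  # covers base64, UTF-8, and JSON decoding errors
        raise TokenRejected("unreadable header") from exc
    if not isinstance(header, dict):
        raise TokenRejected("unreadable header")

    alg = header.get("alg")
    if not isinstance(alg, str) or alg.lower() == "none":
        raise TokenRejected("unsigned tokens are not accepted")

    kid = header.get("kid")
    entry = keys.get(kid) if isinstance(kid, str) else None
    if entry is None:
        raise TokenRejected("unknown kid")

    if entry["alg"] != alg:
        # Pinning the algorithm to the key blocks RS256 -> HS256 confusion,
        # where a public key would otherwise be used as an HMAC secret.
        raise TokenRejected("algorithm does not match key")
    return entry
```

The choice worth noting is that the expected algorithm comes from the key entry, never from the token header, so a token cannot talk the verifier into a weaker scheme.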

Security hygiene

Secret-pattern grep

No raw secret patterns in tests or scripts: grep -rE "sk-[a-zA-Z0-9]{20,}|xoxb-|ghp_|ANTHROPIC.{0,5}=.{20,}" tests/ scripts/.

Operator session file

Owner-only 0o600 writes in src/craik/runtime/auth/operator/store.py.

Auth profiles store

auth-profiles.json writes are file-locked and atomic via fcntl.flock + tempfile + os.replace in src/craik/runtime/auth/store.py.

Credential pool store

Pool writes are file-locked and atomic in src/craik/runtime/auth/pool.py.

Resolver errors

Reference-level error wording such as secret reference could not be resolved — never raw values.
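
The store entries above name a specific write pattern: fcntl.flock, a temporary file, and os.replace, with owner-only permissions. A minimal sketch of that pattern follows, assuming a POSIX system. The function name write_json_atomic and the sidecar .lock file are hypothetical choices for illustration, not the layout used in src/craik/runtime/auth/store.py.

```python
import fcntl
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, data: dict) -> None:
    """Write JSON under an exclusive lock, replacing the target atomically."""
    path.parent.mkdir(parents=True, exist_ok=True)
    lock_path = path.with_name(path.name + ".lock")
    # A sidecar lock file serializes writers; the target itself is swapped by rename.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # mkstemp creates the file readable and writable by the owner only (0o600).
            fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
            try:
                with os.fdopen(fd, "w") as tmp:
                    json.dump(data, tmp)
                    tmp.flush()
                    os.fsync(tmp.fileno())
                # Same-directory rename: readers see the old file or the new one.
                os.replace(tmp_name, path)
            except BaseException:
                # Remove the temp file if the write or rename failed.
                if os.path.exists(tmp_name):
                    os.unlink(tmp_name)
                raise
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Creating the temporary file in the target's own directory keeps os.replace on a single filesystem, which is what makes the swap atomic, and the replaced file inherits the temporary file's owner-only mode.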

Documentation

Roadmap gates

Exactly 12 release gates v0.1.0–v0.12.0, no gaps.

Roadmap auth scope

docs/roadmap.md states v0.1.0 includes OIDC, pluggable credentials, operator + credential identity on receipts, policy-bound auth, approval-gated first use, expiry risk, per-credential redaction, handoff identity bookkeeping.

Changelog

The CHANGELOG.md section ## 0.1.0 - 2026-05-17 narrates Phase A and Phase B.

README

"What Works Today" names OIDC and typed credential profiles.

Auth on-ramp

docs/guides/authentication.md exists and is linked from docs/index.md. docs/guides/quickstart.md covers it.

Limitations honesty

docs/limitations.md no longer treats shipped auth capabilities as future work.

MVP docs

docs/mvp.md and docs/mvp-roadmap.md reflect the expanded v0.1.0 scope including OIDC and credential profiles.

Docs build

npm run build from docs/ succeeds.

Operational state

Milestones

v0.1.0–v0.12.0 exist with titles matching the roadmap.

v0.1.0 milestone

22 closed issues · 0 open issues.

Blockers

No open PRs or open issues currently blocking the release.

Dependabot

Alert #1 fixed.

Tag posture

Tag v0.1.0 does not exist locally. Tagging is a maintainer release action and should happen only after this report is accepted.

External release actions

| Action | Status | Notes |
| --- | --- | --- |
| Create and push tag v0.1.0 | pending | Maintainer action. |
| Run protected package publication workflow | pending | Maintainer action. |
| Optional live-provider smoke tests | pending | Require real provider credentials and an operator IdP. Fixture, cassette, and in-process socket paths are already validated in-repo. |

What's next

MVP roadmap: The work this readiness report validates.

Limitations: Honest scope after v0.1.0 ships.

Security release process: The release-day procedure for security-sensitive work.
