Version: MVP

Build with Craik

This section is task-oriented. Every page either gets you to a running Craik surface or documents a contract you implement against. Topics are grouped by what you are trying to build, and each group lists the guides and reference in the order most teams reach for them.

If you've never seen Craik before, do these three things first:

  1. Install Craik
  2. Quickstart
  3. Register a project

Implementation paths

Getting started

From zero to a governed first run.

Install the CLI, point it at a repository, initialize the home directory, and walk a complete governed task — without making a single live provider call. Four docs, in order.

Hands-on · 01

Quickstart

A 10-minute narrative tutorial: sandbox the Craik home, register a project, create a task, build a case file, run the policy gate, emit a handoff, and inspect what landed on disk. Uses the fixture-backed provider path — zero credentials, zero network.

  • sandboxed home
  • case file + policy gate
  • handoff
  • stigmem demo
Walk the quickstart

What you'll do

Install Craik, point it at a Git repository, build a case file, run policy tests, and emit a handoff — without making a single live provider call. Every output we discuss is real and persists on disk.

— Quickstart · §What you'll do

  1. 02 · Front door

    Installation

    Prerequisites, three install paths (pipx / pip / source), verification commands, home initialization, optional auth, and a troubleshooting matrix for the most common first-run issues.

    Craik is a Python CLI. You need a recent Python interpreter and a way to put the CLI on your PATH. (Installation · §Prerequisites)

    • python 3.12+
    • pipx / pip
    • verification
    • troubleshooting

    For: everyone · 3 min read

  2. 03 · Wire it up

    Setup wizard

    craik setup initializes the local home, the SQLite store, and a default gateway configuration. Authentication and credentials are added separately after setup completes.

    The wizard writes inspectable, non-secret configuration only. It does not ask for API keys, channel tokens, webhook secrets, or bearer credentials. (Setup · §Secrets-free)

    • home init
    • local store
    • gateway config
    • secrets-free

    For: first-time operators · 4 min read

  3. 04 · Where state lives

    Configuring Craik home

    The default ~/.craik/ layout, overrides via CRAIK_HOME, share patterns for CI runners and containerized runs, and the reset procedure. Read this before pointing Craik at a network share.

    Craik does NOT silently create project-local .craik/ directories inside your repositories. (Configuring home · §Default location)

    • home layout
    • CRAIK_HOME
    • CI runners
    • reset

    For: operators · 4 min read
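The home-directory override described above can be sketched in a few lines. This is a minimal illustration, not Craik's actual resolution code: the function name `resolve_craik_home` is hypothetical, and the precedence (CRAIK_HOME first, then ~/.craik) is assumed from the description of the default layout and its override.

```python
import os
from pathlib import Path


def resolve_craik_home(env=None):
    """Return the Craik home directory: CRAIK_HOME if set, else ~/.craik.

    Hypothetical helper; the precedence is an assumption based on the docs.
    """
    env = os.environ if env is None else env
    override = env.get("CRAIK_HOME")
    if override:
        # An explicit override wins, e.g. a per-job directory on a CI runner.
        return Path(override).expanduser()
    # Fall back to the default per-user location.
    return Path.home() / ".craik"
```

A CI job could pass a per-job directory through the environment so that parallel runs do not share state; whether that pattern suits a given runner is covered in the home configuration guide.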

Working with projects

Repositories become typed, queryable projects.

The project profile is the durable representation of a repository. Case files are the per-task brief built from it. Handoffs are how runs end. These four docs cover the full project-to-handoff lifecycle.

Entry point · 01

Project registry

Register a Git repository as a Craik project. Declares mutable docs paths, immutable evidence paths, and (optionally) a memory backend. Project records persist in the local SQLite store — Craik never writes into your repository.

  • boundaries
  • multi-project
  • inspect via onboard
  • local-only state
Register a project

Where state lives

Registration writes only to Craik local state under ~/.craik or $CRAIK_HOME. It does not create project-local .craik/ metadata inside the repository.

— Project registry · §Where state lives

  1. 02 · Typed object

    Project profile reference

    The craik.project_profile shape: stable id, repo paths, default branch, docs and immutable paths, memory backend and scope. Every case-file build and onboarding payload reads against this.

    Project profiles describe repositories Craik can reason about. (Project profile · §Intro)

    • repo metadata
    • docs boundaries
    • memory backend
    • git detection

    For: integrators · Reference

  2. 03 · Pre-run brief

    Using case files

    Build, inspect, and refresh per-task case files. Covers discovery overrides, the 10 fields to review before authorizing a run, and how to treat open assumptions as first-class items rather than bugs.

    Case files are only as good as the project they're built against. (Using case files · §Tune discovery)

    • case build
    • discovery overrides
    • assumptions
    • refresh procedure

    For: operators · 6 min read

  3. 04 · Continuity

    Writing handoffs

    Status semantics (completed/incomplete/blocked/failed), the eight things every handoff should include, anti-patterns that hurt the next agent, and the six self-audit checks that keep incomplete runs honest.

    Incomplete handoffs are valid. Dishonest handoffs are not: they contaminate the next run's context. (Writing handoffs · §Self-audit)

    • status semantics
    • anti-patterns
    • self-audit
    • continuity

    For: runners · humans · 5 min read
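The four handoff statuses listed above lend themselves to a closed enumeration. The sketch below is an illustration only: `HandoffStatus` and `parse_status` are hypothetical names, and the only facts taken from the docs are the four status values themselves.

```python
from enum import Enum


class HandoffStatus(str, Enum):
    """The four handoff statuses named in the docs (class name is hypothetical)."""
    COMPLETED = "completed"
    INCOMPLETE = "incomplete"
    BLOCKED = "blocked"
    FAILED = "failed"


def parse_status(raw: str) -> HandoffStatus:
    """Normalize a status string and reject anything outside the four values."""
    try:
        return HandoffStatus(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(s.value for s in HandoffStatus)
        raise ValueError(
            f"unknown handoff status {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown values early keeps a free-form status such as "done" from standing in for "completed" when the next agent reads the handoff.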

Connecting a runner

Codex, Claude, Gemini — or your own.

Every runner consumes the same Craik contracts. The runner-adapter boundary is intentionally runner-agnostic: adapters receive Craik state and return normalized Craik state without leaking provider-specific details into core. Eleven docs cover the contract, the three preview adapters that ship today, and the workflows around them.

  1. 02 · Contract

    Runner step contracts

    Each phase (plan / act / observe / evaluate / continue / stop) is a typed step request and step result — invoked by the loop, not the runner itself.

    Reference
  2. 03 · Contract

    Runner metadata

    Captured at adapter boundaries so receipts and handoffs can explain which runner produced work without adding provider-specific fields to the stable contract surface.

    Reference
  3. 04 · Adapter · preview

    Codex runner adapter

    Conservative v0.1 preview. Turns a compiled prompt into a normalized runner request and returns deterministic fixture results when live execution is unavailable.

    Reference
  4. 05 · Adapter · preview

    Claude runner adapter

    Focuses on prompt handoff and deterministic fixture output. Live external execution is a later milestone in the adapter roadmap.

    Reference
  5. 06 · Adapter · preview

    Gemini runner adapter

    Read/review-oriented in v0.1. Uses prompt handoff plus deterministic fixture output rather than direct external execution; full live adapter is post-MVP.

    Reference
  6. 07 · Workflow

    Runner preview workflows

    Threads the four pieces together: context discovery, policy-aware prompt compilation, runner fixture or prompt-handoff execution, and receipt-plus-handoff metadata capture.

    Guide
  7. 08 · Workflow

    Single-agent fixture loop

    Smoke-test the loop boundary without live runner credentials or external side effects. The pattern CI uses for every PR.

    Guide
  8. 09 · Reference · v0.3

    Agent roles

    craik.agent_role defines the role boundary for v0.3 multi-agent coordination. Roles describe responsibility and authority; they do not grant new runtime permissions by themselves.

    Reference
  9. 10 · Reference

    Adapter packages

    The craik.adapter_package contract records adapter identity, package version, implementation entrypoints, and the metadata plugin discovery uses.

    Reference
  10. 11 · Contract

    Worker results

    craik.worker_result preserves typed specialist output: findings, artifacts, assumptions, risks, contradiction ids, receipts. Conflicts stay conflicting — review decides later.

    Reference
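The step contract and fixture loop described in the list above can be pictured with a small sketch. The six phase names come from the docs; everything else (`StepRequest`, `StepResult`, `FixtureRunner`, `run_loop`, and their fields) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(str, Enum):
    PLAN = "plan"
    ACT = "act"
    OBSERVE = "observe"
    EVALUATE = "evaluate"
    CONTINUE = "continue"
    STOP = "stop"


@dataclass(frozen=True)
class StepRequest:
    phase: Phase
    task_id: str  # hypothetical field
    prompt: str   # hypothetical field


@dataclass(frozen=True)
class StepResult:
    phase: Phase
    runner: str   # which adapter produced the result
    output: str


class FixtureRunner:
    """Stand-in adapter: returns canned output and never calls a provider."""

    name = "fixture"

    def step(self, request: StepRequest) -> StepResult:
        output = f"fixture:{request.phase.value}:{request.task_id}"
        return StepResult(request.phase, self.name, output)


def run_loop(runner, task_id, prompt):
    """The loop, not the runner, decides which phase is invoked next.

    This single pass skips CONTINUE and goes straight to STOP.
    """
    order = (Phase.PLAN, Phase.ACT, Phase.OBSERVE, Phase.EVALUATE, Phase.STOP)
    return [runner.step(StepRequest(phase, task_id, prompt)) for phase in order]
```

Because the fixture output depends only on the request, two runs with the same inputs produce identical results, which is the property a credential-free smoke test relies on.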

Connecting a provider

The transport sits under the runner.

Provider transport is independent of the runner: OpenAI Responses, Anthropic Messages, and OAI-compatible Chat Completions are separate transport families. Seven docs cover the model-provider contract, routing decisions, the operator CLI surface, failover policy, certification, identity, and prompt compilation.

Routing input

craik.model_provider records model provider and runtime execution path metadata for provider routing.

— Model providers · §Intro

  1. 02 · Guide

    Provider routing & sandboxes

    Routing chooses model/runtime metadata. Sandbox routing chooses an execution environment. Keep those decisions separate so policy, receipts, and redaction can audit each boundary independently.

    Guide
  2. 03 · CLI

    Provider switching

    craik provider exposes the operator-facing surface for model/provider routing — list, show, and switch the active provider within the bounds the policy envelope allows.

    Reference
  3. 04 · Policy

    Provider failover

    Failover is an explicit routing policy. Craik only falls back from one provider to another when a ProviderFailoverPolicy rule matches — every fallback preserves the active policy envelope id.

    Reference
  4. 05 · MVP bar

    Provider certification

    OpenAI and Anthropic share one certification bar. Provider metadata alone is not enough; a provider is MVP-ready only when tests and receipts show the runtime can safely use it in a governed workflow.

    Reference
  5. 06 · Identity

    Authentication & credentials

    Operator identity (OIDC) is separate from credential identity (the provider account used for model calls). Every receipt records both — audit can answer "who authorized this" and "which credential carried it out" without inspecting secret material.

    Guide
  6. 07 · Tool

    Prompt compiler

    craik prompt compile turns Craik runtime state into a deterministic runner-ready prompt. It does not invoke a runner — it prepares the prompt boundary for adapter previews.

    Reference
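The failover behavior summarized in the list above (fall back only when a rule matches, and keep the policy envelope id) can be sketched as follows. `FailoverRule`, `RoutingDecision`, `fail_over`, and the provider and error strings are hypothetical; the real ProviderFailoverPolicy shape is documented in the failover reference and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailoverRule:
    from_provider: str
    to_provider: str
    on_error: str  # hypothetical error class, e.g. "timeout"


@dataclass(frozen=True)
class RoutingDecision:
    provider: str
    policy_envelope_id: str


def fail_over(
    current: RoutingDecision, error: str, rules: list[FailoverRule]
) -> Optional[RoutingDecision]:
    """Return a fallback decision only if an explicit rule matches."""
    for rule in rules:
        if rule.from_provider == current.provider and rule.on_error == error:
            # The envelope id is carried over unchanged to the fallback.
            return RoutingDecision(rule.to_provider, current.policy_envelope_id)
    # No matching rule: no implicit fallback.
    return None
```

Returning `None` when nothing matches makes the absence of a rule visible to the caller instead of silently choosing another provider.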

Connecting memory & Stigmem

Local SQLite by default — Stigmem for team-scale memory.

Craik runs in degraded local mode against the SQLite store at $CRAIK_HOME/state/craik.sqlite. When you need durable, shared team memory you connect a Stigmem node through the minimum v0.1 HTTP endpoint surface. Six docs cover the integration, the demo, and the contracts beneath.

  1. 02 · Demo

    Stigmem docs demo

    The first runnable demo. Reconciles Stigmem documentation and observed runtime state without editing files. CI exercises the same command on every PR — craik demo stigmem-docs --repo-path . --no-github.

    Guide
  2. 03 · Contract

    Memory backends

    The proposal-first interface every memory backend implements: create reviewable proposals, list by task or status, approve or reject with audit, and refuse direct durable writes without a matching grant.

    Reference
  3. 04 · Compatibility

    Stigmem compatibility

    The minimum endpoint matrix for v0.1: health, capability discovery, fact read, fact provenance. Future endpoints (direct write, federation hooks) remain explicitly post-MVP.

    Reference
  4. 05 · Persistence

    Local store

    SQLite at $CRAIK_HOME/state/craik.sqlite3. Holds projects, tasks, case files, intent locks, receipts, handoffs, memory proposals, contradictions, run state, and work-graph projections.

    Reference
  5. 06 · On-disk layout

    Local state layout

    The full ~/.craik/ directory map: config/, secrets/, state/, cache/, logs/, receipts/, handoffs/, case-files/, projects/.

    Reference
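The proposal-first memory interface described in the list above can be illustrated with a toy in-memory backend. This is a sketch under stated assumptions, not the Craik contract: the class and method names, the status strings, and the `memory:write` grant scope are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    id: int
    task_id: str
    fact: str
    status: str = "pending"  # pending | approved | rejected (assumed values)
    audit: list = field(default_factory=list)


class InMemoryBackend:
    """Toy backend: durable facts change only through reviewed proposals."""

    WRITE_SCOPE = "memory:write"  # hypothetical grant scope

    def __init__(self):
        self._proposals = {}
        self.facts = []

    def propose(self, task_id, fact):
        proposal = Proposal(len(self._proposals) + 1, task_id, fact)
        self._proposals[proposal.id] = proposal
        return proposal

    def list_proposals(self, task_id=None, status=None):
        return [
            p for p in self._proposals.values()
            if (task_id is None or p.task_id == task_id)
            and (status is None or p.status == status)
        ]

    def review(self, proposal_id, reviewer, approve):
        proposal = self._proposals[proposal_id]
        proposal.status = "approved" if approve else "rejected"
        proposal.audit.append((reviewer, proposal.status))  # audit trail
        if approve:
            self.facts.append(proposal.fact)
        return proposal

    def write_direct(self, fact, grant=None):
        # Direct durable writes are refused unless a matching grant is given.
        if grant != self.WRITE_SCOPE:
            raise PermissionError("direct durable write refused: no matching grant")
        self.facts.append(fact)
```

The point of the shape is that an agent can always draft a proposal, while the durable fact list changes only on approval or under an explicit grant.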

CLI & configuration

Flags, env vars, file shapes — the operator surface.

Four docs cover the addressable command surface and how it composes with configuration: the autogenerated CLI reference, the CRAIK_HOME-backed configuration model, the read-only GitHub adapter, and the CI/CD gate matrix that runs on every PR.

  1. 02 · Config

    Config reference

    Craik v0.1 is configured primarily through environment variables and local state under CRAIK_HOME. The reference enumerates every recognized variable and its purpose.

    Reference
  2. 03 · Adapter

    GitHub config

    The GitHub adapter is read-only in v0.1: issues, PRs, comments, review state, and CI checks flow in as case-file evidence. Direct writes are explicitly post-MVP.

    Reference
  3. 04 · Gates

    CI/CD gates

    Craik CI is split by surface so failures point at the area that regressed: unit, contract, integration, quickstart smoke, policy gate, lint, type, security, and CodeQL.

    Reference

Side effects, failure & recovery

Where Craik draws the line.

Side effects pass through policy, grant, redaction, and receipt boundaries — never through a raw call. When a run fails, recovery mode gives the next agent a bounded continuity view before it acts. Six docs cover the wrappers, the failure posture, recovery, provenance, the self-audit, and what's explicitly post-MVP.

  1. 02 · Posture

    Failure modes

    The fail-closed MVP posture. Prompt-injection containment, secret rejection at persistence, denied-capability handling, fail-open visibility, automation stops, and the explicit list of paths the MVP does not claim.

    Reference
  2. 03 · Continuity

    Recovery mode

    Bounded continuity view for a resuming agent. Summarizes the latest handoff, case files, receipts, open contradictions, and active instruction constraints. It does not resolve contradictions or replace policy checks.

    Reference
  3. 04 · Provenance

    Public boundary & provenance

    Public docs must not expose private paths, raw credentials, or internal task labels. craik.runtime.projects.public_docs provides the machine-checkable MVP boundary.

    Reference
  4. 05 · Honesty

    Self-audit

    Every structured handoff includes a self_audit object: the six honesty checks (schema, redaction, receipts, assumptions, validation, policy exceptions) that keep incomplete runs from masquerading as complete.

    Reference
  5. 06 · Post-MVP

    Post-MVP scope

    The 0.x MVP is not a 1.0 compatibility promise. Scope explicitly excludes hosted gateway dispatch, operator dashboards, additional live runners, marketplace workflows, and visual companion surfaces.

    Reference
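The six self-audit checks named in the list above can be modeled as a small record. The check names follow the docs; the boolean representation, the `SelfAudit` class, and the rule that a run may claim completion only when every check passes are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SelfAudit:
    """One flag per honesty check (boolean shape is an assumption)."""
    schema: bool
    redaction: bool
    receipts: bool
    assumptions: bool
    validation: bool
    policy_exceptions: bool


def failed_checks(audit: SelfAudit) -> list[str]:
    """Names of the checks that did not pass, in declaration order."""
    return [f.name for f in fields(audit) if not getattr(audit, f.name)]


def may_claim_completed(audit: SelfAudit) -> bool:
    """Assumed rule: 'completed' is only honest when every check passed."""
    return not failed_checks(audit)
```

A run with a failing check can still emit a valid handoff; under this sketch it would simply report a status other than completed and list what failed.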

Skills & plugins

Growing the runtime — without giving up governance.

Skills package reusable operating guidance. Plugins package executable extensions. Both have to compose with policy, grants, receipts, provenance, and promotion gates so reviewers can audit what changed and why. Fourteen docs cover the contracts.

Note: broad marketplace and community-ecosystem workflows are explicitly post-MVP scope. The MVP ships the contracts and approval flow; distribution is later.

  1. 02 · Guide · plugins

    Community plugins

    Plugins package executable extensions. Treat them as untrusted until their descriptor, provenance, review state, grants, and receipts have been inspected. Marketplace workflows are post-MVP.

    Guide
  2. 03 · Contract · skill

    Skill packages

    craik.skill_package records reusable instructions, entrypoints, docs, and assets. Packages do not carry plugin runtime authority — they are pure operating guidance.

    Reference
  3. 04 · Contract · skill

    Skill registries

    craik.skill_registry records project-local and global skill entries and where each came from — so a reviewer can audit which skills a project can use at a given point in time.

    Reference
  4. 05 · Contract · skill

    Skill invocation contexts

    craik.skill_invocation_context links a skill run to its task, skill package, policy envelope, and optional handoff — the auditable boundary for one skill invocation.

    Reference
  5. 06 · Contract · skill

    Skill telemetry

    SkillPerformanceTelemetry records how one invocation behaved without allowing the agent to silently rewrite reusable guidance.

    Reference
  6. 07 · Contract · skill

    Skill proposals

    SkillChangeProposal lets agents draft changes to reusable operating guidance without silently changing their own authority. Reviewer approval gates promotion.

    Reference
  7. 08 · Contract · skill

    Skill replay

    SkillReplayFixture compares current skill behavior against redacted, reproducible fixtures before learning-loop changes are promoted.

    Reference
  8. 09 · Contract · skill

    Skill promotion gates

    SkillPromotionRequest prevents reviewed skill proposals from becoming promoted guidance without explicit approval — every promotion is named and dated.

    Reference
  9. 10 · Contract · skill

    Skill rollbacks

    SkillRollbackTarget provides a reviewable path for reverting promoted skill updates when a promoted version causes regressions or violates policy.

    Reference
  10. 11 · Contract · plugin

    Plugin descriptors

    craik.plugin_descriptor records governed plugin metadata without granting runtime authority. Authority comes from grants and receipts — never the descriptor alone.

    Reference
  11. 12 · Contract · plugin

    Plugin probation

    craik.plugin_probation keeps new or changed plugins out of durable trust until review criteria are satisfied. Probation is the gate between "available" and "trusted".

    Reference
  12. 13 · Contract · plugin

    Plugin receipts

    craik.plugin_receipt records plugin actions and outputs under policy — joinable to the task, actor, and the plugin descriptor that authorized the call.

    Reference
  13. 14 · Contract · plugin

    Plugin capability grants

    The grant shape that authorizes a plugin to act under policy. Like regular capability grants but with plugin-descriptor binding so the authority can be retracted at the plugin boundary.

    Reference
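The separation between a plugin descriptor and plugin authority, as described in the list above, can be sketched like this. The dataclasses, field names, and the `revoked` flag are hypothetical; the only idea taken from the docs is that a descriptor alone grants nothing and that a grant is bound to a specific plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginDescriptor:
    plugin_id: str
    version: str


@dataclass(frozen=True)
class PluginGrant:
    plugin_id: str        # binds the grant to one plugin descriptor
    capability: str       # hypothetical capability name
    revoked: bool = False


def is_authorized(descriptor, capability, grants):
    """A descriptor alone grants nothing; a live, matching grant is required."""
    return any(
        g.plugin_id == descriptor.plugin_id
        and g.capability == capability
        and not g.revoked
        for g in grants
    )
```

Binding each grant to a plugin id is what lets authority be retracted for one plugin without touching grants held by others.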

Integrations & migrations

GitHub, MCP, adjacent runtimes, and migration paths from existing tools.

Work-graph export

The typed graph — exportable for review and audit.

A single command exports the work graph as deterministic, redacted JSON or Graphviz DOT. Use it for review, audit, or visualization — the export is a read-only projection over the runtime objects already in $CRAIK_HOME/state/.
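To make "deterministic" concrete, here is a minimal sketch of an export that yields the same bytes regardless of input order. It does not reproduce Craik's exporter: the node and edge dictionaries, the `id`, `kind`, `from`, and `to` keys, and both function names are hypothetical, and redaction is omitted entirely.

```python
import json


def export_json(nodes, edges):
    """Serialize the graph with sorted nodes, edges, and keys."""
    payload = {
        "nodes": sorted(nodes, key=lambda n: n["id"]),
        "edges": sorted(edges, key=lambda e: (e["from"], e["to"])),
    }
    return json.dumps(payload, sort_keys=True, indent=2)


def export_dot(nodes, edges):
    """Render the same graph as Graphviz DOT, in the same stable order."""
    lines = ["digraph work_graph {"]
    for n in sorted(nodes, key=lambda n: n["id"]):
        lines.append(f'  "{n["id"]}" [label="{n["kind"]}"];')
    for e in sorted(edges, key=lambda e: (e["from"], e["to"])):
        lines.append(f'  "{e["from"]}" -> "{e["to"]}";')
    lines.append("}")
    return "\n".join(lines)
```

Sorting before serializing means two exports of the same state can be compared with a plain diff, which is what makes the output practical for review and audit.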

Where to go next

  • Run, monitor, and maintain · Operate
  • Govern execution · Secure
  • Understand the model · Learn