CI/CD gates
What you'll find here
The split CI surface — pull-request gates, coverage ratchet, conformance reports, provider integration, code scanning, dependency audit, static analysis, changed-file strictness, and nightly reliability.
Split by surface so failures point at the cause.
Craik CI is split by surface so failures point at the area that regressed.
Pull request gates
| Gate | Purpose |
|---|---|
| lint | Ruff import, style, and modernization checks. |
| type | Strict mypy checks for the craik package. |
| unit | Full pytest suite with coverage XML and JUnit artifacts. |
| contract | Runtime contract fixture conformance and report generation. |
| integration | Recorded provider integration tests that do not require live credentials. |
| security | Dependency audit, static analysis, policy baseline, release readiness, public docs hygiene. |
| CodeQL | GitHub code scanning for Python on PRs and pushes to main. |
| docs | Generated CLI docs, docs hygiene, Docusaurus build. |
| package | Source distribution and wheel build, metadata validation, smoke install. |
Coverage ratchet
The threshold is currently 75% line coverage, configured in
pyproject.toml. It is deliberately conservative for the MVP ramp and
only moves upward after CI has been green at the higher threshold on
main.
Test and coverage artifacts uploaded from the unit job:
reports/tests/unit.xml
reports/coverage/coverage.xml
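As a rough illustration of the ratchet rule, the sketch below reads the overall line rate from Cobertura-style coverage XML (the format coverage.py emits) and decides whether a higher floor could be adopted. The function names, the `COVERAGE_FLOOR` constant, and the idea of passing in a list of recent main-branch results are assumptions made for the example; the real threshold is whatever pyproject.toml configures.

```python
import xml.etree.ElementTree as ET

# Assumed floor mirroring the 75% figure above; the real value lives in pyproject.toml.
COVERAGE_FLOOR = 75.0


def line_coverage_percent(coverage_xml: str) -> float:
    """Return overall line coverage as a percentage from Cobertura-style XML text."""
    root = ET.fromstring(coverage_xml)
    # The root <coverage> element stores line coverage as a 0..1 fraction.
    return float(root.attrib["line-rate"]) * 100.0


def may_raise_floor(current: float, proposed: float, recent_main_runs: list[float]) -> bool:
    """Allow a raise only if every recent main run already meets the proposed floor."""
    if proposed <= current:
        return False  # the ratchet only moves upward
    return bool(recent_main_runs) and all(pct >= proposed for pct in recent_main_runs)


# Hypothetical sample document standing in for reports/coverage/coverage.xml.
sample = '<coverage line-rate="0.8125" branch-rate="0"></coverage>'
print(line_coverage_percent(sample) >= COVERAGE_FLOOR)
```

Keeping the "has main been green at the new level" check separate from the measurement makes it harder to raise the floor on the strength of a single good run.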
Conformance reports
The contract job validates every registered runtime schema against the pinned fixture bundle and writes:
reports/conformance/contracts.json
reports/conformance/contracts.md
These reports are uploaded as CI artifacts and also produced by the nightly reliability workflow.
Provider integration
Cassettes, not live credentials.
The integration job runs `pytest -m "integration and not live" tests/integration`.
Live-shaped payloads pass through cassettes so PRs validate provider
wiring without paid API keys or secrets in CI.
Optional live provider checks stay gated behind explicit env flags
such as CRAIK_RUN_LIVE_TESTS=1 and are not part of the default PR
gate.
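One way to picture the opt-in gating is a small predicate over a test's markers and the environment. This is a sketch only: the helper names are hypothetical, and treating exactly `"1"` as the enabling value is an assumption based on the `CRAIK_RUN_LIVE_TESTS=1` example above. The default PR gate deselects `live` tests through its marker expression regardless of this flag.

```python
import os
from collections.abc import Mapping

LIVE_FLAG = "CRAIK_RUN_LIVE_TESTS"  # flag name taken from the text above


def live_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Live provider checks are opt-in: assumed to require the flag set to "1"."""
    return env.get(LIVE_FLAG, "") == "1"


def should_run(markers: set[str], env: Mapping[str, str] = os.environ) -> bool:
    """Decide whether a test carrying the given pytest-style markers should run."""
    if "live" in markers:
        return live_tests_enabled(env)
    return True  # cassette-backed tests need no credentials


print(should_run({"integration"}, {}))
print(should_run({"integration", "live"}, {}))
```

In a real suite the same predicate would typically back a skip condition on the `live` marker, so an unset flag skips those tests rather than failing them.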
Code scanning
The repo-owned CodeQL workflow runs Python analysis on pull requests
and pushes to main, using the default query suite with
build-mode: none. The analyze step uses its default SARIF upload
behavior; results publish to GitHub Security → Code scanning when
repository code-scanning configuration permits advanced workflow
uploads.
Dependency audit
The security job exports uv.lock to a hash-pinned requirements
file and runs pip-audit against that locked set. The audit runs in
pip-free mode so CI checks the committed lock state without resolving
a new dependency graph.
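The two steps might be assembled roughly as follows. This sketch only builds and prints the command lines; it does not run them. The output path is hypothetical, and the exact flags the security job passes are an assumption: `uv export --frozen` and `pip-audit --require-hashes --disable-pip` are shown as one plausible way to get the behavior described, not as the workflow's actual invocation.

```python
import shlex

# Hypothetical location for the exported, hash-pinned requirements file.
REQUIREMENTS = "reports/security/requirements.txt"


def audit_commands(requirements: str = REQUIREMENTS) -> list[list[str]]:
    """Build the export and audit command lines without executing them."""
    export = [
        "uv", "export",
        "--frozen",                       # use uv.lock as committed, do not re-lock
        "--format", "requirements-txt",
        "--output-file", requirements,    # uv includes hashes in the export by default
    ]
    audit = [
        "pip-audit",
        "--requirement", requirements,
        "--require-hashes",               # every requirement must carry a hash
        "--disable-pip",                  # audit the file as written, no pip resolution
    ]
    return [export, audit]


for command in audit_commands():
    print(shlex.join(command))
```

Passing the commands as argument lists (for example to `subprocess.run(command, check=True)`) avoids shell quoting issues and makes a non-zero exit from either tool fail the job.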
Static analysis
Bandit
Runs against src/craik with an explicit baseline for accepted findings. New findings fail the job and should either be fixed or added to the baseline only with a reviewable rationale.
Gitleaks
Runs against the checked-out tree with repo-local .gitleaks.toml. The configuration extends the default rules and only allowlists generated documentation dependency trees plus intentionally fake test credential fixtures.
Changed-file strictness
scripts/check_changed_file_strictness.py enforces minimum review
discipline.
Package source changes
Require tests.
Contract model changes
Require contract test coverage.
Workflow-only changes
Require a docs or scripts rationale change.
Guard, not substitute.
This is a low-friction guard against high-risk changes landing without visible verification — not a substitute for review.
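The three rules above can be sketched as a pure function over the list of changed paths. This is not the contents of scripts/check_changed_file_strictness.py: every path prefix below is an assumption chosen for illustration, and the workflow rule is read as "a change that touches workflows must also touch docs or scripts".

```python
# All path prefixes are illustrative assumptions, not the repository's real layout.
PACKAGE_SRC = "src/craik/"
CONTRACT_MODELS = "src/craik/contracts/"  # hypothetical
TESTS = "tests/"
CONTRACT_TESTS = "tests/contract/"        # hypothetical
WORKFLOWS = ".github/workflows/"
RATIONALE = ("docs/", "scripts/")


def strictness_violations(changed: list[str]) -> list[str]:
    """Return one message per rule that the set of changed paths breaks."""

    def touches(*prefixes: str) -> bool:
        # str.startswith accepts a tuple of prefixes.
        return any(path.startswith(prefixes) for path in changed)

    violations = []
    if touches(PACKAGE_SRC) and not touches(TESTS):
        violations.append("package source changed without tests")
    if touches(CONTRACT_MODELS) and not touches(CONTRACT_TESTS):
        violations.append("contract models changed without contract tests")
    if touches(WORKFLOWS) and not touches(*RATIONALE):
        violations.append("workflow changed without a docs or scripts rationale")
    return violations


print(strictness_violations(["src/craik/core.py"]))
```

A CI wrapper would feed this the output of a diff against the base branch and exit non-zero when the returned list is non-empty.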
Nightly reliability
The nightly workflow runs the full quality, coverage, conformance,
security, docs, and package checks on main and uploads all artifacts.