Customer workflow: what to expect from Opencomplai
This guide walks a customer through using Opencomplai end-to-end: what to feed it, what comes out, and what obligations remain on the customer's side under the EU AI Act.
Opencomplai is a compliance toolkit, not a certification service. It produces structured, machine-readable evidence (a ScanStatusArtifact, an Annex IV dossier, and a tamper-evident ledger) that a customer can hand to an internal auditor, an external reviewer, or a notified body.
What the product actually does
- Classifies an AI system under EU AI Act risk levels: Unacceptable, High, Limited, Minimal.
- Runs deterministic compliance rules (Articles 5, 6, 25, Annex III) and produces pass/fail with a written rationale for each.
- Generates an Annex IV technical-documentation dossier (Article 11) per release candidate.
- Records every check as a Merkle-linked ledger event so the evidence chain can be independently verified.
It is not a policy/training/ticketing platform, and it does not auto-discover AI systems. Nothing happens in the running Docker stack until a customer pushes a manifest through it.
How the customer feeds it data
The only input the customer provides is a system manifest describing one AI system. Its free-text intended_purpose field is what the rule engine pattern-matches against Annex III categories.
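To make the role of intended_purpose concrete, here is a minimal sketch of keyword-based matching against Annex III areas. The keyword table, the function name match_annex_iii, and the substring strategy are all illustrative assumptions; this guide does not describe the rule engine's actual patterns.

```python
# Hypothetical keyword table for illustration only.
# This is NOT Opencomplai's real rule set.
ANNEX_III_KEYWORDS = {
    "Annex III 5(b) creditworthiness": ("credit scoring", "creditworthiness"),
    "Annex III 4(a) recruitment": ("recruitment", "hiring", "candidate ranking"),
}


def match_annex_iii(intended_purpose: str) -> list[str]:
    """Return the hypothetical Annex III areas whose keywords appear in the text."""
    text = intended_purpose.lower()
    return [
        area
        for area, keywords in ANNEX_III_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


if __name__ == "__main__":
    print(match_annex_iii("automated credit scoring for retail lending"))
    print(match_annex_iii("warehouse inventory forecasting"))
```

Even in this simplified form, the outcome depends entirely on the wording of the declaration, which is why an inaccurate intended_purpose undermines everything downstream.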
Step 1: install the CLI
Step 2: initialise a manifest per AI system
The minimal invocation, which is sufficient for a MINIMAL-risk system, needs only two flags:
opencomplai init \
--system-id "loan-decision-model" \
--intended-purpose "automated credit scoring for retail lending"
This creates system-manifest.json:
{
  "system_id": "loan-decision-model",
  "intended_purpose": "automated credit scoring for retail lending",
  "compliance_target": "EU_AI_ACT",
  "high_risk_presumption": false,
  "commit_ref": "HEAD"
}
For HIGH-risk systems (anything that maps to Annex III, including the credit-scoring example above), the manifest must carry the real Annex IV Section 2 and 3 content. Either pass the fields inline:
opencomplai init \
--system-id "loan-decision-model" \
--intended-purpose "automated credit scoring for retail lending" \
--high-risk-presumption \
--training-data-description "5M anonymised loan applications 2018-2025, EU-only, GDPR-cleared lineage in s3://opencomplai-evidence/training-data-manifest.json" \
--model-architecture "Gradient-boosted decision trees (XGBoost 2.0), 1200 trees, depth 8, calibrated isotonic probabilities" \
--monitoring-approach "Hourly PSI drift checks per protected attribute; weekly KS test against the training distribution" \
--incident-response-procedure "runbooks/credit-scoring-incident.md (15-min p0 SLA, on-call rotation in PagerDuty)"
…or supply the long-form structured fields from a JSON file:
opencomplai init \
--system-id "loan-decision-model" \
--intended-purpose "automated credit scoring for retail lending" \
--high-risk-presumption \
--section-extras-file ./manifest-extras.json
Where manifest-extras.json looks like:
{
  "training_data_description": "5M anonymised loan applications…",
  "model_architecture": "Gradient-boosted decision trees…",
  "performance_metrics": { "auc": 0.83, "calibration_error": 0.04 },
  "known_limitations": [
    "Degrades on applicants under 6 months of credit history",
    "Not validated for non-EU residency"
  ],
  "human_oversight_measures": [
    "All declines reviewed by a human underwriter within 24h",
    "Adverse-action notices generated by humans, not the model"
  ],
  "monitoring_approach": "Hourly PSI drift checks per protected attribute…",
  "incident_response_procedure": "runbooks/credit-scoring-incident.md"
}
If --high-risk-presumption is set without training_data_description or model_architecture, init emits a warning — the resulting dossier would otherwise misrepresent the system to an auditor.
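The warning condition can be reproduced in a pre-commit hook or a review script. The sketch below assumes the Section 2 fields sit at the top level of the manifest under the same names used in manifest-extras.json; the function name and the message wording are hypothetical and do not reflect the CLI's actual output.

```python
# Fields the guide says must accompany a high-risk presumption.
REQUIRED_WHEN_HIGH_RISK = ("training_data_description", "model_architecture")


def high_risk_warnings(manifest: dict) -> list[str]:
    """Return one warning per missing or blank field when high risk is presumed."""
    if not manifest.get("high_risk_presumption"):
        return []
    warnings = []
    for field in REQUIRED_WHEN_HIGH_RISK:
        value = manifest.get(field)
        # Treat absent, None, and whitespace-only values as missing.
        if not isinstance(value, str) or not value.strip():
            warnings.append(f"high_risk_presumption is set but {field} is missing")
    return warnings


if __name__ == "__main__":
    manifest = {"system_id": "loan-decision-model", "high_risk_presumption": True}
    for message in high_risk_warnings(manifest):
        print("WARNING:", message)
```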
Step 2b (optional): corroborate the declaration against code
The code corroboration scanner cross-checks your intended_purpose against AI capability signals in the repository (dependencies, imports, endpoints, model artifacts). It runs offline and never auto-edits the manifest or risk classification.
opencomplai scan --manifest system-manifest.json --repo-root .
# or opt-in during init/check:
opencomplai init ... --scan
opencomplai check --manifest system-manifest.json --scan
Honesty rules:
- Declaration remains authoritative; the scanner surfaces gaps for human reconciliation.
- "No local AI signals detected" is not a compliance verdict.
- Use --fail-on new-major in CI only when you are ready to gate on new discrepancies.
See scan command reference and examples/sample-system/under-declared-* fixtures.
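As a rough picture of what "AI capability signals" from imports could look like, the sketch below walks a repository and reports which Python files import well-known ML libraries. It is an illustration only: the library list and the function name find_ai_imports are assumptions, and the real scanner also considers dependencies, endpoints, and model artifacts, none of which is modelled here.

```python
import ast
from pathlib import Path

# Hypothetical signal list; the real scanner's rules are not documented here.
AI_LIBRARIES = {"torch", "tensorflow", "sklearn", "xgboost", "transformers"}


def find_ai_imports(repo_root: str) -> dict[str, set[str]]:
    """Map each detected AI library to the relative paths of files importing it."""
    hits: dict[str, set[str]] = {}
    root = Path(repo_root)
    for path in root.rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                names = [node.module]
            else:
                continue
            for name in names:
                top_level = name.split(".")[0]
                if top_level in AI_LIBRARIES:
                    hits.setdefault(top_level, set()).add(str(path.relative_to(root)))
    return hits


if __name__ == "__main__":
    for library, files in sorted(find_ai_imports(".").items()):
        print(library, sorted(files))
```

An empty result from a check like this says only that no listed library was imported, which mirrors the rule that "No local AI signals detected" is not a compliance verdict.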
Step 3: run a check against the running stack
The 10-step service-backed workflow runs: validate manifest → classify → run control checks → generate Annex IV dossier → append events to the evidence ledger. This is the step that actually populates the evidence-vault database and Grafana metrics.
Step 4: generate the Annex IV dossier (per release)
OPENCOMPLAI_API_URL=http://localhost:8080 opencomplai docs generate \
--manifest system-manifest.json \
--system-id "loan-decision-model" \
--commit-ref "$(git rev-parse HEAD)" \
--intended-purpose "automated credit scoring for retail lending" \
--provider-name "ACME Financial AI"
Passing --manifest ensures the Section 2/3 inputs from opencomplai init reach the dossier generator. Without it, those sections fall back to placeholder stubs — acceptable only for MINIMAL-risk systems.
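The generated dossier carries a SHA-256 bundle_checksum, so a release pipeline can re-check it before archiving the file. This guide does not specify how the checksum is computed, so the sketch below rests on a stated assumption: the digest covers the canonical JSON of the dossier with the bundle_checksum field itself removed. Treat it as a pattern to adapt, not the product's algorithm.

```python
import hashlib
import hmac
import json


def compute_bundle_checksum(dossier: dict) -> str:
    """SHA-256 over canonical JSON, excluding the checksum field (assumed scheme)."""
    body = {key: value for key, value in dossier.items() if key != "bundle_checksum"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def checksum_matches(dossier: dict) -> bool:
    """Compare the stored checksum with a freshly computed one."""
    stored = str(dossier.get("bundle_checksum", ""))
    return hmac.compare_digest(stored, compute_bundle_checksum(dossier))


if __name__ == "__main__":
    dossier = {"system_id": "loan-decision-model", "sections": {"1": "system description"}}
    dossier["bundle_checksum"] = compute_bundle_checksum(dossier)
    print(checksum_matches(dossier))
```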
Step 5: verify the evidence chain
What the customer receives
Opencomplai produces exactly four artifacts. There is no scheduled email, no compliance score, no opinion. Everything is on-demand and machine-readable.
| # | Artifact | Where | Purpose |
|---|---|---|---|
| 1 | Human assessment table | opencomplai check stdout | Quick read: risk level + per-rule pass/fail + rationale. |
| 2 | compliance-artifact.json (ScanStatusArtifact) | working directory | Machine-readable CI gate result. Fields: result, failed_controls, evidence_hashes, rationale_hash, signature. |
| 3 | dossier_<id>.json (Annex IV technical documentation) | --output-dir or evidence-vault CAS | The deliverable a regulator or notified body asks for under Article 11. Five sections: (1) system description, (2) development process + training data, (3) human oversight & monitoring, (4) logging, (5) risk management. SHA-256 bundle_checksum. |
| 4 | Merkle-linked ledger events | evidence-vault PostgreSQL | Tamper-evident audit trail (compliance_check_started, compliance_check_completed, …). Independently verifiable. |
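To show why a linked ledger is tamper-evident, the sketch below uses a linear hash chain, which is a simplification of whatever Merkle structure the evidence vault actually uses. The entry layout (payload, prev_hash, hash) and the genesis value are assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for an empty chain


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with the canonical JSON payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_event(ledger: list[dict], payload: dict) -> None:
    """Append an event that commits to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append(
        {"payload": payload, "prev_hash": prev_hash, "hash": event_hash(prev_hash, payload)}
    )


def verify_chain(ledger: list[dict]) -> bool:
    """Recompute every link; any edited payload or reordered entry fails."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != event_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    ledger: list[dict] = []
    append_event(ledger, {"event": "compliance_check_started"})
    append_event(ledger, {"event": "compliance_check_completed", "result": "pass"})
    print(verify_chain(ledger))
```

Because each entry commits to its predecessor, changing any earlier event invalidates every later hash, which is the property that lets a third party verify the trail independently.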
CI exit codes double as the report for automated gates:
| Code | Meaning |
|---|---|
| 0 | pass |
| 1 | control failure (one or more rules failed) |
| 2 | invalid manifest |
| 3 | policy block (prohibited practice, Article 5) |
| 4 | trap detected (substantial modification, Article 25) |
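A CI wrapper can turn these codes into an explicit merge decision. The sketch below encodes the table as an enum and blocks on codes 1, 3, and 4, as this guide's operating checklist describes. The names ExitCode and should_block_merge are hypothetical, and treating unknown codes as blocking is a fail-closed assumption rather than documented behaviour.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    PASS = 0
    CONTROL_FAILURE = 1
    INVALID_MANIFEST = 2
    POLICY_BLOCK = 3
    TRAP_DETECTED = 4


# Codes the guide names as merge-blocking.
BLOCKING_CODES = {ExitCode.CONTROL_FAILURE, ExitCode.POLICY_BLOCK, ExitCode.TRAP_DETECTED}


def should_block_merge(returncode: int) -> bool:
    """Return True when the exit code should stop a merge."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        return True  # unknown code: fail closed (assumption)
    return code in BLOCKING_CODES


if __name__ == "__main__":
    for returncode in (0, 1, 2, 3, 4):
        print(returncode, ExitCode(returncode).name, should_block_merge(returncode))
```

Code 2 is left non-blocking here only because the checklist lists 1/3/4; most CI systems fail a job on any non-zero exit, so an invalid manifest would normally stop the pipeline anyway.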
The customer's operating checklist
Opencomplai gives you evidence and rule outputs — it does not make you compliant on its own. A realistic operating checklist:
- One manifest per AI system, committed alongside that system's code.
- intended_purpose must be accurate: it drives Annex III classification.
- opencomplai check runs in CI on every PR, with exit codes 1/3/4 blocking merges.
- opencomplai docs generate runs on every release tag, producing the Annex IV dossier for that version. Store dossier_<id>.json + bundle_checksum as a release artifact.
- For HIGH-risk systems (Annex III match), the team owes the real substance behind the dossier. These fields are NOT auto-filled by the engine:
  - Section 2: training-data description, model architecture, performance metrics, known limitations. Supply via opencomplai init --training-data-description ... --model-architecture ... or --section-extras-file.
  - Section 3: human-oversight measures, monitoring approach, incident-response procedure. Same input path.
  - Section 5: rationale + failed-rule remediation (carried by the rule outputs).

  When the manifest does not provide these, the generator falls back to stubs ("Not specified in this release.") and the dossier's signature_status makes the trust level explicit so an auditor cannot mistake the artifact for a fully populated one.
- Run verify-ledger periodically (weekly, and before any audit). If the chain breaks, evidence is no longer trustworthy.
- Retain logs for the EU-AI-Act-required period: the default LOG_RETENTION_DAYS=2555 (7 years) is already set in Section 4 of the dossier.
- If EU_AIA_ART25_MODIFICATION_TRAP fires (substantial modification declared), do not redeploy until a fresh conformity assessment is signed off. The system enforces this with exit code 4.
- Air-gap mode: set EGRESS_ALLOWED_DESTINATIONS= (empty) in infra/compose/.env if the customer needs to prove no data leaves their network during assessments.
Gateway authentication and dossier signing modes
Gateway auth (OPENCOMPLAI_API_KEY): the gateway refuses to start unless one of the following is set in infra/compose/.env:
- OPENCOMPLAI_API_KEY=<strong-shared-secret>: every non-/health request must carry x-api-key: <secret>. This is the only acceptable production setting.
- OPENCOMPLAI_AUTH_DISABLED=1: explicit dev/CI escape hatch that logs a warning on every boot. Never use in production.
Generate a secret with:
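One option, offered as an assumption rather than the project's documented command, is Python's standard-library secrets module, which draws from the operating system's cryptographically secure random source. The function name generate_api_key and the 32-byte length are illustrative choices.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random hex string with two hex characters per byte."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into infra/compose/.env as OPENCOMPLAI_API_KEY.
    print(generate_api_key())
```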
Dossier signing modes — the dossier's own signature_status field tells an auditor exactly what trust level the artifact carries:
| signature_status | What it means | When |
|---|---|---|
| unsigned | No signature applied. Bundle checksum is still present for tamper detection. | OSS default. |
| hmac-local | HMAC-SHA256 with a local symmetric key (LOCAL_SIGNING_KEY_PATH). Verifiable only by holders of the same key. | OSS with a configured local key; adequate for in-org integrity, not for third-party audit. |
| ed25519 | Asymmetric Ed25519 signature (DOSSIER_SIGNING_KEY_PATH). Verifiable by anyone holding the corresponding published public key. | Pro/Enterprise, or any deployment that publishes a verification key. |
When both signing keys are set, Ed25519 always wins — the system never silently downgrades to a weaker mode.
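The precedence rule is easy to state as code. The sketch below selects a signature_status from the two environment variables named in the table and shows what an HMAC-SHA256 tag over a payload looks like using the standard library. The function names are hypothetical, and the exact bytes the product signs are not specified in this guide.

```python
import hashlib
import hmac
import os
from typing import Mapping, Optional


def select_signature_status(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the strongest configured mode; Ed25519 takes precedence over HMAC."""
    env = os.environ if env is None else env
    if env.get("DOSSIER_SIGNING_KEY_PATH"):
        return "ed25519"
    if env.get("LOCAL_SIGNING_KEY_PATH"):
        return "hmac-local"
    return "unsigned"


def hmac_tag(key: bytes, payload: bytes) -> str:
    """Compute an HMAC-SHA256 tag as a hex string."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def hmac_verify(key: bytes, payload: bytes, tag: str) -> bool:
    """Constant-time comparison; only holders of the same key can verify."""
    return hmac.compare_digest(hmac_tag(key, payload), tag)


if __name__ == "__main__":
    both = {"DOSSIER_SIGNING_KEY_PATH": "/keys/ed25519.pem", "LOCAL_SIGNING_KEY_PATH": "/keys/hmac.key"}
    print(select_signature_status(both))
    print(select_signature_status({}))
```

The verification function illustrates why hmac-local is unsuitable for third-party audit: checking the tag requires the same secret that created it.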
What Opencomplai will NOT do for you
- It will not auto-fill training-data lineage, performance metrics, or oversight procedures. Those are human inputs into the dossier.
- It will not classify by reading model weights or code, only by the manifest's free-text intended_purpose and explicit answers (profiling_detected, substantial_modification, high_risk_presumption).
- It is not certification. The dossier is a structured input for an internal conformity assessment or a notified-body review, not a regulator-issued stamp.
- The OSS edition produces an unsigned dossier by default, an hmac-local dossier when a symmetric key is configured, or an ed25519 dossier when an Ed25519 PEM private key is configured. HSM/KMS key management, key rotation, and a hosted multi-tenant verification view belong to the Pro/Enterprise (SaaS) tier.
Not sure if the EU AI Act applies to your system?
Use the EU AI Act Checker — an interactive wizard that runs entirely in the browser and walks through provider/deployer scope, high-risk classification, GPAI, and obligations. No account or backend required.
TL;DR
Run opencomplai init → opencomplai check → opencomplai docs generate against the Docker stack, once per AI system per release. You receive compliance-artifact.json (CI pass/fail), dossier_<id>.json (Annex IV documentation), and a tamper-evident ledger entry. That bundle is what you hand to your auditor. The Docker stack stays empty until that CLI runs.
Demo data: pre-seeded reference scenarios
The running stack is pre-loaded with five representative AI systems covering every risk tier and a range of real-world compliance narratives. These are safe to explore, reset, and re-seed at any time — everything is prefixed demo- so it cannot touch production data.
The five demo systems
| System ID | Name | Risk class | EU AI Act category | Narrative |
|---|---|---|---|---|
| demo-credit-scoring-v1 | Credit Risk Scorer v1.3.0 | HIGH | Art. 6 + Annex III §5b | Mostly passing, with a mid-period failure window on controls CTRL-002 and CTRL-005, plus bias alerts. Recovers cleanly. |
| demo-hr-hiring-v2 | HR Candidate Ranker v2.0.1 | HIGH | Annex III §4a | Passes, then 5 consecutive failures on CTRL-004, then a HITL halt (3-week scan gap), then full remediation and resumption. |
| demo-medical-triage-v1 | Medical Triage Assistant v1.1.0 | HIGH | Annex III §1a | All 30 scans pass, but the policy bundle is frozen at v1.0.0, which triggers a policy-drift alert even with a clean scan record. |
| demo-customer-chat-v1 | Customer Service Bot v3.2.0 | LIMITED | Art. 50 transparency | Continuously green. 30 passing scans, no failures, very low pending verifications. Shows what a well-maintained limited-risk system looks like. |
| demo-inventory-opt-v1 | Inventory Optimizer v1.0.4 | MINIMAL | Not listed (MINIMAL) | 30 passing scans at ~98% control pass rate. Minimal documentation required. Baseline for the simplest possible compliance posture. |
What is seeded
Across the five demo systems, the seeder injects:
- 5 risk-classification ledger events (one per system, timestamped 91 days ago, before any scan)
- 147 scan-status artifacts across a 90-day rolling window (30 per system, except HR Hiring, which has 27 because of the 3-week HITL halt gap)
- 4 Annex IV dossiers for the three HIGH-risk systems (HR Hiring gets two: pre-halt v2.0.1 and post-remediation v2.0.1-remediated)
- 5 compliance badges (one per system, result=pass, pending_verifications_count=0)
- 3 HITL ledger events for demo-hr-hiring-v2: hitl_halt (day −65), hitl_review_started (day −65), hitl_resume (day −50)
- 8 bias alerts: 4 for Credit Scoring (HIGH → HIGH → MEDIUM → LOW, showing a severity trend that resolves), 4 for HR Hiring (HIGH × 2 pre-halt, MEDIUM and LOW post-remediation)
Grafana dashboard
Open http://localhost:3001 after docker compose up. The Opencomplai — Compliance Health dashboard shows live values from the seeded data:
| Panel | Seeded value |
|---|---|
| Control pass rate | ~94.6% (139 pass / 147 total) |
| First scans completed (total) | 124 |
| Dossiers generated (total) | 4 |
| Badges issued (total) | 5+ |
| Egress blocked (total) | 0 |
Panels update in real time — every opencomplai check you run increments the counters.
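The seeded totals can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in this guide (30 scans per system, 27 for HR Hiring, 139 passing scans); the variable and function names are illustrative.

```python
# Scan counts per demo system, as listed in this guide.
SCANS_PER_SYSTEM = {
    "demo-credit-scoring-v1": 30,
    "demo-hr-hiring-v2": 27,  # three scans missing during the HITL halt
    "demo-medical-triage-v1": 30,
    "demo-customer-chat-v1": 30,
    "demo-inventory-opt-v1": 30,
}
PASSING_SCANS = 139  # from the dashboard table


def control_pass_rate(passing: int, total: int) -> float:
    """Return the pass rate as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passing / total, 1)


total_scans = sum(SCANS_PER_SYSTEM.values())

if __name__ == "__main__":
    print(total_scans)                                    # 147
    print(control_pass_rate(PASSING_SCANS, total_scans))  # 94.6
```

Once you run your own checks the counters move, so these values hold only for a freshly seeded stack.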
Using the demo CLI
The seeder runs automatically on docker compose up. To run it manually or explore the reset workflow:
Re-seed (idempotent, safe to run at any time):
docker exec compose-evidence-vault-1 python3 /app/scripts/seed_demo.py \
--gateway http://gateway-api:8080 \
--vault http://localhost:8002
Dry-run — print all payloads without writing anything:
docker exec compose-evidence-vault-1 python3 /app/scripts/seed_demo.py \
--gateway http://gateway-api:8080 \
--vault http://localhost:8002 \
--dry-run
Wipe all demo- data:
docker exec compose-evidence-vault-1 python3 /app/scripts/reset_demo.py \
--vault http://localhost:8002
Wipe and immediately re-seed:
docker exec compose-evidence-vault-1 python3 /app/scripts/reset_demo.py \
--vault http://localhost:8002 \
--reseed
From the host (if Python is installed locally):
Tracing a complete narrative end-to-end
HR Hiring (HITL halt and remediation) is the richest scenario to follow:
- Query the ledger for the halt event.
- Notice the 3-week scan gap in the timeline (indices 15–17 missing).
- Check the two dossiers: one generated before the halt (v2.0.1-demo), one after remediation (v2.0.1-remediated-demo).
- Inspect bias alerts: severity drops from HIGH to LOW across the 4 alerts, with the MEDIUM and LOW alerts recorded after remediation.
- Run verify-ledger to confirm the Merkle chain is intact across the halt.
Credit Scoring (mid-period failure recovery) shows what CTRL-002/CTRL-005 failures look like in the scan history and how bias alert severity tracks alongside (HIGH bias alerts coincide with the scan failures, then both resolve).
Medical Triage (policy drift without scan failure) is the edge case where everything looks green on the surface (100% pass rate) but the policy bundle has not been updated for 90 days, demonstrating that a clean scan record alone is not sufficient evidence of compliance.
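The Medical Triage case can be pictured as a staleness check that runs independently of scan results. In the sketch below, the 60-day threshold, the function names, and the status strings are hypothetical; this guide does not state the age at which the real policy-drift alert fires.

```python
from datetime import date


def policy_bundle_is_stale(last_updated: date, today: date, max_age_days: int = 60) -> bool:
    """Flag a policy bundle whose last update is older than the allowed age."""
    return (today - last_updated).days > max_age_days


def compliance_posture(pass_rate_percent: float, bundle_stale: bool) -> str:
    """A clean scan record does not override a stale policy bundle."""
    if bundle_stale:
        return "policy-drift alert"
    if pass_rate_percent < 100.0:
        return "scan failures present"
    return "green"


if __name__ == "__main__":
    stale = policy_bundle_is_stale(date(2025, 1, 1), date(2025, 4, 1))  # 90 days apart
    print(compliance_posture(100.0, stale))
```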