How aodit maps to Swiss financial market supervision requirements for AI in regulated institutions.
For CROs, compliance officers, and model risk teams evaluating independent AI testing evidence against FINMA requirements.
FINMA Guidance 08/2024, published 18 December 2024, is the primary Swiss regulatory anchor for AI in fintechs and insurance companies. aodit is the independent behavioral control layer that evidences how AI agents perform under stress.
Policies define intent. Behavior under stress provides evidence.
If you cannot show how your AI behaves under stress, you are not compliant, regardless of your policies.
Phase 1
Before deployment
Run independent behavioral validation before production release to verify refusal quality, risk boundaries, and user-safety performance.
Phase 2
After model or prompt updates
Run regression and drift testing after every material change to detect degraded controls before exposure reaches clients or regulators.
Phase 3
Ongoing in production
Execute periodic testing cycles to produce current evidence packs for risk committees, internal audit, and supervisory review.
Phase 4
Post-incident
Use transcript-level forensic behavioral audit to reconstruct failure patterns, quantify impact, and evidence corrective action.
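As an illustration of the Phase 2 regression check, the sketch below compares per-category scores from a baseline run against a run taken after a model or prompt update, and flags categories that dropped by more than a tolerance. This is a minimal sketch under stated assumptions, not aodit's implementation: the function name `find_regressions`, the 0-to-1 score scale, the category names, and the 0.05 tolerance are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryDelta:
    """Score change for one test category between two runs (hypothetical shape)."""
    category: str
    baseline: float
    candidate: float

    @property
    def drop(self) -> float:
        # Positive values mean the candidate run scored lower than the baseline.
        return self.baseline - self.candidate


def find_regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.05,  # assumed threshold; an institution would set its own
) -> list[CategoryDelta]:
    """Return categories whose score dropped by more than `tolerance`, worst first."""
    regressions = []
    for category, base_score in baseline.items():
        # A category missing from the candidate run is scored 0.0 here,
        # so a gap in test coverage surfaces as a regression instead of passing silently.
        new_score = candidate.get(category, 0.0)
        delta = CategoryDelta(category, base_score, new_score)
        if delta.drop > tolerance:
            regressions.append(delta)
    return sorted(regressions, key=lambda d: d.drop, reverse=True)


# Illustrative scores only; not real evaluation results.
baseline_run = {"refusal_quality": 0.92, "pii_boundary": 0.88, "hallucination": 0.80}
candidate_run = {"refusal_quality": 0.90, "pii_boundary": 0.70}

flagged = find_regressions(baseline_run, candidate_run)
for item in flagged:
    print(f"{item.category}: {item.baseline:.2f} -> {item.candidate:.2f}")
```

Treating a missing category as a failure is a deliberate choice in this sketch: a release gate that ignores untested categories would let coverage erode unnoticed.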
FINMA Guidance 08/2024 (Official PDF)
Swiss Financial Market Supervisory Authority (FINMA) · Published 18 December 2024
For transparency and audit-readiness, use the official FINMA notice as the primary regulatory source.
Strong indicates direct behavioral evidence coverage. Partial indicates supporting evidence only. Not covered indicates an intentional boundary.
| FINMA Principle | aodit Level | Control Relevance & Boundary |
|---|---|---|
| 2.1 Governance | Partial | aodit provides independent evidence that governance committees can challenge and act on. It does not design or audit governance operating models, role design, or committee mandates. This is an explicit boundary, not a hidden gap. |
| 2.2 Inventory and risk classification | Partial | aodit tests named agents submitted for assessment and quantifies behavioral risk for each one. It does not discover every AI asset across the institution or own enterprise inventory completeness. |
| 2.3 Data quality | Not covered | FINMA expects institutions to control training-data quality, bias, and lineage. aodit evaluates live behavioral performance and does not inspect training datasets, labeling pipelines, or data engineering controls. This is an explicit boundary, not a hidden gap. |
| 2.4 Tests and ongoing monitoring | Strong | This is aodit's core control surface. The 8-turn adversarial protocol across 30 categories and six dimensions tests the failure modes FINMA expects institutions to control: robustness, correctness, refusal quality, and adversarial resilience. |
| 2.5 Documentation | Strong | Each run produces a structured evidence pack: methodology, transcripts, category scores, calibration gap, and decision-ready findings. This creates documentation that can be tabled in risk governance, internal audit, and supervisory review. |
| 2.6 Explainability | Strong | FINMA expects outcomes that can be challenged by management, audit, and supervisors. aodit reports are built for that challenge process, with plain-language findings and calibration evidence showing whether confidence tracks actual accuracy under stress. |
| 2.7 Independent verification | Strong | aodit is designed as an independent behavioral evaluation framework. It separates test evidence from model ownership, reducing self-attestation risk in high-stakes control decisions. |
aodit covers the behavioral risk surface expected by major frameworks. It does not certify compliance with those frameworks and does not replace statutory obligations.
| Standard | Level | Behavioral Coverage & Limits |
|---|---|---|
| NIST AI RMF AI 100-1 (2023) | Strong | aodit operationalizes NIST's Measure and Manage expectations at the behavioral layer, with repeatable stress testing and quantified outcomes. Governance and policy ownership remains with the institution. |
| NIST AI 100-2 (March 2025) | Strong | aodit tests adversarial failure modes in deployed agent behavior, using a multi-turn adversarial protocol rather than static checklists. It does not cover backdoor, training-time, or infrastructure-level attack classes that require direct system access. |
| ISO 42001 (2023) | Partial | ISO 42001 governs management systems. aodit provides independent behavioral test evidence that supports those systems. It does not deliver ISO management-system controls such as policy governance, organization design, or supplier oversight. |
| Swiss DSG (revised, in force 1 September 2023) | Partial | aodit directly tests whether agents respect PII boundaries during live interactions. It does not assess legal bases, retention obligations, consent operations, or data-subject rights workflows. |
| EU AI Act Articles 9-15 | Partial | Articles 9-15 require documented risk management, testing, and monitoring discipline for high-risk systems. aodit provides independent behavioral evidence for those requirements, but it is not an EU AI Act conformity assessment and does not satisfy legal obligations on its own. |
Calibration Gap is a governance signal: it shows whether model confidence tracks actual performance under stress, which is critical for challenge and escalation decisions.
The 8-turn adversarial protocol operationalizes regulatory expectations into repeatable control testing, rather than one-off point checks.
Blast Radius Limitation measures containment quality after failure, which determines operational impact, remediation urgency, and supervisory exposure.
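To make the Calibration Gap concrete, the sketch below computes one common form of it: mean stated confidence minus observed accuracy over a set of scored responses. The source does not publish aodit's formula, so this definition, the `ScoredResponse` type, and the sample values are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredResponse:
    """One graded agent response (hypothetical shape)."""
    confidence: float  # confidence stated by the agent, in [0, 1]
    correct: bool      # whether the response was graded as correct


def calibration_gap(responses: list[ScoredResponse]) -> float:
    """Mean confidence minus accuracy.

    A positive value means the agent is overconfident; a value near zero
    means stated confidence tracks actual performance. This is an assumed
    definition, not aodit's published metric.
    """
    if not responses:
        raise ValueError("at least one scored response is required")
    mean_confidence = sum(r.confidence for r in responses) / len(responses)
    accuracy = sum(r.correct for r in responses) / len(responses)
    return mean_confidence - accuracy


# Illustrative values only; not real evaluation results.
stressed_run = [
    ScoredResponse(confidence=0.9, correct=True),
    ScoredResponse(confidence=0.9, correct=False),
    ScoredResponse(confidence=0.8, correct=True),
    ScoredResponse(confidence=0.8, correct=False),
]

gap = calibration_gap(stressed_run)
print(f"calibration gap: {gap:+.2f}")
```

In this example the agent states 85% confidence on average but is right half the time, which is the kind of overconfidence a risk committee would want surfaced before approving a model.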
• Mis-selling exposure increases when hallucinated recommendations are not stress-tested before client interaction.
• Regulatory breaches become more likely when refusal behavior fails under pressure and prohibited outputs are still produced.
• Overconfident outputs without calibration evidence undermine internal challenge, external audit, and model approval decisions.
• Without independent behavioral evidence, governance degrades into policy assertion, increasing exposure to liability, fines, and reputational damage.
aodit does not certify compliance and does not replace the institution's governance or compliance program. It provides the independent behavioral evidence needed to demonstrate that controls work in practice. This is an explicit boundary, not a hidden gap.
Request an evaluation to see how aodit maps to your institution's FINMA compliance requirements.