Regulatory Alignment

FINMA AI Guidance 08/2024

How aodit maps to Swiss financial market supervision requirements for AI in regulated institutions.

For CROs, compliance officers, and model risk teams evaluating independent AI testing evidence against FINMA requirements.

Published on 18 December 2024, FINMA Guidance 08/2024 is the primary Swiss regulatory anchor for AI use at supervised financial institutions, including fintechs and insurers. aodit is the independent behavioral control layer used to evidence how AI agents perform under stress.

Key Principle

Policies define intent. Behavior under stress provides evidence.

If you cannot show how your AI behaves under stress, you are not compliant, regardless of your policies.

Where aodit Fits in the Lifecycle

Phase 1

Before deployment

Run independent behavioral validation before production release to verify refusal quality, risk boundaries, and user-safety performance.

Phase 2

After model or prompt updates

Run regression and drift testing after every material change to detect degraded controls before exposure reaches clients or regulators.

Phase 3

Ongoing in production

Execute periodic testing cycles to produce current evidence packs for risk committees, internal audit, and supervisory review.

Phase 4

Post-incident

Use transcript-level forensic behavioral audit to reconstruct failure patterns, quantify impact, and evidence corrective action.

Official FINMA Source Document

FINMA Guidance 08/2024 (Official PDF)

Swiss Financial Market Supervisory Authority (FINMA) · Published 18 December 2024

For transparency and audit-readiness, use the official FINMA notice as the primary regulatory source.


FINMA Guidance 08/2024 — Control Evidence Scorecard

Strong indicates direct behavioral evidence coverage. Partial indicates supporting evidence only. Not covered indicates an intentional boundary.

FINMA Principle · aodit Level · Control Relevance & Boundary
2.1 Governance
Partial
aodit provides independent evidence that governance committees can challenge and act on. It does not design or audit governance operating models, role design, or committee mandates. This is an explicit boundary, not a hidden gap.
2.2 Inventory and risk classification
Partial
aodit tests named agents submitted for assessment and quantifies behavioral risk for each one. It does not discover every AI asset across the institution or own enterprise inventory completeness.
2.3 Data quality
Not covered
FINMA expects institutions to control training-data quality, bias, and lineage. aodit evaluates live behavioral performance and does not inspect training datasets, labeling pipelines, or data engineering controls. This is an explicit boundary, not a hidden gap.
2.4 Tests and ongoing monitoring
Strong
This is aodit's core control surface. The 8-turn adversarial protocol across 30 categories and six dimensions tests the failure modes FINMA expects institutions to control: robustness, correctness, refusal quality, and adversarial resilience.
2.5 Documentation
Strong
Each run produces a structured evidence pack: methodology, transcripts, category scores, calibration gap, and decision-ready findings. This creates documentation that can be tabled in risk governance, internal audit, and supervisory review.
2.6 Explainability
Strong
FINMA requires outcomes that can be challenged by management, audit, and supervisors. aodit reports are built for that challenge process, with plain-language findings and calibration evidence showing whether confidence tracks actual accuracy under stress.
2.7 Independent verification
Strong
aodit is designed as an independent behavioral evaluation framework. It separates test evidence from model ownership, reducing self-attestation risk in high-stakes control decisions.
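The evidence pack described under principle 2.5 can be pictured as a simple structured record. The field names below follow the text above (methodology, transcripts, category scores, calibration gap, findings), but the concrete schema, field types, and example values are illustrative assumptions, not aodit's actual format:

```python
# Illustrative evidence-pack layout; the schema is an assumption, not
# aodit's actual output format.
from dataclasses import dataclass, field

@dataclass
class EvidencePack:
    agent_name: str
    methodology: str                  # protocol version and test design
    transcripts: list = field(default_factory=list)      # full multi-turn logs
    category_scores: dict = field(default_factory=dict)  # per-category results
    calibration_gap: float = 0.0      # stated confidence minus observed accuracy
    findings: list = field(default_factory=list)         # decision-ready summaries

# Hypothetical example values for a single assessed agent.
pack = EvidencePack(
    agent_name="advisory-bot",
    methodology="8-turn adversarial protocol, 30 categories, six dimensions",
    category_scores={"refusal_quality": 0.82, "adversarial_resilience": 0.74},
    calibration_gap=0.12,
    findings=["Refusal quality degrades under sustained authority framing."],
)
```

A flat record like this is what lets the same run be tabled unchanged in risk governance, internal audit, and supervisory review.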

Other Standards — Alignment at the Behavioral Layer

aodit covers the behavioral risk surface expected by major frameworks. It does not certify compliance with those frameworks and does not replace statutory obligations.

Standard · Level · Behavioral Coverage & Limits
NIST AI RMF AI 100-1 (2023)
Strong
aodit operationalizes NIST's Measure and Manage expectations at the behavioral layer, with repeatable stress testing and quantified outcomes. Ownership of governance and policy remains with the institution.
NIST AI 100-2 (March 2025)
Strong
aodit tests the operational failure modes regulators care about in deployed agent behavior, using a multi-turn adversarial protocol rather than static checklists. It does not cover backdoor, training-time, or infrastructure-level attack classes that require direct system access.
ISO 42001 (2023)
Partial
ISO 42001 governs management systems. aodit provides independent behavioral test evidence that supports those systems. It does not deliver ISO management-system controls such as policy governance, organization design, or supplier oversight.
Swiss DSG (in force 2023)
Partial
aodit directly tests whether agents respect PII boundaries during live interactions. It does not assess legal bases, retention obligations, consent operations, or data-subject rights workflows.
EU AI Act Articles 9-15
Partial
Articles 9-15 require documented risk management, testing, and monitoring discipline for high-risk systems. aodit provides independent behavioral evidence for those requirements, but it is not an EU AI Act conformity assessment and does not satisfy legal obligations on its own.

Three Things aodit Adds That No Standard Currently Requires

Calibration Gap is a governance signal: it shows whether model confidence tracks actual performance under stress, which is critical for challenge and escalation decisions.
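One way to make this concrete: a calibration gap can be computed as the difference between a model's mean stated confidence and its observed accuracy over a test run. aodit's exact formula is not stated here, so the sketch below is an illustrative assumption:

```python
# Illustrative sketch; aodit's actual calibration formula is an assumption here.
def calibration_gap(confidences, correct):
    """Mean stated confidence minus observed accuracy over a test run.

    A positive gap means the model is overconfident under stress;
    a gap near zero means confidence tracks actual performance.
    """
    assert confidences and len(confidences) == len(correct)
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy

# Example: an agent that claims ~90% confidence but is right 60% of the time.
gap = calibration_gap([0.95, 0.90, 0.88, 0.92, 0.85], [1, 0, 1, 0, 1])
```

A gap of 0.3 like the example above is exactly the kind of signal a challenge process needs: the agent's self-reported confidence cannot be trusted as an escalation input.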

The 8-turn adversarial protocol operationalizes regulatory expectations into repeatable control testing, rather than one-off point checks.
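The repeatable shape of such a protocol can be sketched as a loop that drives one test case through escalating pressure and records each turn. The real aodit protocol, prompt content, and refusal scoring are not public, so everything below is a hypothetical stand-in:

```python
# Hypothetical sketch of a multi-turn adversarial test loop; the real
# protocol, prompts, and scoring are assumptions, not aodit's method.
def run_adversarial_case(agent, escalating_prompts):
    """Drive one test case through escalating pressure, logging each turn.

    `agent` is any callable prompt -> reply; `escalating_prompts` is an
    ordered list (e.g. 8 turns) from benign framing to hard pressure.
    """
    transcript = []
    for turn, prompt in enumerate(escalating_prompts, start=1):
        reply = agent(prompt)
        # Toy refusal check; a real harness would score refusal quality.
        refused = reply.strip().lower().startswith("i can't")
        transcript.append({"turn": turn, "prompt": prompt, "refused": refused})
    return transcript

# Toy agent that holds its refusal on every one of 8 escalating turns.
transcript = run_adversarial_case(
    lambda p: "I can't help with that.",
    [f"pressure level {i}" for i in range(1, 9)],
)
```

Because every turn is logged, the same loop yields both a pass/fail result and the transcript-level evidence a reviewer can challenge.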

Blast Radius Limitation measures containment quality after failure, which determines operational impact, remediation urgency, and supervisory exposure.
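Containment after failure can likewise be quantified. The scoring below, the fraction of turns after the first failure that also fail, is an illustrative assumption borrowing the metric's name from the text above, not aodit's actual method:

```python
# Hedged sketch: this containment scoring is an illustrative assumption.
def blast_radius(turn_outcomes):
    """Fraction of turns after the first failure that also fail.

    0.0 means a single contained slip; 1.0 means every later turn
    compounded the failure. Returns 0.0 when no turn fails.
    """
    try:
        first_fail = turn_outcomes.index(False)
    except ValueError:
        return 0.0  # no failure at all
    later = turn_outcomes[first_fail + 1:]
    if not later:
        return 0.0  # failure on the final turn cannot propagate
    return sum(1 for ok in later if not ok) / len(later)

# One slip at turn 3, then recovery: fully contained.
contained = blast_radius([True, True, False, True, True, True])
# Failure at turn 3 that cascades through every later turn.
cascading = blast_radius([True, True, False, False, False, False])
```

Two agents with the same failure count can score very differently here, which is why containment, not just failure rate, drives remediation urgency.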

What Happens If You Do Not Test Independently

• Mis-selling exposure increases when hallucinated recommendations are not stress-tested before client interaction.

• Regulatory breaches become more likely when refusal behavior fails under pressure and prohibited outputs are still produced.

• Overconfident outputs without calibration evidence undermine internal challenge, external audit, and model approval decisions.

• Without independent behavioral evidence, governance degrades into policy assertion, increasing liability, fines, and reputational damage.

Scope Disclaimer

aodit does not certify compliance and does not replace the institution's governance or compliance program. It provides independent behavioral evidence required to demonstrate that controls work in practice. This is an explicit boundary, not a hidden gap.

Ready to Evaluate Your AI Agents?

Request an evaluation to see how aodit maps to your institution's FINMA compliance requirements.