Guide · June 27, 2026 · 3 min read

"Show me the evidence": what auditors actually ask about your AI agents

ISO 42001 certification audits don't grade your AI policy — they grade your records. Here are the questions auditors are asking about agentic systems, and why a policy binder can't answer any of them.

Laila Osei

Policy & compliance

There's a moment in every ISO 42001 audit that separates the prepared from the confident. The auditor has read your AI policy. They nod. And then they say the four words the policy binder can't survive: "Show me it operating."

ISO/IEC 42001 is the first certifiable management-system standard for AI, and "management system" is the load-bearing phrase. As one audit-prep guide puts it, the audit isn't checking whether a policy exists — it's checking whether you can show repeatable governance: assigned owners, risk-based planning, operational controls, monitoring, and corrective action. Documentation is table stakes. What auditors actually want is proof of what did happen, not a description of what should happen.

For traditional software, producing that proof is annoying. For agentic systems — where the behavior is decided at runtime by a model — it's structurally hard, and most AI governance tooling doesn't produce it. As one analysis of 42001 for agentic AI notes: most tooling produces policies, dashboards, and reports. Very little produces per-action evidence that a control executed before the agent did.

The five questions to prepare for

Synthesizing the published audit guidance, the questions that surface gaps fastest look like this:

→ "Which AI systems are in scope, and who owns each one?" An inventory of agents with named owners. If your inventory and your org chart disagree, the audit effectively stops here.
→ "What is this agent allowed to do, and where is that enforced?" Not the policy document: the mechanism. A written rule an agent can ignore is a procedure, not a control.
→ "Show me the record of what it actually did." Operational logs that can reconstruct agent behavior: which tools were called, what was allowed, what was blocked, when, attributed to which run.
→ "Which actions require human oversight, and how is that enforced?" If the agent can complete a consequential sequence with no checkpoint the system enforces, oversight is an option, not a control.
→ "Show me one issue that led to corrective action." Auditors want evidence of learning: a violation caught, a policy tightened, closure verified.
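
Questions two through four describe the same thing from different angles: a check that runs before the tool call and leaves a record whichever way it decides. Here is a minimal Python sketch of that idea, assuming a hypothetical in-process gate. The names `Policy`, `Gate`, and `ActionRecord` are illustrative only; they are not drawn from ISO 42001 or from any product, and a real deployment would enforce this outside the agent's own process.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which tools may be called, and which need sign-off."""
    version: str
    allowed_tools: frozenset
    needs_approval: frozenset


@dataclass(frozen=True)
class ActionRecord:
    """One row of evidence, written whether or not the call went ahead."""
    run_id: str
    tool: str
    decision: str               # "allowed", "blocked", or "pending_approval"
    policy_version: str
    approved_by: Optional[str]
    timestamp: str


@dataclass
class Gate:
    policy: Policy
    records: list = field(default_factory=list)

    def call(self, run_id: str, tool: str, fn: Callable[[], object],
             approved_by: Optional[str] = None):
        """Decide first, record the decision, and only then run the tool."""
        if tool not in self.policy.allowed_tools:
            decision = "blocked"
        elif tool in self.policy.needs_approval and approved_by is None:
            decision = "pending_approval"
        else:
            decision = "allowed"
        self.records.append(ActionRecord(
            run_id, tool, decision, self.policy.version, approved_by,
            datetime.now(timezone.utc).isoformat(),
        ))
        if decision != "allowed":
            raise PermissionError(f"{tool}: {decision}")
        return fn()
```

The ordering is the point: the record is appended before the tool runs, so a blocked or unapproved call still leaves evidence, and each row carries the policy version that produced the decision. That version field is what lets a later corrective action ("policy tightened") show up in the records as a before and after.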

A certification auditor will ask: show me the mechanism. A policy that says "humans review outputs" is not a mechanism — it is a procedure. A mechanism is something the system cannot bypass.
— AgenticRail, on ISO 42001 for agentic AI

Why this is hard to retrofit

Every one of those questions has the same shape: it asks for a record produced at the moment the agent acted, by a control the agent couldn't bypass. That's not something you can reconstruct the quarter before your audit. Application logs help, but they're written by the same code the agent runs — an auditor (or a regulator under the EU AI Act's logging obligations) will reasonably ask what stops an errant process, or an engineer under deadline, from editing them.
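
One common way to make after-the-fact edits detectable is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below is a generic illustration under that assumption, not a description of any particular product; `append`, `verify`, and the entry layout are hypothetical names chosen for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Add an entry whose hash depends on everything written before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this only answers the auditor's question if the latest hash is also stored somewhere the writer cannot reach. Anyone who can rewrite the whole log can recompute every hash, which is why the control has to sit outside the code the agent runs.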

The organizations that pass these audits comfortably tend to have made the same move: they put the evidence-producing control in the runtime path, so that compliance evidence is a byproduct of normal operation instead of a scramble.

Where Vantio fits

This evidence problem is, almost line for line, what Vantio's audit trail was built to solve. Every agent action, whether observed, allowed, redacted, or blocked, is written to a tamper-evident, signed ledger as it happens, attributed to a specific run. The policy that allowed or blocked it is dashboard-managed, versioned, and enforced in the runtime, not in a binder. When the auditor says "show me," the answer is an export, not an archaeology project.
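
To make "signed" and "an export" concrete, here is a generic Python sketch of signing each entry with an HMAC and exporting one run's entries only if every signature checks out. This is not Vantio's implementation, which the article does not describe. The functions `sign` and `export_run`, the entry layout, and the assumption that the key lives outside the agent's process are all hypothetical.

```python
import hashlib
import hmac
import json


def sign(entry: dict, key: bytes) -> str:
    """HMAC-SHA256 over a canonical JSON form of the entry."""
    body = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def export_run(ledger: list, run_id: str, key: bytes) -> list:
    """Return one run's entries, refusing to export if any signature fails."""
    exported = []
    for item in ledger:
        if item["entry"]["run_id"] != run_id:
            continue
        # compare_digest avoids leaking information through comparison timing
        if not hmac.compare_digest(item["sig"], sign(item["entry"], key)):
            raise ValueError(f"signature mismatch in run {run_id}")
        exported.append(item["entry"])
    return exported
```

HMAC uses a shared secret, so whoever can verify can also sign. A system that hands records to an outside auditor would more likely use an asymmetric signature so the auditor can verify without holding the signing key; HMAC is used here only to keep the sketch within the standard library.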

None of this makes an audit fun. It makes it short — and it means the same records that satisfy the auditor also tell you, on any ordinary Tuesday, exactly what your agents did.
