Guide · May 21, 2026 · 3 min read

Your agent returned 200. That tells you almost nothing.

An agent can return a clean HTTP 200 while hallucinating, calling a tool it was never allowed to touch, and drifting off-policy for weeks — and your APM will smile the whole time. What an agent audit trail actually needs to capture.

Sam Okafor

Platform engineering

Traditional monitoring asks one question: is the system up? For agents, that's the wrong question, and the gap between it and the right one is where the trouble lives. As one engineering team writing about regulated workloads put it: an agent can return HTTP 200 with a hallucinated response, call an unauthorized tool while latency metrics stay flat, and drift from its policy baseline over weeks without tripping a single alert.

Your dashboard isn't lying, exactly. It's answering the question it was designed for. It just has no idea what the agent actually did.

Telemetry, observability, and an actual record

These three terms get used interchangeably, and they shouldn't be. Infrastructure telemetry is CPU, memory, latency, error rate. Useful, and completely blind to agent behavior. DataRobot's framing is the one I keep coming back to: if you can't see reasoning, tool calls, and behavior over time, you don't have observability; you have infrastructure telemetry.

But even rich observability isn't the same as a record you can hand to someone who doesn't trust you. A dashboard is for you, now. An audit trail is for an auditor, a regulator, or your own incident review, six months from now. Those need different things.

What an agent audit trail has to capture

The clearest articulation I've seen of the gap comes from Siddhant Khare's piece on agent observability, and it maps cleanly onto OpenTelemetry's data model. The shape you want:

→ A trace ID per task, a span per tool call, with parent/child relationships, so a chain of “the agent read this, which led it to change that, which failed a test, which triggered a retry” is reconstructable, not lost.
→ Structured, not scrollback. JSON lines you can query and replay, not terminal output you scrolled past. Background execution is the default now (headless runs, async tasks, background agents), and nobody is watching the terminal when it matters.
→ Every tool call, LLM request, and file access with timestamp, inputs, outputs, duration, and result.
→ The permission decision itself: what was requested, what policy applied, what was allowed or blocked. That's the line auditors actually ask about.
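
To make that shape concrete, here is a minimal sketch in Python using only the standard library. The field names (trace_id, span_id, parent_span_id, permission, and so on) and the helper functions are hypothetical, chosen to mirror the list above; they are not taken from OpenTelemetry's SDK or any particular product.

```python
import json
import time
import uuid


def make_event(trace_id, parent_span_id, kind, name, inputs, outputs,
               duration_ms, result, requested, policy, decision):
    """Build one audit event as a plain dict (hypothetical field names)."""
    return {
        "trace_id": trace_id,              # one per task
        "span_id": uuid.uuid4().hex,       # one per tool call, LLM request, or file access
        "parent_span_id": parent_span_id,  # the step that led to this one, or None
        "kind": kind,
        "name": name,
        "timestamp": time.time(),
        "inputs": inputs,
        "outputs": outputs,
        "duration_ms": duration_ms,
        "result": result,
        # The permission decision: what was requested, which policy applied, the outcome.
        "permission": {"requested": requested, "policy": policy, "decision": decision},
    }


def to_jsonl(events):
    """Serialize events as JSON lines: one queryable record per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)


def causal_chain(events, span_id):
    """Walk parent links from a span back to the root; return names root-first."""
    by_id = {e["span_id"]: e for e in events}
    chain = []
    current = by_id.get(span_id)
    while current is not None:
        chain.append(current["name"])
        current = by_id.get(current["parent_span_id"])
    return list(reversed(chain))


trace_id = uuid.uuid4().hex
read = make_event(trace_id, None, "file_access", "read_config",
                  {"path": "app.yaml"}, {"bytes": 512}, 3.2, "ok",
                  "fs:read", "workspace-default", "allowed")
edit = make_event(trace_id, read["span_id"], "tool_call", "edit_file",
                  {"path": "app.yaml"}, {"bytes": 540}, 8.9, "ok",
                  "fs:write", "workspace-default", "allowed")
test = make_event(trace_id, edit["span_id"], "tool_call", "run_tests",
                  {"suite": "unit"}, {"failed": 1}, 1520.0, "error",
                  "exec:tests", "workspace-default", "allowed")

# Replay from the serialized log, as a reviewer would months later.
replayed = [json.loads(line) for line in to_jsonl([read, edit, test]).splitlines()]
print(causal_chain(replayed, test["span_id"]))
# ['read_config', 'edit_file', 'run_tests']
```

Because every record carries its parent span, the read, change, failed-test chain can be rebuilt from the log alone, with nobody having watched the terminal.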

Why this stops being optional

Two forces are converging. Compliance is the loud one: high-risk-AI audit-trail expectations under the EU AI Act, SOC 2 auditors now asking pointed questions about agent governance, enterprise buyers who simply won't sign until you can show what your agent did and prove it. If you can't answer with structured data, that deal stalls.

The quieter force is a design choice you make early and regret late: where does the record live? Cloud observability tools generate plenty of visibility, but many of them delegate the governance decision — and the data — to someone else's platform. For anything sensitive, you want the trail inside your own boundary, in a format you control, that you can hand over without also handing over your prompts.

Where Vantio fits

Here's the honest pitch: the free tier is the audit trail people keep hand-rolling. Wrap your agent and every action becomes a structured, metadata-only event — trace ID, target host, action taken, bytes, an HMAC receipt you can verify — with zero prompt or completion content ever leaving your environment. No “send us your data so we can show you a graph.” You get the record, queryable and exportable, and you get it before a regulator or a customer asks for it. Start there; you can add enforcement later. The record is the part you'll wish you'd had from day one.
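
For a sense of what a verifiable, metadata-only event can look like, here is a rough sketch using Python's standard hmac and hashlib modules. It is an illustration of the general technique, not Vantio's actual event format or API: the field names, the canonical-JSON step, and the key handling are all assumptions.

```python
import hashlib
import hmac
import json


def sign_event(event: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 receipt over a canonical JSON encoding of the event."""
    # Sorted keys and fixed separators keep the byte encoding stable across runs.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_event(event: dict, receipt: str, key: bytes) -> bool:
    """Recompute the receipt and compare in constant time."""
    return hmac.compare_digest(sign_event(event, key), receipt)


# Hypothetical key; in practice it would come from a secret store you control.
key = b"example-key-held-inside-your-boundary"

# Metadata only: no prompt or completion content appears in the event.
event = {
    "trace_id": "task-0001",
    "target_host": "api.example.com",
    "action": "http_post",
    "bytes": 2048,
}

receipt = sign_event(event, key)
print(verify_event(event, receipt, key))     # True

tampered = dict(event, bytes=1)
print(verify_event(tampered, receipt, key))  # False
```

Anyone holding the key can confirm that a record was not altered after it was written, and the event itself never contains what the agent was asked or what it answered.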
