Day 1 · Recorded 6 May 2026

Why prompt-level guardrails fail for autonomous agents

Uploaded: 2026-05-06
Duration: 30 min
Channel: Leo Di Donato & Lorenzo Fontana
Description: Watch Leo Di Donato and Lorenzo Fontana explain why prompt guardrails fail for AI agents and how kernel-level, layered controls reduce data risk.

Agents can reason around rules written at their own level. Leo Di Donato and Lorenzo Fontana explain why prompt guardrails and editable command policies are too easy to bypass, and what changes when enforcement moves below the agent into kernel-level controls, runtime signals, credential scoping, and layered defenses.

Leo Di Donato & Lorenzo Fontana, Co-creators of Falco (CNCF)

What's in this session

Securing an autonomous coding agent is different from securing a container. Containers do not reason about goals, rename tools, split work across steps, or turn text from the web into executable instructions inside a developer environment full of secrets.

In this session, Leo Di Donato and Lorenzo Fontana, co-creators of Falco, walk through why laptop sandboxes and prompt rules are not enough, how agents can work around policies they can see or edit, and why enforcement has to move into boundaries the agent cannot tamper with, including kernel-level controls and runtime data.
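One way to picture the "rename tools" problem is a policy that matches commands by name. The sketch below is an illustration and is not taken from the session: the `DENIED_NAMES` denylist, the stand-in tool file, and the function names are all hypothetical. It shows a name-based rule losing track of a tool once it is copied under another name, while a check on file contents still recognizes it. Both checks still run in user space, so the sketch only illustrates why name matching is weak; it is not an example of kernel-level enforcement.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

# Hypothetical name-based policy: block anything called "curl".
DENIED_NAMES = {"curl"}


def name_policy_allows(path: Path) -> bool:
    """Naive check: deny only when the file name is on the denylist."""
    return path.name not in DENIED_NAMES


def digest(path: Path) -> str:
    """SHA-256 of the file contents, independent of the file name."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def content_policy_allows(path: Path, denied_digests: set[str]) -> bool:
    """Deny when the file contents match a known denied digest."""
    return digest(path) not in denied_digests


def demo() -> dict[str, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in file; no real network tool is involved.
        tool = Path(tmp) / "curl"
        tool.write_bytes(b"stand-in for a network tool")

        # The agent copies the tool under a name the policy never listed.
        renamed = Path(tmp) / "fetch"
        shutil.copy(tool, renamed)

        denied = {digest(tool)}
        return {
            "name_blocks_original": not name_policy_allows(tool),
            "name_blocks_renamed": not name_policy_allows(renamed),
            "content_blocks_renamed": not content_policy_allows(renamed, denied),
        }
```

Calling `demo()` reports that the name rule blocks the original file but not the renamed copy, while the content check blocks both.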

For agent platform teams, the takeaway is defense in depth from day one. Assume data exfiltration and credential exposure are real risks, scope what agents can access, observe what actually happens at runtime, and put the strongest controls outside the agent's reach.
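As a small illustration of "scope what agents can access", the sketch below starts a child process with an allowlisted environment so that credentials in the parent shell are never handed over. It is a minimal sketch under stated assumptions: the `ALLOWED_ENV` list and the function names are hypothetical, and this is process-level scoping only. It does not stop an agent from reading secrets on disk, which is the kind of gap the session argues should be closed below the agent.

```python
import subprocess
import sys

# Hypothetical allowlist: only these variables reach the agent process.
ALLOWED_ENV = ("PATH", "HOME", "LANG")


def scoped_env(parent: dict[str, str], allowed=ALLOWED_ENV) -> dict[str, str]:
    """Return a copy of the parent environment limited to allowed names."""
    return {key: value for key, value in parent.items() if key in allowed}


def run_scoped(cmd: list[str], parent: dict[str, str]) -> str:
    """Run cmd with the scoped environment and return its stdout."""
    result = subprocess.run(
        cmd,
        env=scoped_env(parent),  # replaces, rather than extends, the environment
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    parent = {"PATH": "/usr/bin:/bin", "AWS_SECRET_ACCESS_KEY": "example-only"}
    probe = [sys.executable, "-c",
             "import os; print('AWS_SECRET_ACCESS_KEY' in os.environ)"]
    print(run_scoped(probe, parent))
```

The allowlist is chosen over a denylist on purpose: a new secret added to the developer's shell stays out of the agent's environment by default, with no policy update needed.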

Inside the recording

00:00 Why prompt-level guardrails fail
Leo and Lorenzo start from the security problem agents create when they can reason around rules.
04:00 Autonomous agents change the attacker model
Agents can combine intent, tools, and automation in ways a normal container workload does not.
08:00 Why a laptop sandbox is not enough
The valuable secrets are on the developer machine, so isolation has to assume stronger adversaries.
12:00 Kernel boundaries the agent cannot edit
Policy has to move below JSON files and prompts into controls the agent cannot tamper with.
20:00 Adaptive runtime security from kernel data
Runtime signals can guide security posture as autonomous agents loop through work.
24:00 Practical defense in depth
Be paranoid, layer controls, and design for data exfiltration and credential exposure from day one.

More sessions on agent infrastructure

Building Minions: agents on a 30-million-line codebase — Alistair Gray, Stripe
Building a company-internal background agent system — Cole Murray, Open Inspect
From Assisted to Delegated: Cloudflare's AI Engineering Stack — Rajesh Bhatia, Cloudflare