ThirdKey — Zero Trust for AI

News & Press  /  Press release

For Immediate Release · May 5, 2026

ThirdKey AI demonstrates elimination of AI agent exploits across multiple attack classes.

New research shows structural constraints reduce observed exploit success to 0% across filesystem, network, execution, and state-mutation attacks — and expose the limits of industry-standard container sandboxing.

Date: May 5, 2026 Location: Pasadena, California Press release
Key results
001 / The lede

Structural controls.

— ThirdKey AI today announced results from a comparative empirical evaluation of its Symbiont runtime against two alternative substrates, demonstrating that structural security controls eliminate entire classes of AI agent execution-layer exploits where industry-standard OS isolation does not.

The study — Three Substrates, Seven Models, Six Tasks: A Comparative Empirical Evaluation of Structural Enforcement for AI Agent Runtimes (preprint v0.2, May 2026; DOI 10.5281/zenodo.20043247) — ran 3,760 trials across seven hosted LLMs spanning four vendors and three capability tiers (frontier, mid, floor), against three execution substrates: a permissive Python baseline, the same baseline wrapped in a strict Docker container, and Symbiont. Across four pure-action attack classes — filesystem enumeration, network-boundary escape, syscall execution, and state mutation — Symbiont recorded 0 successful exploits out of 560 trials regardless of model or capability tier.

“The industry standard for agent security relies heavily on container sandboxes, but our research proves that OS isolation cannot protect against an agent maliciously manipulating its own working environment. By shifting to typestate-shaped fences, we don’t just detect bad actions — we make them impossible to express.”

— Jascha Wanger, Founder, ThirdKey AI
002 / Three substrates

Where sandboxing fails.

The evaluation compared three approaches to securing AI agents:

The Docker substrate provided effectively zero protection on the four vectors where the malicious behavior occurs inside the agent’s own bind-mounted working area: it failed to stop 88% of filesystem-enumeration attacks, 89% of network-boundary attacks, and 84% of state-mutation attacks. The lone vector where Docker materially helped — syscall execution (38% → 9% pooled) — succeeded only because the lure binary /usr/bin/openssl happens to be missing from the python:3.12-slim image. On any image that includes openssl, the substrate contribution shrinks toward zero. The defense is incidental, not structural.

Symbiont, by contrast, eliminated all four vectors entirely by constraining the agent’s available action surface through typestate-enforced tool contracts, policy-gated execution, and capability-scoped tools. Its 0% rate on syscall-boundary is independent of what binaries are installed on the host or in any container: no python_eval tool exists in declared capabilities, and no execve path runs through the agent surface at all.

88%Docker filesystem-enumeration failure rate
89%Docker network-boundary failure rate
84%Docker state-mutation failure rate
0 / 560Symbiont pure-action exploits
003 / Execution vs. content

Two distinct problems.

The evaluation also examined prompt-injection attacks against agent orchestration, tightened to n=200 per cell across all seven working models for tight Wilson confidence intervals. Symbiont reduced pooled orchestrator-injection success from 59% (826/1,400) to 3.6% (50/1,400) — a 56-percentage-point reduction produced by the agent_summary ToolClad parameter type, which routed 1,711 distinct attempts through the symbi-invis-strip injection-marker detector before any worker output could reach the downstream agent.

The remaining 3.6% is not uniform. Six of seven models cluster at 1–4% bypass; excluding the outlier, Symbiont’s pooled rate is 1.5% (18/1,200). One frontier model, GPT-5, alone reaches 16% [12%, 22%] (Wilson 95% CI) against the content validator — a confidence interval that does not overlap any other model’s. ThirdKey characterizes this as the regex ceiling: frontier models possess paraphrase capabilities that exceed substring marker matching, regardless of how the marker list is expanded. The structural fixes that would close it — LLM-as-judge classification of worker output, or user-role data separation in the downstream agent — are explicitly identified in the paper.

“The data is clear: structural controls entirely eliminate execution-layer risks, but content-level attacks like prompt injection remain a moving target. As models like GPT-5 become more adept at paraphrasing, the industry must move away from regex-based filtering and toward structural isolation of agent outputs.”

— Jascha Wanger, Founder, ThirdKey AI
004 / Implications

A clean boundary.

ThirdKey’s results suggest that AI agent security should be separated into two distinct problems: execution safety and content integrity. Execution safety controls what actions an agent can perform. Content integrity addresses whether inputs, outputs, and intermediate summaries have been maliciously manipulated.

Symbiont focuses on execution safety using structural controls that constrain what an agent can express, rather than relying solely on runtime filtering or post-hoc detection. Content-integrity defenses — cryptographic provenance, structural output isolation, LLM-as-judge validators — remain a complementary, model-dependent layer that the industry has yet to standardize.

005 / Availability

Where to read more.

The full preprint, reference corpus, runtime, and specification are available at:

006 / About

About ThirdKey.

ThirdKey AI is an AI security company building open primitives for trustworthy autonomous systems. Its work focuses on making agent behavior enforceable, auditable, and resistant to exploitation at runtime — through cryptographic identity, schema verification, declarative tool contracts, and a policy-governed runtime.

Media contact

Press & Media

Jascha Wanger

Founder, ThirdKey AI

← Back to all news