What We Saw at Pwn2Own Berlin 2026
The AI platform category was a bloodbath, and the vulnerability profiles tell a clear story. Here is what dropped:
LiteLLM ($40K): According to Security Affairs, k3vg3n "chained three bugs, including a server-side request forgery and a code injection, to bring down LiteLLM, walking away with $40,000." This is classic web application exploitation in the serving layer. LiteLLM acts as a gateway that proxies requests between applications and multiple LLM providers, so SSRF here enables lateral reach to internal services.
OpenAI Codex ($40K each): Two independent exploits, one from Compass Security and one from maitai of Doyensec. Per Security Affairs, "Two separate teams, Compass Security and maitai of Doyensec, each collected $40,000 for independently exploiting OpenAI's Codex coding agent." Codex agents execute code in sandboxes, so these are likely sandbox escapes or over-permissive tool-use boundaries, though that is an inference rather than a reported detail. The ZDI rules list this target at $40,000.
Cursor ($15K and $30K): Two separate exploits. SecurityWeek reports "Cursor exploits earned $15,000 and $30,000", so two different chains earned two different bounties.
Ollama ($28K): The chain included a known vulnerability. As SecurityWeek notes, "an Ollama exploit earned researchers $28,000 (the exploit included a known vulnerability)." This is telling: attackers are packaging known primitives into ML-specific weaponization paths.
Claude Code, LM Studio, NVIDIA Megatron Bridge, Chroma ($20K each): Per SecurityWeek, "$20,000 bounties were received by Pwn2Own participants for OpenAI Codex, Claude Code, LM Studio, NVIDIA Megatron Bridge, and Chroma vulnerabilities." (Note: SecurityWeek's list also names OpenAI Codex at $20,000, while Security Affairs reports $40,000 per team. The two figures may reflect different tiers depending on exploit chain completeness.)
Highest Downstream Risk
Chroma and LiteLLM are the biggest concern. Not because their bugs were higher severity, but because of architectural positioning:
LiteLLM sits at the API gateway layer — it sees all model requests, all API keys, all routing logic. SSRF here means pivoting to internal cloud metadata endpoints, exfiltrating credentials, or reaching internal services the LLM was never supposed to touch.
Chroma is a vector database widely used in RAG pipelines, where it stores embeddings of internal documents. A Chroma RCE is not just code execution. It is retrieval-surface poisoning, which means an attacker can manipulate what the LLM "knows" without ever touching the model weights. In OWASP Top 10 for LLM Applications terms, that combines Vector and Embedding Weaknesses with Sensitive Information Disclosure.
The NVIDIA Megatron Bridge exploit is also notable — anything at that layer touches model serving infrastructure, which means access to model artifacts, fine-tuning data, and the ability to poison outputs at scale.
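To make the SSRF concern for gateway-style services concrete, here is a minimal Python sketch of a destination check that refuses outbound requests to loopback, private, and link-local addresses (the range that includes the common cloud metadata address 169.254.169.254). The function names are hypothetical and this is not LiteLLM's code or configuration. It only illustrates the kind of screening a proxy could apply before making a request on a caller's behalf.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_internal_address(ip: str) -> bool:
    """Return True for addresses a public-facing gateway should not reach."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local  # includes 169.254.169.254 (cloud metadata)
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def is_safe_destination(url: str) -> bool:
    """Hypothetical pre-flight check for an outbound request URL."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname
    try:
        # IP literals are checked directly, with no DNS lookup.
        candidates = [str(ipaddress.ip_address(host))]
    except ValueError:
        try:
            infos = socket.getaddrinfo(host, None)
        except socket.gaierror:
            return False
        candidates = [info[4][0] for info in infos]
    # Reject the URL if any resolved address is internal.
    return bool(candidates) and all(
        not is_internal_address(ip) for ip in candidates
    )
```

A check like this leaves a gap between resolving the name and making the request (DNS rebinding), so it complements network-level egress rules and does not replace them.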
On LiteLLM as "Soft Underbelly"
About a month between the TeamPCP supply chain compromise and a Pwn2Own pop is not a coincidence. It is a pattern.
April 2026: TeamPCP published poisoned LiteLLM versions 1.82.7 and 1.82.8 to PyPI, where they were live for about 40 minutes. The attackers backdoored Mercor and stole roughly 4 TB, including candidate PII, video interviews, and source code. Lapsus$ extortion followed, and Meta paused $10B in Mercor contracts. Periculo reports that "939 GB of source code, 211 GB user database, and 3 TB of storage buckets containing video interviews and identity documents" were exfiltrated.
May 2026: A completely unrelated researcher chains three bugs in LiteLLM for $40K at Pwn2Own. Different attack vector entirely — runtime exploitation, not supply chain.
LiteLLM is getting hit from both the dependency layer and the runtime layer. That makes it a structural risk point in enterprise AI stacks. Organizations are using LiteLLM as the abstraction layer for multi-provider LLM access, which makes it a single point of failure for any shop running production inference.
What Orgs Running LiteLLM Should Do Right Now
Anyone who installed the affected 1.82.7 or 1.82.8 releases should have already rotated credentials after April. If you have not, stop reading and do that now.
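As a quick triage aid, the following Python sketch checks whether the litellm package installed in the current environment is one of the two releases named above. The set of affected versions is taken from this write-up, not from an official advisory, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions named in the text as poisoned; extend if an advisory lists more.
AFFECTED_VERSIONS = {"1.82.7", "1.82.8"}


def is_affected(version: str) -> bool:
    """Return True if the version string is one of the affected releases."""
    return version.strip() in AFFECTED_VERSIONS


def installed_litellm_version():
    """Return the installed litellm version string, or None if absent."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_affected(version):
        print(f"litellm {version} is an affected release: rotate credentials")
    else:
        print(f"litellm {version} is not in the affected set")
```

This only reports what is installed now. An environment that pulled an affected release in April and was later upgraded would pass the check, so build logs and lockfile history still need a look.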
Beyond that:
Segment the LiteLLM deployment: it should not have network reach to internal services, cloud metadata, or sensitive databases. An SSRF bug like the one chained in Berlin is only as useful as the network reach the gateway has.
Disable unnecessary tool-use features: if you are not using the agentic capabilities, turn them off. The attack surface expands dramatically when the proxy can execute code or make outbound requests.
Monitor for prompt injection in the RAG context: even though the bug classes at Pwn2Own were traditional SSRF and code injection, the injection points are often user-controlled prompts or tool outputs flowing through LiteLLM.
Pin to known-good versions with hash verification: the April supply chain attack worked because PyPI does not enforce strong attestation by default. Use pip's hash-checking mode (--require-hashes) or an equivalent lockfile that records hashes.
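For the hash pinning step, pip's hash-checking mode expects each requirement line to carry a `--hash=sha256:<digest>` option and is enforced with `pip install --require-hashes -r requirements.txt`. The sketch below shows the underlying idea in Python: compute the SHA-256 of a downloaded artifact and compare it to a digest recorded when the version was vetted. The function names are hypothetical, and the pinned digest has to come from your own trusted record.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, pinned_hex):
    """Return True only if the file's SHA-256 equals the pinned digest."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, pinned_hex.strip().lower())
```

A poisoned re-upload under the same version number would produce a different digest and fail the comparison, which is the property that version pinning alone does not give you.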