(1) The macOS precedent — yes, vendors should assume AI models are now in their attack surface
Here is what actually happened: Anthropic's Mythos found a privilege escalation chain in macOS that corrupts memory and bypasses Apple's security restrictions — a 55-page technical report delivered to Apple per the 9to5Mac coverage. Apple confirmed they are reviewing it and expect fixes "quickly." The WSJ independently corroborated this.
The crucial precedent: this was not a toy bug or a known pattern. The AISI evaluation notes the newer Mythos checkpoint solved the "Cooling Tower" cyber range, which was previously unsolved by any frontier model — 3 of 10 attempts. "Cooling Tower" is an OT/IT hybrid scenario, and Mythos getting stuck on IT sections earlier but then breaking through shows iterative capability gains between checkpoints, not just scale.
Think of it this way: if OpenBSD — 27 years of paranoid code review — had exploitable memory-safety bugs sitting undetected, then no vendor's production code is exempt. The OpenBSD finding (also attributed to Mythos) is actually the scarier precedent because that codebase is explicitly designed to resist exactly this scrutiny.
Vendors should now operate under the assumption that their codebases are being audited by capable AI systems — both defensively (Project Glasswing partners) and eventually offensively when open-weight models catch up. The six-month window quoted by Prism News for open-weight parity is not optimistic; it is the consensus timeline.
(2) The 4.7-month doubling rate — it collapses traditional disclosure timelines entirely
AISI's data is specific: "frontier AI models' autonomous cyber task completion time has doubled approximately every five months" since late 2024 — but Mythos and GPT-5.5 are exceeding that trendline. The Palo Alto Networks angle is sharp: these models identified 26 CVEs representing 75 issues in a single month, versus typical human-led discovery under five per month.
What does this mean operationally? The mean-time-to-exploit (MTTE) and mean-time-to-patch (MTTP) curves have crossed. Unit 42's research found AI can now do "autonomous discovery and exploitation... shrinking traditional patch windows from days/weeks to hours."
I need to flag this: we do not yet have a reliable technical defense for AI-accelerated exploitation. The gap is structural. Current SLAs for critical vulnerabilities — 90 days is still common, 30 days for actively exploited — assume human-speed research and weaponization. When discovery-to-exploit chains compress to single-digit hours, those SLAs become liability theater.
The NCSC's "urgent patch wave" advisory from May 5th is a tacit admission: the frameworks are not designed for this velocity.
(3) Bug bounty and CVD frameworks — they are not designed for this, and the gaps are visible
Current frameworks assume four things that AI vulnerability discovery breaks:
Human-scale submission velocity — bounty programs can handle tens of reports per month, not thousands. Mythos-class auditing could generate more valid reports in a day than a typical program receives in a quarter.
Triagability — human analysts validate reproduction steps, assess impact, assign severity. AI-generated reports at 55-page depth require expert-level triage capacity most vendors lack.
Attribution and coordination — when a vulnerability is discovered by an AI system rather than a named researcher, who coordinates disclosure? Anthropic's Project Glasswing is attempting a controlled-access model, but there is no industry standard for AI-discovered vulnerability handling.
Coordination with patch timelines — the GTIG report from earlier this month showed AI-generated exploit code being treated as a finding, but the hallucinated CVSS scores demonstrated that AI output requires human forensic validation. The workflow integration problem is real.
Deloitte's analysis puts it bluntly: the primary challenge is "the gap between AI-accelerated vulnerability discovery (hours to days) and organizational remediation capacity."
Responsible disclosure needs a structural overhaul. Short-term: vendors need AI-specific triage pipelines and compressed SLAs for AI-harvested findings. Medium-term: we need standardized reporting formats for AI-generated vulnerability evidence so triage teams can scale. Long-term: the entire concept of "coordinated disclosure" may need to bifurcate — AI-discovered vulnerabilities with weaponization proofs require emergency response, not calendar-bound processes.
I will note one tension: Anthropic's Project Glasswing is a controlled-access coalition. The interface between that trusted tier and the broader ecosystem — the interface boundary challenge we have discussed before with tiered access models — is where the systemic risk concentrates. If the patch-SLA problem gets "solved" only for Glasswing partners, we just created a two-tier security world.
Summary for the panel: The macOS precedent establishes that frontier AI models can find exploitable chains in hardened, production code — not edge cases. The 4.7-month doubling means current patch SLAs are already mismatched to exploitation reality. And no, our disclosure frameworks are not ready; they need structural redesign for AI-scale submission velocity and AI-specific triage workflows.