Microsoft's MDASH AI Found 16 Critical Windows Bugs
Microsoft's MDASH used 100 AI agents to find 16 critical Windows vulnerabilities, including 4 remote code execution flaws

Microsoft just disclosed something that should change how every founder shipping AI-built software thinks about security.
A new system called MDASH, short for multi-model agentic scanning harness, just found 16 previously unknown vulnerabilities in Windows. Four are critical remote code execution flaws. Most are reachable from the network with no credentials needed. They were all patched in the May 2026 Patch Tuesday release.
The CVEs are real, the patches are live, and MDASH topped the public CyberGym benchmark of 1,507 real-world vulnerabilities with a score of 88.45 percent, ahead of Anthropic's Mythos and OpenAI's Daybreak.
The story everyone is writing is "AI is now better than human auditors." That story is wrong, and the actual story is much more useful if you are shipping vibe-coded applications.
What MDASH actually is
MDASH is not one big model doing security review. It is over 100 specialized AI agents working in stages. Some scan for suspicious code paths. Others debate whether those findings are real. A separate set tries to construct triggering inputs that prove the bug exists. Only after a finding survives this cross-examination does a human engineer see it.
Microsoft's framing is direct. "Disagreement between models is itself a signal."
A single AI model reviewing code, no matter how smart, is not enough. You need adversarial review. You need different perspectives challenging each other. You need a system, not a model.
Hold that thought.
What it found
The 16 vulnerabilities span TCP/IP, the IKEv2 VPN service, Netlogon, the DNS client, HTTP.sys, and the Telnet client. One critical flaw, CVE-2026-33824, is a double-free in IKEv2 with CVSS 9.8. Two crafted UDP packets, code execution as LocalSystem. The bug spans six different files. No single-file analysis would catch it.
These are not bugs a junior engineer skimming the code would notice. They require reasoning across files, across ownership patterns, across timing.
And they were sitting in production Windows. Code written by experienced systems engineers, reviewed by senior teams, hardened by decades of process.
Now think about your AI-built application
A vibe-coded app is the opposite of Windows in every variable that matters. The author may not be a systems engineer. The review process is often zero. The dependencies are newer, less audited, glued together by AI. The threat model is rarely written down. The deployment surface is often live AI agents with real permissions on real systems.
The bugs MDASH caught in Windows are subtle. The bugs hiding in most vibe-coded apps are not subtle. They are sitting in plain sight, behind nothing.
The asymmetry just got worse
MDASH is in private preview. It is not available to the team shipping a Stripe integration on Vercel next week. Enterprise customers get access in June. Everyone else, indefinitely, gets nothing.
You do not need 100 AI agents to make a vibe-coded app safer. You need the principle behind MDASH, which is adversarial review by something that does not share assumptions with the thing that wrote the code.