Your AI Assistant Can't Tell Real Models From Fake Ones

For 18 hours earlier this month, the most downloaded AI model on Hugging Face was a fake. It sat at #1 trending with 244,000 downloads and 667 likes before researchers at HiddenLayer flagged it and Hugging Face took it down.

Anyone who followed the install instructions on Windows had their laptop fully compromised.

How the fake worked

OpenAI released a real model in April 2026 called Privacy Filter. It's a small open-weight model that helps redact personal information from text. It lives at openai/privacy-filter on Hugging Face.

The fake lived at Open-OSS/privacy-filter. Notice the difference? The publisher name in front of the slash. The real one is published by OpenAI. The fake was published by an account that just sounded official.

Same model card. Same description, copied almost word for word. The only meaningful difference was the README, which told users to run a setup script called start.bat on Windows or python loader.py on Linux and macOS.

The download numbers were almost certainly inflated, but the inflation was an attack in its own right. It pushed the repo to #1 trending, which is where real users find models.

What the setup script actually did

The loader.py file looked like a normal AI model loader. It even printed fake training output to seem authentic. Then it quietly turned off SSL certificate verification (the check that stops fake websites from impersonating real ones), decoded a hidden URL, fetched a command from a public paste service, and passed it to PowerShell, Windows' built-in scripting tool.
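
The red flags in that sequence can be searched for before anything runs. Here is a minimal sketch, assuming the script is available as plain text. The pattern list and the name find_red_flags are illustrative, not taken from HiddenLayer's analysis, and a clean result does not prove a script is safe.

```python
import re

# Illustrative patterns only; simple text matching is easy to evade.
RED_FLAGS = {
    "TLS verification disabled": re.compile(
        r"verify\s*=\s*False|_create_unverified_context|CERT_NONE"
    ),
    "encoded string decoded at runtime": re.compile(r"b(?:64|32|16)decode|fromhex"),
    "shell or PowerShell invoked": re.compile(
        r"powershell|subprocess\.|os\.system|os\.popen", re.IGNORECASE
    ),
}


def find_red_flags(source: str) -> list[str]:
    """Return the names of the red-flag patterns found in a script's source."""
    return [name for name, pattern in RED_FLAGS.items() if pattern.search(source)]


if __name__ == "__main__":
    # Hypothetical two-line snippet, not the real loader.py.
    sample = 'requests.get(url, verify=False)\nsubprocess.run(["powershell", cmd])'
    print(find_red_flags(sample))
    # ['TLS verification disabled', 'shell or PowerShell invoked']
```

Any hit is a reason to read the script line by line. A model loader has little reason to disable certificate checks or start a shell.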

Using a public paste service as the middleman meant the attacker could swap out the payload whenever they wanted without touching the Hugging Face repo.

The PowerShell command downloaded the real payload, a Rust-based information stealer called Sefirah. It went hunting for saved browser passwords, active login sessions (the stored tokens that keep you logged in without re-entering a password), Discord tokens, crypto wallets, and SSH keys. It set up persistence by disguising itself as a Microsoft Edge update.

The infrastructure overlaps with earlier campaigns that pushed malicious npm packages, suggesting the same operators are running parallel attacks across multiple open-source registries.

Why this matters if you're shipping AI products

The model card is not the model. Hugging Face scans uploaded weights for malicious patterns, but the attack here wasn't in the weights. It was in a separate Python script the README asked you to run.
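
One way to act on this is to look at what a repo ships besides weights. The sketch below assumes you already have the repo's file listing as a list of names (for example from the huggingface_hub library's list_repo_files function). The suffix list and the function name are illustrative, and config.json and model.safetensors in the sample are hypothetical file names.

```python
from pathlib import PurePosixPath

# Extensions that run code rather than store weights or config (illustrative list).
EXECUTABLE_SUFFIXES = {".py", ".bat", ".cmd", ".ps1", ".sh", ".exe", ".js"}


def executable_files(repo_files: list[str]) -> list[str]:
    """Return the files in a repo listing that can run code when invoked."""
    return sorted(
        name
        for name in repo_files
        if PurePosixPath(name).suffix.lower() in EXECUTABLE_SUFFIXES
    )


if __name__ == "__main__":
    listing = ["README.md", "config.json", "model.safetensors", "loader.py", "start.bat"]
    print(executable_files(listing))  # ['loader.py', 'start.bat']
```

Legitimate repos sometimes ship Python files too, so a hit is a prompt to read the file, not proof of malice.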

The install instructions are the attack surface. Trending position, like count, and polished documentation can all be forged or inflated in under a day.

Your developer workstations are part of your supply chain. Active login sessions on a laptop can bypass two-factor authentication, because the attacker is using your already-logged-in browser. One compromised laptop can mean full access to every cloud account that laptop was logged into.

If your team builds with AI coding assistants, the habit of accepting whatever the assistant suggests without reading it is exactly the habit this attack exploits.

What to actually do

Check the publisher name before downloading anything. openai/privacy-filter is real. Open-OSS/privacy-filter was not. Trending position is not identity.
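
The publisher check is easy to automate. This is a minimal sketch, assuming your team keeps its own allowlist of publishers it has verified out of band. TRUSTED_PUBLISHERS and both function names are hypothetical, and the match is deliberately exact.

```python
# Publishers the team has verified out of band (hypothetical allowlist).
TRUSTED_PUBLISHERS = {"openai"}


def publisher_of(repo_id: str) -> str:
    """Return the namespace in front of the slash in a 'publisher/name' repo id."""
    publisher, sep, name = repo_id.partition("/")
    if not sep or not publisher or not name:
        raise ValueError(f"expected 'publisher/name', got {repo_id!r}")
    return publisher


def is_trusted(repo_id: str) -> bool:
    """True only when the publisher exactly matches an allowlisted name."""
    return publisher_of(repo_id) in TRUSTED_PUBLISHERS


if __name__ == "__main__":
    print(is_trusted("openai/privacy-filter"))    # True
    print(is_trusted("Open-OSS/privacy-filter"))  # False
```

An exact allowlist is stricter than a similarity check, but it cannot be fooled by a name that merely sounds official.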

Read the install commands before running them. A real open-weight AI model loads in a few lines of code through the functions of a standard, widely used library. It does not need a Windows batch file.

Test new dependencies in a sandbox first. If the loader tries to phone home, you find out before it reaches your real machine.
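
As one possible shape for that sandbox, the sketch below builds a Docker command that runs a script with networking disabled and the repo mounted read-only. It assumes Docker is installed. The image name, the mount point /work, and the function name are illustrative choices, and a container is a weaker boundary than a dedicated virtual machine.

```python
import shlex
from pathlib import Path


def sandbox_command(repo_dir: str, script: str = "loader.py",
                    image: str = "python:3.12-slim") -> list[str]:
    """Build a docker invocation that runs a script with no network access."""
    mount = f"{Path(repo_dir).resolve()}:/work:ro"  # repo mounted read-only
    return [
        "docker", "run", "--rm",
        "--network", "none",  # any attempt to phone home fails inside the container
        "-v", mount,
        "-w", "/work",
        image, "python", script,
    ]


if __name__ == "__main__":
    # Prints the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(sandbox_command("./privacy-filter")))
```

With networking off, a loader that tries to fetch a remote payload fails with a connection error, which is itself the signal you are looking for.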

The AI supply chain is now a real attack surface. Until tooling catches up, the discipline has to come from the team pulling the dependencies.