SQLServerCentral Editorial

This Week's AI Trust Problem Became Everyone’s Problem


There’s a saying in security circles: the weakest link isn’t the lock on the front door but the spare key under the mat. This past week gave us two vivid, simultaneous demonstrations of that principle, and if you’re building anything in the AI space right now, both deserve your full attention.

The Mythos Leak and Accidental Transparency

Let’s start with Anthropic. On March 26, two security researchers, Roy Paz of Layer Security and Alexandre Pauwels of the University of Cambridge, discovered that Anthropic’s content management system had been misconfigured to make uploaded assets public by default unless explicitly marked private. Nearly 3,000 unpublished internal documents spilled out, including a draft blog post describing a next-generation model internally called “Claude Mythos,” part of a tier Anthropic calls “Capybara.”

The model is described as a significant step beyond the current Opus flagship, with stronger benchmark results across coding, academic reasoning, and, most notably, cybersecurity tasks. The leaked draft describes it as “currently far ahead of any other AI model in cyber capabilities” and warns it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

That’s Anthropic’s own language about its own model. Let that sink in.

To be clear, this was not a malicious breach. This was a CMS configuration error: a checkbox left unchecked and a default left unreviewed. But the lesson here isn’t about the model itself. It’s about the fact that Anthropic had been privately briefing government officials about Mythos’s cybersecurity implications for weeks before the accidental release; they knew there were security issues. The concern was real enough to warrant closed-door conversations at the highest levels, and then a content management oversight made those conversations public for everyone.

The AI industry has a complicated relationship with transparency. We tout openness and responsible disclosure, but we also operate under the implicit assumption that the most capable systems will be handled with proportionate care. A draft blog post about an “unprecedented cybersecurity risk” doesn’t belong in a publicly accessible CMS bucket. This is a process failure, not a technical one, and it’s the kind of failure that scales badly as AI systems become more powerful. I have significant professional fears that the speed of innovation where AI is concerned is always going to leave us more vulnerable to scenarios like this one.

The LiteLLM Attack and When Security Tools Become the Attack Vector

Now, with that said, let’s talk about what is, in my mind, the more instructive story of the week, and the one that has direct consequences for anyone running AI workloads today.

LiteLLM is the connective tissue of the modern AI stack. If you’re building an application that calls OpenAI, Anthropic, Bedrock, Gemini, or virtually any other LLM provider, there’s a reasonable chance LiteLLM is sitting in the middle of it all as a unified proxy layer. It handles routing, fallback, and cost tracking, and, most critically, it sits directly between your application and your API credentials. It is downloaded roughly 3.4 million times per day and is present in an estimated 36% of cloud environments.

On March 24, two malicious versions (1.82.7 and 1.82.8) were published to PyPI. They were available for approximately three hours before PyPI quarantined them.

What makes this attack genuinely sophisticated, and worth understanding in detail, is that TeamPCP, the threat group responsible, didn’t attack LiteLLM directly. Supply chain breaches like this one are an area I’m constantly fascinated by, and what happened here came up during a podcast I recorded with Hamish Watson earlier this week.

The hackers attacked Trivy, a widely used open-source vulnerability scanner, five days earlier, exploiting a misconfigured CI/CD workflow to exfiltrate its PyPI publishing credentials. LiteLLM’s own build pipeline used Trivy without a pinned version, so when that compromised scanner ran inside LiteLLM’s CI/CD process, the attackers inherited LiteLLM’s publishing credentials. One dependency and one unpinned version resulted in one painful chain reaction.

The payload they deployed was a three-stage attack: a credential harvester sweeping SSH keys, cloud credentials, Kubernetes secrets, .env files, and API tokens; a Kubernetes lateral movement toolkit deploying privileged pods across every node; and a persistent systemd backdoor polling a typosquatted domain for additional instructions. This wasn’t a smash-and-grab; it was designed for long-term persistence and expansion.
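On the defensive side, it helps to know exactly which files a harvester like this can reach from inside a build runner. Here is a minimal, purely illustrative Python sketch (the filename list is my own, not taken from the actual payload) that inventories those targets in a working tree:

```python
import os

# Defensive sketch: list credential-bearing files an attacker inside a
# CI runner could sweep. The set of names is illustrative, not exhaustive.
SENSITIVE_NAMES = {".env", "id_rsa", "id_ed25519", "credentials", "kubeconfig"}

def sensitive_files(root: str) -> list[str]:
    """Return paths under `root` whose names suggest stored credentials."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name in SENSITIVE_NAMES or name.endswith(".pem"):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Running this against your own build workspace is a cheap way to see what a compromised dependency would have seen.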

The nuance the headlines missed is this: the attackers specifically targeted security tools. Trivy is a vulnerability scanner, and Checkmarx KICS is an infrastructure-as-code security analyzer. These are the tools organizations trust to protect them. By compromising the guardians first, TeamPCP got the keys to the castle without ever having to knock on the front door. This is why I use the analogy of locking all the doors (the garage, the car doors, the trunk): many breaches happen over time, from many angles and layers. You can dance like no one is watching, but you had better secure like everyone is.

What the Two Stories Have in Common

These events may look different on the surface: one is an accidental internal disclosure, the other a deliberate criminal supply chain campaign. But they share an underlying dynamic that matters enormously to anyone advising organizations on AI adoption: we’ve built the AI infrastructure stack on a foundation of implicit trust, and that trust is being systematically exploited.

LiteLLM is trusted because it’s popular and because it simplifies a real problem. Trivy is trusted because it’s supposed to be a security tool. Anthropic’s CMS is trusted by employees uploading internal materials. In each case, the trust was not unwarranted, but it was also unverified, not versioned, and inconsistently monitored.

In my advisory work, I often talk about AI being “duct-taped” as a layer on top of processes that haven’t been hardened for it. The LiteLLM attack is the technical illustration of that risk. When you route your OpenAI API keys, your Anthropic credentials, your AWS tokens, your Kubernetes secrets, and your database credentials through a single intermediary layer, that layer becomes the highest-value target in your stack. You’ve done the attacker’s aggregation work for them.
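To make the aggregation point concrete, here is a small illustrative sketch. The environment-variable names below are common provider conventions, not anything specific to LiteLLM; the point is simply how many distinct credentials one proxy process can see at once:

```python
# Illustration only: count the provider credentials visible to a single
# process. The variable names are conventional examples, not a definitive list.
PROVIDER_KEYS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "AWS_SECRET_ACCESS_KEY",
    "GEMINI_API_KEY",
]

def aggregated_credentials(environ: dict[str, str]) -> list[str]:
    """Names of provider credentials this one process can read."""
    return [k for k in PROVIDER_KEYS if environ.get(k)]

env = {"OPENAI_API_KEY": "sk-...", "ANTHROPIC_API_KEY": "sk-ant-..."}
print(aggregated_credentials(env))  # ['OPENAI_API_KEY', 'ANTHROPIC_API_KEY']
```

Every name that function returns is something a compromised proxy, or a compromised dependency of the proxy, can exfiltrate in one pass.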

The Mythos leak, meanwhile, illustrates something different: the governance gap between what organizations know internally about AI risk and what their operational processes reflect. I want to give credit where credit is due: Anthropic knew enough to brief government officials, but it either didn’t know enough, or hadn’t enforced its processes rigorously enough, to keep unpublished documents out of a publicly accessible CMS bucket. The sophistication of the model and the simplicity of the failure are almost comically misaligned for an organization like Anthropic, one that I respect more than any other AI vendor.

The Lessons That Actually Matter

A few things I’d take from this week:

Pin your dependencies. I know it sounds basic, and you’re right: it is basic. The LiteLLM attack was enabled by an unpinned Trivy version in a CI/CD pipeline. Version pinning is not optional in production AI workloads. Full stop.
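As a rough illustration (not any project’s actual tooling, and the package names below are made up for the example), a few lines of Python can flag requirement entries that aren’t pinned to an exact version:

```python
import re

# Sketch: flag any requirement that is not pinned with "==" to one version.
# Ranges like ">=" still let a newly published malicious release slip in.
PINNED = re.compile(r"^[A-Za-z0-9._-]+==[A-Za-z0-9.!+_-]+$")

def unpinned(requirements: list[str]) -> list[str]:
    """Return the requirement lines that are not pinned to an exact version."""
    findings = []
    for line in requirements:
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            findings.append(line)
    return findings

reqs = [
    "litellm==1.82.6",   # pinned: a compromised 1.82.7/1.82.8 can't slip in
    "trivy-bindings",    # unpinned (hypothetical name): newest release wins
    "requests>=2.31",    # a range is still not a pin
]
print(unpinned(reqs))  # ['trivy-bindings', 'requests>=2.31']
```

For real pipelines, hash-pinning (recording the artifact digest alongside the version) goes one step further than this sketch, since it also catches a tampered upload of an existing version number.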

Treat your AI gateway like a secrets vault. If LiteLLM or any similar proxy layer is sitting in your stack with access to multiple API providers, it needs to be treated with the same rigor you’d apply to your secrets manager. Audit it, monitor it, isolate it, and watch for unexpected outbound connections.
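One piece of that monitoring can be as simple as an egress allowlist for the proxy layer. This is an illustrative sketch with example host names, not a definitive list; your real allowlist comes from the providers you actually call:

```python
from urllib.parse import urlparse

# Sketch of an egress allowlist. The hosts are examples for illustration.
ALLOWED_HOSTS = {"api.openai.com", "api.anthropic.com"}

def egress_allowed(url: str) -> bool:
    """True only if the outbound request targets a known provider host."""
    host = urlparse(url).hostname or ""
    return host in ALLOWED_HOSTS

print(egress_allowed("https://api.openai.com/v1/chat/completions"))  # True
print(egress_allowed("https://litelm-telemetry.example/beacon"))     # False
```

A backdoor polling a typosquatted domain, like the one in the LiteLLM payload, fails exactly this kind of check, which is why unexpected outbound connections from the gateway deserve an alert, not a log line.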

Your security tools are part of your attack surface. This is the lesson that should keep security teams up at night. The tools you use to scan for vulnerabilities have privileged access to your build pipeline, your credentials, and your infrastructure. If they’re compromised, the attacker has everything they need without touching your application code, databases, or the other assets you’re commonly concerned with.
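A small mitigation worth sketching: verify any third-party tool against a checksum you recorded out-of-band (for example, from the vendor’s release page) before executing it in your pipeline. This is a hedged illustration using stand-in bytes, not a real release artifact:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a downloaded artifact."""
    return hashlib.sha256(data).hexdigest()

def verify_tool(data: bytes, expected_sha256: str) -> None:
    """Raise before the tool ever runs if the artifact was tampered with."""
    actual = sha256_hex(data)
    if actual != expected_sha256:
        raise RuntimeError(f"checksum mismatch: {actual}")

artifact = b"pretend this is the scanner binary"
pinned_digest = sha256_hex(artifact)  # in real CI this is a hard-coded constant
verify_tool(artifact, pinned_digest)             # passes silently
# verify_tool(b"tampered build", pinned_digest)  # would raise RuntimeError
```

A checksum recorded at pin time would not have stopped the Trivy credential theft itself, but it would have stopped a swapped binary from running with your pipeline’s privileges.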

Governance can’t lag capability. Anthropic building a model it describes as posing “unprecedented cybersecurity risks” while simultaneously having a CMS misconfiguration that exposes internal documents is not a company-specific failure but a symptom of an industry innovating faster than its own processes. This applies to every organization adopting AI right now. Your AI capabilities and your AI governance need to be on the same roadmap and in step with each other.

The week of RSA 2026 gave us a lot to think about. The models are getting more powerful and the hackers are getting more sophisticated. Unfortunately, the infrastructure connecting them is as fragile as any other software ecosystem we’ve ever built, maybe more so, because of how much hyperscalers have concentrated into it.

The question isn’t whether to adopt AI, but whether you’re being honest with yourself about what that adoption actually requires.
