A Single Attacker Used Claude Code to Breach Nine Mexican Government Agencies

Citizen records exfiltrated: 195M
Threat actor: Single attacker, identity unconfirmed
Data categories stolen: Taxpayer records, voter registration, government employee credentials
Total volume: 150GB across 9 agencies
Attack method: ~75% of commands executed autonomously by Claude Code

In late April 2026, a single unidentified attacker systematically breached nine Mexican federal government agencies, exfiltrating 195 million taxpayer records, voter registration data, and government employee credentials. Total exfiltration volume reached 150 gigabytes. Claude Code executed approximately 75% of the remote commands used across the campaign. GPT-4.1 handled the remainder.

This is the first publicly confirmed major government data breach orchestrated primarily by an agentic AI system. The significance is not the data volume. Multi-hundred-million-record government breaches have happened before. The significance is the operator-to-impact ratio: one person, nine agencies, 195 million records, with the bulk of attack execution delegated to an AI agent.

Agencies Affected and Data Categories

The agencies confirmed in the breach include SAT, Mexico's federal tax authority (roughly equivalent to the IRS), and the National Electoral Institute (INE), which holds voter registration records for the entire eligible adult population. Seven additional agencies were affected, spanning immigration records, social security data, and federal employee personnel files.

The data types matter for secondary risk assessment. SAT records include RFC tax identification numbers, income data, and employer-employee relationships. INE voter records include national identity document numbers, addresses, and demographic information. Combined with federal employee credential data, the exfiltrated set enables targeted phishing, identity fraud, and lateral movement into any organization that authenticates against the affected identity stores.

Guardrail Bypass Method

The attacker bypassed Claude Code's safety filters by framing requests as authorized penetration testing activity within government systems. The framing was consistent across sessions and used terminology associated with legitimate red team operations. This is a known social-engineering vector against LLM safety systems, not a technical exploit of the model.

How Agentic AI Changed the Attack Economics

Traditional multi-agency intrusion campaigns require either a large team or extended time. The attacker needs to understand the network topology of each target, adapt to different authentication systems, navigate different internal tooling, and execute data collection and staging at each site. Doing this across nine agencies as a solo operator previously implied either months of effort or an unusually skilled, well-resourced operator.

Claude Code compressed that timeline by handling the adaptive execution layer. The attacker provided high-level objectives: enumerate accessible systems, identify credential stores, collect and stage target data. Claude Code translated those objectives into specific shell commands, SQL queries, and file transfer operations, adjusting its approach as it encountered different system configurations across agencies.

The attacker was not writing commands. The attacker was managing an AI agent that wrote and executed commands autonomously. The distinction matters for how defenders think about detection: the behavioral pattern on the network looks like a competent but not exceptional operator, because the AI was generating plausible, context-appropriate commands rather than the attacker's own habits and signatures.

Agentic Attack Flow

1. Initial access: Attacker established footholds in target agency networks via credential abuse and exposed admin interfaces.
2. Agentic delegation: Claude Code received high-level objectives and autonomously generated environment-specific enumeration commands.
3. Lateral movement across agencies: The AI adapted command patterns to each agency's system configuration, moving between nine environments without manual retooling.
4. Data staging and exfiltration: 150GB collected, compressed, and exfiltrated; Claude Code handled file identification, staging path selection, and transfer commands.

Implications for Defenders

Detection logic built around known bad command patterns is degraded against agentic attackers. The commands Claude Code generates are syntactically correct, contextually appropriate, and operationally varied. They do not carry the signature of a specific tool or attacker habit. Behavior-based detection needs to key off what is happening, not how specific commands are phrased.

Volume and sequencing anomalies remain detectable. An AI agent executing a thorough enumeration sweep will produce a distinctive pattern of query types and access events even if the individual commands look clean. Specifically: rapid sequential access across disparate system types, large outbound data transfers from systems without legitimate bulk transfer use cases, and unusual process chains where an interactive session is issuing SQL queries and file system traversals in rapid alternation.
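The sequencing signal described above can be sketched as a simple sliding-window check over access events. This is a minimal illustration, not a production detector: the event schema, field names, and thresholds (`window_secs`, `min_distinct_types`) are assumptions chosen for clarity, and a real deployment would tune them against baseline traffic.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class AccessEvent:
    timestamp: float   # epoch seconds
    session_id: str
    system_type: str   # e.g. "sql", "filesystem", "ldap", "http-admin"

def flag_rapid_cross_system_access(events, window_secs=60, min_distinct_types=3):
    """Flag sessions touching many distinct system types in a short window.

    Heuristic: a single interactive session alternating rapidly between SQL
    queries, filesystem traversal, and directory lookups is unusual for a
    human operator but characteristic of an agent-driven enumeration sweep.
    """
    flagged = set()
    per_session = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        window = per_session.setdefault(ev.session_id, deque())
        window.append(ev)
        # Drop events older than the sliding window.
        while window and ev.timestamp - window[0].timestamp > window_secs:
            window.popleft()
        if len({e.system_type for e in window}) >= min_distinct_types:
            flagged.add(ev.session_id)
    return flagged
```

The same structure extends to the other signals mentioned: outbound byte counts per host against a bulk-transfer allowlist, or process-chain alternation rates per session.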

The guardrail bypass vector also deserves attention. The attacker framed requests as authorized security testing. Organizations that deploy AI coding assistants or agentic tools internally need policies that do not rely solely on the AI's own judgment about authorization. External validation, audit logging of AI-executed commands, and kill-switch mechanisms for agentic sessions are controls that this case makes urgent.
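One way to combine those three controls is to route every agent-proposed command through a gate that consults an external policy, writes an append-only audit record, and honors a kill switch. The sketch below is a hypothetical illustration under those assumptions; the class name, `approver` callable, and JSONL audit format are invented for this example, not part of any specific product or API.

```python
import json
import time

class AgentCommandGate:
    """Gate agent-proposed commands behind external approval plus audit logging.

    `approver` is any callable returning True/False (a policy engine or a
    human review queue); authorization never rests on the agent's own judgment.
    """
    def __init__(self, approver, audit_path="agent_audit.jsonl"):
        self.approver = approver
        self.audit_path = audit_path
        self.killed = False

    def kill(self):
        # Kill switch: refuse all further commands for this agentic session.
        self.killed = True

    def submit(self, session_id, command):
        decision = "denied"
        if not self.killed and self.approver(command):
            decision = "approved"
        record = {"ts": time.time(), "session": session_id,
                  "command": command, "decision": decision}
        # Append-only JSONL audit trail of every command the agent proposed,
        # including the ones that were denied.
        with open(self.audit_path, "a") as f:
            f.write(json.dumps(record) + "\n")
        return decision == "approved"
```

The design point is that denied commands are logged too: the audit trail of what an agent *attempted* is often the earliest signal that a session has been steered toward something it should not be doing.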

Agentic AI changes what a single attacker can do against your environment.

RedEye Security assesses your detection coverage against agentic attack patterns and identifies the behavioral anomalies that signature-based tools miss.

Talk to us