- What: SentinelOne documented Gaslight, a Rust macOS implant that embeds a prompt-injection payload to make LLM-assisted malware triage agents doubt their session and abort analysis.
- Impact: Beyond the AI-evasion trick, the implant runs a Telegram C2 shell and a Python stealer that harvests the macOS Keychain, Terminal history, process snapshots, and Chrome, Brave, Firefox, and Safari data into an exfiltrated ZIP.
- Fix / mitigation: No patch exists; defenders should treat embedded text as adversarial input, never let triage agents act on attacker-supplied content, and alert on the LaunchAgent label com.apple.system.services.activity and Telegram API C2 traffic.
- Who's at risk: macOS users and any SOC or reverse-engineering team that has put LLM triage tools in the analysis loop, with developers and crypto-adjacent targets the likely DPRK focus.
SentinelOne has documented Gaslight, a previously unseen Rust-based macOS implant and information stealer that does something new: it attacks the AI analyzing it. Embedded in the artifact is a cascade of 38 fabricated 'system' messages built to make an LLM-assisted triage agent doubt its own session and abort, truncate, or refuse the analysis. Researcher Phil Stokes summed up the design: 'It attacks the agent's perception, rather than the sandbox it runs in.' The tooling is assessed with high confidence as the work of North Korea-aligned actors.
The takeaway for security teams is not the stealer payload, which is conventional. It is that an attacker now treats your LLM triage pipeline as part of the attack surface. As AI agents move into the reverse-engineering and SOC-triage loop, the text they read becomes an injection vector. Gaslight is among the first malware samples engineered to exploit that directly.
How the prompt injection works
Inside the binary is a Markdown-fenced block containing 38 bogus system messages. They impersonate the kind of operational noise an analyst's tooling produces: token-expiry notices, out-of-memory kills, disk exhaustion, and repeated operation failures. The scaffold also plants fake warnings about injection vulnerabilities and static-analysis flags, designed to make an automated agent conclude the artifact is unsafe or unprofitable to continue examining. The goal is a clean, automated abort, so the sample slips through triage untouched.
This works because most LLM-assisted triage tools feed extracted strings and file content straight into a model as if it were trusted data. It is the same class of failure as web prompt injection: the system cannot tell instructions apart from input. Gaslight weaponizes that gap inside the binary itself.
If your triage agent can act on strings pulled from a sample, an attacker can script its behavior. Sandbox the model's outputs, strip or neutralize control-style text before it reaches the prompt, and never let an agent make an abort/allow decision on attacker-controlled content without a human in the loop.
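As a rough illustration of the "strip or neutralize" step, the sketch below quotes every extracted string as data and tags lines that look like control messages before anything reaches a prompt. The function name neutralize_strings and the regex list are hypothetical and chosen for illustration; they are not signatures taken from the Gaslight sample, and a real pipeline would need tuning against its own tooling.

```python
import re

# Hypothetical patterns for the "operational noise" an injected block may imitate.
# Illustrative assumptions only, not indicators extracted from Gaslight.
CONTROL_PATTERNS = [
    re.compile(r"^\s*\[?(system|assistant|developer)\]?\s*[:>]", re.IGNORECASE),
    re.compile(r"\b(token|session)\b.*\b(expired|expiry|invalid)\b", re.IGNORECASE),
    re.compile(r"\b(out of memory|oom[- ]kill|disk (full|exhausted))\b", re.IGNORECASE),
    re.compile(r"\b(abort|stop|terminate|halt)\b.*\b(analysis|session|task)\b", re.IGNORECASE),
]


def neutralize_strings(strings):
    """Return (quoted_block, flagged_count) for strings extracted from a sample."""
    quoted = []
    flagged = 0
    for s in strings:
        # Treat each physical line separately so multi-line strings stay quoted.
        for line in s.splitlines() or [""]:
            # Replace Markdown fence markers so sample text cannot close our quoting.
            cleaned = line.replace("```", "'''")
            if any(p.search(cleaned) for p in CONTROL_PATTERNS):
                flagged += 1
                cleaned = "[SUSPECT-CONTROL-TEXT] " + cleaned
            # Prefix every line so the model is shown quoted data, not a directive.
            quoted.append("| " + cleaned)
    return "\n".join(quoted), flagged
```

A non-zero flagged count is a useful signal in its own right: a binary that carries dozens of system-style messages is worth routing to a human regardless of what the agent concludes. Quoting alone does not make a model immune to injection, so this belongs alongside the human gate, not in place of it.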
The Telegram-driven C2 and shell
Underneath the AI trick, Gaslight is a working remote-access implant. Its command-and-control runs over the Telegram Bot API in a polling loop, letting the operator drive an interactive shell and return results. The polling design has a visible side effect: Telegram allows only one active poller per bot token, so if two instances poll the same token at once, the API returns a 'Conflict' error and the second copy terminates.
The shell exposes six main commands, with signs of a seventh:
- help - show command help
- id - identify the implant to the operator
- shell - execute a command via execvp
- kill - terminate a process by PID
- upload - exfiltrate a file using Telegram's attach:// mechanism
- stop - halt the implant
- focus - a suspected seventh command whose function is still undetermined
Operator configuration is not hard-coded. The bot token, chat ID (tg_room_id), and related settings are supplied at runtime, and the implant self-redacts its own token in runtime output, denying it to anyone who captures logs or crash artifacts. That defeats a common analyst shortcut: pulling the C2 token straight from the sample.
What it steals
Persistence is established through a LaunchAgent whose .plist carries the deliberately innocuous label com.apple.system.services.activity. A 6.6 KB Base64-encoded Python script does the harvesting: Terminal command histories, installed application listings, running-process snapshots, the full system hardware and software profile, the macOS Keychain database, and data from Chrome, Brave, Firefox, and Safari. Everything is compressed into temp/collected_data.zip and pushed out over Telegram.
The Python stealer is delivered by a separate 2 KB Base64 bash installer that drops a cpython-3.10.18 interpreter from the astral-sh/python-build-standalone project, so the malware brings its own runtime rather than relying on the host. The heavy emoji use and verbose comment headers in the scripts strongly suggest they were generated with an LLM, a pattern that has shown up across recent DPRK tooling.
Flag the LaunchAgent label com.apple.system.services.activity, outbound traffic to api.telegram.org from non-browser processes, the staging path temp/collected_data.zip, and unexpected standalone cpython-3.10.18 interpreters dropped on developer endpoints.
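The LaunchAgent check can be scripted with Python's standard plistlib module. The sketch below is a minimal example, not a complete detection: the function name find_suspect_launch_agents is hypothetical, the directories to scan are supplied by the caller, and it matches only the exact label reported for Gaslight.

```python
import plistlib
from pathlib import Path

SUSPECT_LABEL = "com.apple.system.services.activity"  # label reported for Gaslight


def find_suspect_launch_agents(directories, label=SUSPECT_LABEL):
    """Return paths of .plist files whose Label key equals the suspect label."""
    hits = []
    for directory in directories:
        root = Path(directory).expanduser()
        if not root.is_dir():
            continue  # missing directory; nothing to scan
        for plist_path in sorted(root.glob("*.plist")):
            try:
                with plist_path.open("rb") as fh:
                    data = plistlib.load(fh)
            except Exception:
                # Unreadable or malformed plist: skip it rather than stop the sweep.
                continue
            if isinstance(data, dict) and data.get("Label") == label:
                hits.append(plist_path)
    return hits
```

On an endpoint this might be called as find_suspect_launch_agents(["~/Library/LaunchAgents", "/Library/LaunchAgents"]). An exact-match label is easy for an operator to change, so treat a hit as strong evidence and a miss as inconclusive.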
Why this matters for defenders
Gaslight is a proof point that the AI tooling defenders are adopting is now a target, not just a productivity aid. SentinelOne framed it as an 'attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop.' If your SOC or RE workflow lets a model decide what gets escalated, assume an attacker will eventually write to that decision.
Practical steps: keep a human gate on any AI-driven abort-or-escalate decision, isolate model outputs from execution, and verify findings against deterministic tooling rather than trusting the agent's narrative. On the host side, the stealer is conventional and detectable. Watch for the LaunchAgent label, Telegram C2 traffic, and the bundled Python interpreter. The novel part is the psychology aimed at your tools, and that is the part to design out of your pipeline now, before the next sample arrives.
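One way to express the human gate is a small routing function that sits between the agent and any action. This is a sketch under stated assumptions: TriageResult, route, and the verdict names are hypothetical, and the deterministic_hits field stands in for whatever non-LLM tooling a team already runs.

```python
from dataclasses import dataclass

# Verdicts that end or shorten analysis; the names are illustrative assumptions.
TERMINAL_VERDICTS = {"abort", "benign", "skip"}


@dataclass
class TriageResult:
    agent_verdict: str       # what the LLM agent recommended
    deterministic_hits: int  # matches from non-LLM tooling, such as signature rules
    injection_flags: int     # control-style strings seen in the sample's text


def route(result: TriageResult) -> str:
    """Pick the next step without letting the agent close a case on its own."""
    verdict = result.agent_verdict.strip().lower()
    if result.injection_flags > 0:
        return "human_review"  # the sample's text tried to address the model
    if result.deterministic_hits > 0:
        return "escalate"      # deterministic evidence outranks the agent's narrative
    if verdict in TERMINAL_VERDICTS:
        return "human_review"  # the agent may propose closing a case, not enact it
    return "continue_analysis"
```

The design choice is that no branch lets the agent's own verdict end analysis: route(TriageResult("abort", 0, 0)) returns "human_review", so a successful injection costs an analyst a few minutes instead of letting the sample through.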