Critical Ollama Vulnerabilities Expose 300,000+ Servers to Memory Leaks and Persistent Code Execution

- 300,000+ vulnerable Ollama servers
- 9.1 CVSS score (CVE-2026-7482)
- 171,000+ GitHub stars (attack surface)
- 90+ days with Windows flaws unpatched

Ollama, the widely deployed framework for running large language models locally, contains three critical vulnerabilities that expose organizations to memory exfiltration and persistent code execution. The most severe flaw affects an estimated 300,000+ servers globally, while two additional vulnerabilities in the Windows client remain unpatched more than 90 days after disclosure.

Bleeding Llama: CVE-2026-7482 Enables Complete Memory Exfiltration

CVE-2026-7482, dubbed "Bleeding Llama" by Cyera researchers, is a heap out-of-bounds read vulnerability carrying a CVSS score of 9.1. The flaw exists in Ollama versions prior to 0.17.1 and stems from improper validation of GGUF model files submitted to the /api/create endpoint. When processing attacker-supplied GGUF files where the declared tensor offset and size exceed the file's actual length, Ollama reads past allocated heap buffers during quantization operations in fs/ggml/gguf.go and server/quantization.go.

The vulnerability originates from Ollama's use of Go's unsafe package in the WriteTo() function, which bypasses the language's memory safety guarantees. This design choice creates a pathway for attackers to craft GGUF files with inflated tensor shapes that trigger out-of-bounds heap reads during model creation.
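The fix pattern for this class of flaw is an explicit bounds check on attacker-controlled tensor metadata before any data is read. The sketch below is an illustration of that check, not Ollama's actual patch; the `TensorInfo` struct and `validateTensor` function are invented names for a simplified view of a GGUF tensor record.

```go
package main

import "fmt"

// TensorInfo is a hypothetical, simplified view of one GGUF tensor record:
// the file declares where the tensor's data lives and how large it is.
type TensorInfo struct {
	Name   string
	Offset uint64 // byte offset of the tensor data within the file
	Size   uint64 // declared size of the tensor data in bytes
}

// validateTensor rejects records whose declared extent falls outside the
// actual file: the condition that triggers an out-of-bounds heap read when
// attacker-supplied values are trusted.
func validateTensor(t TensorInfo, fileSize uint64) error {
	end := t.Offset + t.Size
	if end < t.Offset { // uint64 wraparound: Offset+Size overflowed
		return fmt.Errorf("tensor %q: offset+size overflows uint64", t.Name)
	}
	if end > fileSize {
		return fmt.Errorf("tensor %q: declared end %d exceeds file size %d",
			t.Name, end, fileSize)
	}
	return nil
}

func main() {
	fileSize := uint64(4096)
	// A plausible record, then one with an inflated shape like the exploit uses.
	fmt.Println(validateTensor(TensorInfo{"blk.0.attn_q.weight", 128, 1024}, fileSize)) // <nil>
	fmt.Println(validateTensor(TensorInfo{"evil", 128, 1 << 40}, fileSize))             // rejected with an error
}
```

Note the separate wraparound check: validating only `end > fileSize` is insufficient, because an attacker can pick an offset and size whose sum overflows back into range.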

Three-Step Attack Chain Requires No Authentication

The exploitation process is straightforward and requires no authentication, as Ollama's REST API lacks built-in authentication mechanisms. Attackers execute a three-step chain: First, upload a maliciously crafted GGUF file with an inflated tensor shape to a network-accessible Ollama server via HTTP POST. Second, trigger model creation using the /api/create endpoint, which activates the out-of-bounds read vulnerability. Third, exfiltrate leaked memory data by pushing the resulting model artifact to an attacker-controlled registry through the /api/push endpoint.

Critical Data at Risk

Successful exploitation exposes environment variables, API keys, system prompts, and conversation data from concurrent users. Organizations using Ollama with development tools like Claude Code face amplified risk, as all tool outputs flow through the Ollama server and persist in heap memory.

"An attacker can learn basically anything about the organization from your AI inference—API keys, proprietary code, customer contracts, and much more," warned Cyera security researcher Dor Attias. The exposure scope extends to any data processed by the Ollama instance, making this particularly dangerous for organizations running local LLM inference on sensitive workloads.

Windows Update Mechanism Contains Two Unpatched Flaws

Striga researchers disclosed two additional vulnerabilities in Ollama's Windows update mechanism that remain unpatched as of May 2026, more than 90 days after their January 27 disclosure. CVE-2026-42248 (CVSS 7.7) is a missing signature verification flaw: unlike the macOS version, the Windows updater installs update binaries without validating them. CVE-2026-42249 (CVSS 7.7) is a path traversal vulnerability: the Windows updater builds local paths for installer staging directories directly from HTTP response headers without sanitization.
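The path traversal half of this pair comes down to trusting a filename from an HTTP response header. A common defense is to strip the name down to its final path element and verify the result stays inside the staging directory. The sketch below is a generic illustration of that pattern, not Ollama's updater code; `safeStagingPath` and the staging directory are assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeStagingPath derives a staging path for a downloaded installer from an
// attacker-influenced name (e.g. a filename taken from a response header)
// without trusting it.
func safeStagingPath(stagingDir, untrustedName string) (string, error) {
	// Normalize Windows separators, then keep only the final path element,
	// discarding directory components such as "..\..\Startup\evil.exe".
	name := filepath.Base(filepath.Clean(strings.ReplaceAll(untrustedName, "\\", "/")))
	if name == "." || name == ".." || name == "/" || name == "" {
		return "", fmt.Errorf("invalid installer name %q", untrustedName)
	}
	full := filepath.Join(stagingDir, name)
	// Belt and braces: confirm the joined path is still inside stagingDir.
	rel, err := filepath.Rel(stagingDir, full)
	if err != nil || strings.HasPrefix(rel, "..") {
		return "", fmt.Errorf("path escapes staging dir: %q", untrustedName)
	}
	return full, nil
}

func main() {
	dir := "/srv/ollama-staging"
	fmt.Println(safeStagingPath(dir, "OllamaSetup.exe"))
	// Traversal components are stripped, so the file lands in the staging dir.
	fmt.Println(safeStagingPath(dir, "..\\..\\Startup\\evil.exe"))
}
```

The second call shows the point of the defense: the traversal sequence is neutralized rather than honored, so a hostile header can no longer steer the installer into the Startup folder.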

The Windows desktop client auto-starts on login from the Startup folder, listens on 127.0.0.1:11434, and periodically polls the /api/update endpoint for updates. When chained with the on-login routine, these vulnerabilities enable attackers who can influence update responses to execute arbitrary code at every login. Ollama for Windows versions 0.12.10 through 0.17.5 are vulnerable.

Exploitation Scenarios and Persistence Mechanisms

Attackers can exploit the Windows flaws by controlling an update server reachable by the victim's Ollama client. One method involves overriding the OLLAMA_UPDATE_URL environment variable to point the client at a local server on plain HTTP. This assumes AutoUpdateEnabled is active, which is the default configuration. The missing integrity check alone enables code execution without requiring path traversal exploitation—the malicious installer drops into the expected staging directory and executes during the next launch from the Startup folder without signature re-verification.

However, this approach lacks persistence, as legitimate updates overwrite the staged file. By combining it with the path traversal flaw, attackers can redirect the staged executable outside the usual path, achieving persistent code execution that survives subsequent legitimate updates. The attack surface is significant given Ollama's popularity: the project has over 171,000 GitHub stars and has been forked more than 16,100 times.

Immediate Mitigation Steps

Update to Ollama 0.17.1 or later immediately to address CVE-2026-7482. For Windows users running versions 0.12.10-0.17.5, disable automatic updates and remove Ollama shortcuts from the Startup folder (%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup) until patches are available.
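When auditing a fleet, each instance's version can be read from Ollama's `/api/version` endpoint and compared against the patched release. The helper below is a rough sketch: the version comparison is deliberately simplified, and the default local address is an assumption about your deployment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// parseVersion splits "0.16.3" into integer fields; pre-release suffixes
// such as "-rc1" are ignored for this rough comparison.
func parseVersion(v string) []int {
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	out := make([]int, 3)
	for i := 0; i < len(out) && i < len(parts); i++ {
		n, _ := strconv.Atoi(parts[i])
		out[i] = n
	}
	return out
}

// isPatched reports whether version v is at least 0.17.1, the first
// release that addresses CVE-2026-7482.
func isPatched(v string) bool {
	got, min := parseVersion(v), []int{0, 17, 1}
	for i := range min {
		if got[i] != min[i] {
			return got[i] > min[i]
		}
	}
	return true
}

func main() {
	// Ollama's default local bind; adjust for your deployment.
	resp, err := http.Get("http://127.0.0.1:11434/api/version")
	if err != nil {
		fmt.Println("no Ollama server reachable:", err)
		return
	}
	defer resp.Body.Close()
	var body struct {
		Version string `json:"version"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		fmt.Println("unexpected response:", err)
		return
	}
	if isPatched(body.Version) {
		fmt.Printf("Ollama %s: patched for CVE-2026-7482\n", body.Version)
	} else {
		fmt.Printf("Ollama %s: VULNERABLE, upgrade to 0.17.1 or later\n", body.Version)
	}
}
```

Run against each host in your inventory; any instance that answers on a non-loopback address is also a candidate for the network controls discussed below.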

Recommended Security Controls

Organizations running Ollama must implement multiple defense layers. Apply the latest patches for CVE-2026-7482 immediately and audit all running instances for internet exposure. Ollama instances should never be directly accessible from the internet—isolate them behind firewalls and deploy authentication proxies or API gateways to compensate for the lack of built-in authentication. Limit network access to trusted IP ranges and implement network segmentation to contain potential breaches.

Broader Implications for Local LLM Security

These vulnerabilities highlight critical security gaps in the local LLM deployment model that many organizations have adopted to maintain data privacy. The irony is stark: organizations choose local LLM deployment specifically to avoid cloud data exposure, yet insecure implementations create attack vectors that can leak the very data they sought to protect. The use of unsafe memory operations in performance-critical AI infrastructure components represents a concerning pattern as organizations race to deploy AI capabilities without adequate security review.

The 90-day disclosure window for the Windows vulnerabilities has elapsed with no patch available, placing the burden on security teams to implement compensating controls. CERT Polska has assumed coordination of the disclosure process, suggesting potential challenges in Ollama's vulnerability response procedures. With over 300,000 potentially affected servers and Ollama's deep integration into development workflows, the attack surface extends beyond individual instances to entire development pipelines and connected toolchains.

Questions about your exposure?

RedEye Security provides assessments for organizations that need to understand their real risk.

Talk to us