Claude Cowork is vulnerable to file exfiltration attacks via indirect prompt injection as a result of known-but-unresolved isolation flaws in Claude's code execution environment.
Anthropic shipped Claude Cowork as an "agentic" research preview, complete with a warning label that quietly punts core security risks onto users. The problem is that Cowork inherits a known, previously disclosed isolation flaw in Claude's code execution environment—one that was acknowledged and left unfixed. The result: indirect prompt injection can coerce Cowork into exfiltrating local files, without user approval, by abusing trusted access to Anthropic's own API.
The attack chain is depressingly straightforward. A user connects Cowork to a local folder, uploads a seemingly benign document (or "Skill") containing a concealed prompt injection, and asks Cowork to analyze their files. The injected instructions tell Claude to run a curl command that uploads the largest available file to an attacker-controlled Anthropic account, using an API key embedded in the hidden text. Network egress is "restricted," except for Anthropic's API, which conveniently sits on the allowlist. Because the request goes to a trusted destination, nothing checks whose account the API key belongs to, and the upload completes the data theft.
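To make the allowlist gap concrete, here is a minimal sketch in Python of the difference between an egress policy that looks only at the destination host and one that also checks the credential. It is an illustration under stated assumptions, not Cowork's actual sandbox logic: the function names, the `ALLOWED_HOSTS` set, the example URLs, and the key strings are all hypothetical, and the source does not describe how the real allowlist is implemented.

```python
from urllib.parse import urlparse

# Hypothetical allowlist. The real sandbox configuration is not described
# in the source; this host is assumed purely for illustration.
ALLOWED_HOSTS = {"api.anthropic.com"}


def host_only_policy(url: str, api_key: str) -> bool:
    """Permit a request if its host is allowlisted.

    The api_key argument is deliberately ignored: this models a policy
    that trusts the destination and never asks whose account is used.
    """
    return urlparse(url).hostname in ALLOWED_HOSTS


def key_pinned_policy(url: str, api_key: str, session_key: str) -> bool:
    """Permit a request only if the host is allowlisted AND the credential
    matches the key that belongs to the current user's session."""
    host_ok = urlparse(url).hostname in ALLOWED_HOSTS
    return host_ok and api_key == session_key


# Placeholder values; none of these are real endpoints or credentials.
UPLOAD_URL = "https://api.anthropic.com/placeholder-upload-path"
USER_KEY = "user-session-key"
ATTACKER_KEY = "attacker-embedded-key"

# Host-only check: the attacker's key passes because only the host is inspected.
injected_allowed = host_only_policy(UPLOAD_URL, ATTACKER_KEY)

# Key-pinned check: the same request is refused, while the user's own is allowed.
injected_blocked = not key_pinned_policy(UPLOAD_URL, ATTACKER_KEY, USER_KEY)
legit_allowed = key_pinned_policy(UPLOAD_URL, USER_KEY, USER_KEY)
```

Under the host-only policy, a request carrying the attacker's key is indistinguishable from a legitimate one, which is the gap the attack chain relies on. Pinning the credential is only one possible hardening and is shown here as an assumption, not as a fix the vendor has proposed.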
Once the file is uploaded, the attacker can chat with the victim's documents, including financial records and PII. This works not just on lightweight models, but also on more "resilient" ones like Opus 4.5. Layer in Cowork's broader mandate (browser control, MCP servers, desktop automation) and the blast radius only grows. Telling non-technical users to watch for "suspicious actions" while encouraging full desktop access isn't risk management; it's abdication.