Apr 03, 2026
1 min read

Google DeepMind Researchers Map Out Ways Hackers Hijack AI Agents

Google DeepMind researchers have released a paper detailing how autonomous AI agents can be hijacked.

_{Photo credit: NorthSky Films / Shutterstock.com}

Google DeepMind researchers have released a paper detailing how autonomous AI agents can be hijacked, warning that the internet can be weaponized against agentic systems.

The paper, entitled AI Agent Traps, argues that the open internet can be a threat to AI systems designed to browse and act independently online. Individuals and companies are adopting AI agents for a wide range of administrative tasks, such as making transactions and managing emails. Unlike traditional software, agents interpret messy, untrusted content at scale, making them vulnerable to manipulation.

The study explains in its abstract:

As autonomous AI agents increasingly navigate the web, they face a novel challenge: the information environment itself. This gives rise to a critical vulnerability we refer to as ‘AI Agent Traps’, i.e. adversarial content designed to manipulate, deceive, or exploit visiting agents. … By mapping this new attack surface, we identify critical gaps in current defences and propose a research agenda that could secure the entire agent ecosystem.

The paper identifies six main attack types. Content injection traps hide malicious instructions in code or metadata that the AI agent sees, but a human does not. Semantic manipulation affects the agent’s reasoning through persuasive language or misleading framing in a similar manner to how humans can be taken in by this language.

Cognitive state traps distort an agent’s memory, causing it to treat falsehoods as facts. Behavioral control traps directly override safeguards, forcing agents to leak sensitive data, with a high success rate. Systemic traps exploit multiple agents at a time to potentially trigger cascading problems. Finally, human-in-the-loop traps can trick the human users reviewing outputs into approving harmful actions.

The researchers recommend layered technical defences, including adversarial training, runtime content scanners, and preventative output monitoring. They also advise stricter standards for determining which content is AI-readable as well as reputation systems for website domains.

The study notes a gap in legal accountability as it is currently unclear where liability lies if an AI agent is manipulated into causing harm.

Share this article

AI Beginner Global News

Relevant articles

news
2 weeks ago
1 min read

Rogue AI Agent at Meta Triggers Security Incident After Exposing Sensitive Data

Luke Owain Boult
Content Writer

news
1 week ago
1 min read

Washington State Criminalizes Forged Digital Likenesses in New AI Deepfake Law

Luke Owain Boult
Content Writer

news
Mar 6, 2026
1 min read

Roblox Deploys AI to Rewrite Swearing and Slurs in Real-Time Chat

Roblox has introduced a new AI-powered feature that automatically rewrites inappropriate language in chat messages in real time.

Luke Owain Boult
Content Writer

news
Feb 17, 2026
2 min read

Developer Warns AI Agent’s Defamation Post Shows Risks of Autonomous AI

Luke Owain Boult
Content Writer

What is Sumsub anyway?

Not everyone loves compliance—but we do. Sumsub helps businesses verify users, prevent fraud, and meet regulatory requirements anywhere in the world, without compromises. From neobanks to mobility apps, we make sure honest users get in, and bad actors stay out.