AI Interview Series #2: Explain Some of the Common Model Context Protocol (MCP) Security Vulnerabilities

Tool Poisoning

A Tool Poisoning Attack happens when an attacker inserts hidden malicious instructions inside an MCP tool’s metadata or description.

Users only see a clean, simplified tool description in the UI.

LLMs, however, see the full tool definition — including hidden prompts, backdoor commands, or manipulated instructions.

This mismatch allows attackers to silently influence the AI into harmful or unauthorized actions.

Tool Hijacking

A Tool Hijacking Attack happens when you connect multiple MCP servers to the same client, and one of them is malicious. The malicious server injects hidden instructions inside its own tool descriptions that try to redirect, override, or manipulate the behavior of tools provided by a trusted server.

In this case, Server B pretends to offer a harmless add() tool, but its hidden instructions try to hijack the email_sender tool exposed by Server A.

MCP Rug Pulls

An MCP Rug Pull happens when a server changes its tool definitions after the user has already approved them. It’s similar to installing a trusted app that later updates itself into malware — the client believes the tool is safe, but its behavior has silently changed behind the scenes.

Because users rarely re-review tool specs, this attack is extremely hard to detect.