Researchers have unveiled a new cyberattack method that leverages AI image downscaling to hide malicious prompts capable of stealing sensitive user data. The technique, developed by Trail of Bits researchers Kikimora Morozova and Suha Sabi Hussain, shows how attackers can exploit common image-processing pipelines used in AI-powered platforms.
The concept builds on a theory from a 2020 USENIX paper by TU Braunschweig, which first explored the idea of image-scaling attacks in machine learning. Now, Trail of Bits has demonstrated that this concept can be weaponized in real-world systems, posing significant risks to organizations and end-users.
When users upload images to AI systems, the files are usually automatically downscaled to reduce processing costs and improve performance. During this step, image resampling algorithms—such as nearest neighbor, bilinear, or bicubic interpolation—alter the pixels. These changes may create aliasing artifacts that can unintentionally reveal hidden instructions embedded within the original image.

Source: Zscaler
For example, in a demonstration, maliciously crafted dark areas of an image turned red after bicubic downscaling, while black text became visible. This text acted as covert prompts that the AI interpreted as part of the user’s request. As a result, the model could execute unauthorized instructions, leading to data exfiltration or unintended system actions, all without the user’s knowledge.
The researchers successfully tested the attack on multiple platforms, including:
- Google Gemini CLI
- Vertex AI Studio (Gemini backend)
- Gemini’s web interface and API via llm CLI
- Google Assistant on Android
- Genspark
In one proof-of-concept, attackers managed to exfiltrate Google Calendar data to an arbitrary email address by exploiting Gemini CLI with Zapier MCP under “trust=True.” This demonstrated how subtle manipulations can bypass normal safeguards.
To further explore their findings, the team developed Anamorpher, an open-source tool (currently in beta) capable of generating such malicious images tailored for different downscaling algorithms.
Mitigation strategies
Trail of Bits suggests several defense mechanisms against this threat. One key recommendation is to apply dimension restrictions on uploaded images, preventing potentially malicious large files from being downscaled in ways that expose hidden data. They also advise showing users a preview of the final downscaled image that will be processed by the large language model (LLM).
Additionally, explicit user confirmation should be required before executing sensitive tool calls, especially when hidden text is detected within an image. Most importantly, researchers emphasize the need for secure AI design patterns that can resist prompt injection attacks, including multimodal vulnerabilities where both text and images can be manipulated.
Conclusion
This newly discovered AI image-scaling attack highlights how adversaries are finding increasingly creative ways to bypass traditional defenses. With AI adoption accelerating across industries, securing multimodal systems is no longer optional—it’s essential. Organizations must adopt stronger safeguards and proactive monitoring to ensure that the very tools designed to empower users don’t become vectors of exploitation.





