Microsoft patched a vulnerability rated max-critical in its M365 Copilot AI platform last Tuesday. On Monday, the researchers from security firm Varonis who discovered the vulnerability revealed how their proof-of-concept exploit could retrieve two-factor authentication codes and other sensitive data from emails accessible to Copilot. The root cause is AI bots' inability to distinguish between instructions provided by users and those embedded in third-party content the models process. That limitation leaves Microsoft and other LLM providers unable to fully prevent their products from complying with malicious data-retrieval requests.
Varonis Researchers Bypass Copilot Guardrails Using Markdown
Microsoft built guardrails into Copilot to prevent the LLM from submitting web forms, sending emails, and taking similar actions that could exfiltrate user data. Varonis researchers worked around these restrictions using Markdown, a lightweight markup language that adds formatting elements such as headings, lists, and links to text without HTML tags. Another workaround involved wrapping sensitive data inside HTML tags. In both cases, a web request containing the data reaches the attacker's web server, where the secret information is captured in the server logs.
Microsoft implemented additional guardrails, including wrapping Copilot output in blocks that browsers treat as plain text and restricting the sites Copilot can visit without explicit approval. Copilot has blanket permission to send requests to Microsoft domains, but the guardrails restrict requests to untrusted sites.
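The two guardrails described above (rendering output as plain text and limiting outbound requests to trusted domains) can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the function names and the allowlist entries are hypothetical, chosen only to show the general shape of such checks.

```python
import html
from urllib.parse import urlsplit

# Hypothetical allowlist; the article does not list the domains Copilot trusts.
TRUSTED_SUFFIXES = ("microsoft.com", "office.com")


def is_trusted(url: str) -> bool:
    """Return True only for https URLs whose host is a trusted domain
    or a subdomain of one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    # Match the exact domain or a dot-separated subdomain, so that a
    # lookalike such as "microsoft.com.evil.example" is rejected.
    return any(host == s or host.endswith("." + s) for s in TRUSTED_SUFFIXES)


def render_as_text(model_output: str) -> str:
    """Escape HTML metacharacters so a browser displays tags as text
    instead of interpreting them."""
    return html.escape(model_output)
```

Matching on a dot-separated suffix rather than a plain substring is the key design choice here: a substring check would accept any attacker-controlled host that merely contains a trusted name.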
Parameter-to-Prompt Injection Exploits URL Query Parameters
Varonis devised an exploit chain that bypassed these guardrails using what the researchers call a Parameter-to-Prompt Injection. The parameter in this case is q, the URL query parameter that carries the text of a query included in the link. The Parameter-to-Prompt Injection is a close relative of the prompt injection. The difference is that the malicious command is located in the query parameter rather than in an email or other piece of untrusted content.
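To make the role of the q parameter concrete, the following Python sketch shows how text placed in a link's query string is decoded into ordinary text on arrival. The article does not describe how Copilot handles the parameter internally, so this only illustrates standard URL decoding; the host, path, and function name are placeholders.

```python
from urllib.parse import parse_qs, urlsplit


def extract_query_text(url: str) -> str:
    """Return the decoded value of the first `q` parameter, or an empty
    string if the URL has none."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0]


# Hypothetical link: the host and path are placeholders, not a real Copilot URL.
link = "https://copilot.example/chat?q=Summarize%20my%20latest%20email"
print(extract_query_text(link))  # prints: Summarize my latest email
```

Once decoded, the text looks the same as something the user typed, even though whoever built the link chose it. That is why a query parameter has to be treated as untrusted content.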
FAQ
What vulnerability did Microsoft patch in Copilot last Tuesday?
Microsoft patched a max-critical vulnerability in its M365 Copilot AI platform that allowed hackers to retrieve two-factor authentication codes and other sensitive data from emails accessible to Copilot. Varonis researchers who discovered the vulnerability revealed their proof-of-concept exploit on Monday.
How did Varonis researchers bypass Copilot's security guardrails?
Varonis researchers used Markdown, which adds formatting elements without HTML tags, and in another workaround wrapped sensitive data inside HTML tags. They also employed a Parameter-to-Prompt Injection technique that placed malicious commands in URL query parameters rather than in email content. In each case, web requests containing user data reached attacker-controlled servers, where the information was captured in logs.