Animated icon in email subject
Many thanks to Alexander O'Mara for such a well-researched answer about the goomoji-tagged HTML images!
I just wanted to add three things:
There are still many, many emoji (and other Unicode sequences that generate pictures) that spammers and other erstwhile marketers are starting to use in email subject lines and that Gmail does not convert to HTML images. In some browsers these show up bold and colored, which is almost as bad as animation. Browsers could also choose to animate these, but I don't know if any do. These Unicode sequences are displayed by the browser as Unicode text, so the exact appearance (color or not, animated or not, ...) depends on which text rendering system the browser uses. The appearance of a given Unicode emoji also depends on any Unicode variation selectors and emoji modifiers that appear near it in the code point sequence. Unlike the image-based emoji spam, these sequences can be copied and pasted out of the browser and into other apps as Unicode text.
I hope the many marketers reading this Stack Overflow question will just say no. It is a horrible idea to include these sequences in your email subject lines, and it will immediately tarnish you and your brand as lowlife spammers. It is not worth the "attention" your email will get.
Of course the first question coming to everyone's mind is: "how do I get rid of these things?" Fortunately there is this open-source Greasemonkey/Tampermonkey/Violentmonkey userscript:
Gmail Subject Line Emoji Roach Motel
This userscript eliminates both the HTML-image type (thanks to the awesome work of Alexander O'Mara) and the pure-Unicode type.
For the latter type, the userscript includes a regular expression designed to capture the Unicode sequences likely to be abused by marketers. The regex looks like this in ES6 JavaScript (the userscript translates this to widely supported pre-ES6 regex using the amazing ES6 Regex Transpiler):
var re = /(\p{Emoji_Modifier_Base}\p{Emoji_Modifier}?|\p{Emoji_Presentation}|\p{Emoji}\uFE0F|[\u{2100}-\u{2BFF}\u{E000}-\u{F8FF}\u{1D000}-\u{1F5FF}\u{1F650}-\u{1FA6F}\u{F0000}-\u{FFFFF}\u{100000}-\u{10FFFF}])\s*/gu
// which includes the Unicode Emoji pattern from
// https://github.com/tc39/proposal-regexp-unicode-property-escapes
// plus also these blocks frequently used for spammy emojis
// (see https://en.wikipedia.org/wiki/Unicode_block ):
// U+2100..U+2BFF Arrows, Dingbats, Box Drawing, ...
// U+E000..U+F8FF Private Use Area (gmail generates them for some emoji)
// U+1D000..U+1F5FF Musical Symbols, Playing Cards (sigh), Pictographs, ...
// U+1F650..U+1FA6F Ornamental Dingbats, Transport and Map symbols, ...
// U+F0000..U+FFFFF Supplementary Private Use Area-A
// U+100000..U+10FFFF Supplementary Private Use Area-B
// plus any space AFTER the discovered emoji spam
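For readers who want to try the same idea outside a userscript, here is a rough Python sketch of the explicit-range part of that pattern. It is not the userscript's code. It assumes only the hard-coded blocks listed above and leaves out the `\p{Emoji...}` property classes, because Python's built-in `re` module does not support them. The names `SPAM_RANGES` and `strip_subject` are made up for this example.

```python
import re

# Explicit code point ranges copied from the comment list above.
# Assumption: the \p{Emoji...} property classes are omitted, since the
# standard `re` module cannot express them.
SPAM_RANGES = re.compile(
    '['
    '\u2100-\u2BFF'            # Arrows, Dingbats, Box Drawing, ...
    '\uE000-\uF8FF'            # Private Use Area
    '\U0001D000-\U0001F5FF'    # Musical Symbols, Playing Cards, Pictographs, ...
    '\U0001F650-\U0001FA6F'    # Ornamental Dingbats, Transport and Map, ...
    '\U000F0000-\U000FFFFF'    # Supplementary Private Use Area-A
    '\U00100000-\U0010FFFF'    # Supplementary Private Use Area-B
    ']\\s*'                    # plus any whitespace after the match
)

def strip_subject(subject):
    """Remove matched characters (and trailing whitespace) from a subject."""
    return SPAM_RANGES.sub('', subject)

print(strip_subject('Sale \u2728 today'))
print(strip_subject('\U000FE52E Hello'))
```

Because the property classes are missing, this sketch catches fewer sequences than the regex in the userscript.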
#Short description:
They are referred to internally as goomoji, and they appear to be a non-standard UTF-8 extension. When Gmail encounters one of these characters, it is replaced by the corresponding icon. I wasn't able to find any documentation on them, but I was able to reverse engineer the format.
#What are these icons?
Those icons are actually the icons that appear under the "Insert emoticons" panel.
While I don't see the 52E icon in the list, there are several others that follow the same convention.
B0C
4F4
Note that there are also some icons whose names are prefixed, such as gtalk.03C. I was not able to determine if or how these icons can be used in this manner.
#What is this Data URI thing?
It's not actually a Data URI, though it does share some similarities. It's actually a special syntax for encoding non-ASCII characters in email subjects, defined in RFC 2047. Basically, it works like this.
=?charset?encoding?data?=
So, in our example string, we have the following data.
=?UTF-8?B?876Urg==?=
charset = UTF-8
encoding = B (means base64)
data = 876Urg==
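To make the three fields concrete, here is a minimal Python sketch that splits a single encoded word and decodes its payload. The function names `split_encoded_word` and `decode_payload` are invented for illustration. The sketch assumes a single, well-formed encoded word and does not handle full headers with several words or folding.

```python
import base64
import quopri

def split_encoded_word(word):
    """Split '=?charset?encoding?data?=' into its three fields."""
    if not (word.startswith('=?') and word.endswith('?=')):
        raise ValueError('not an RFC 2047 encoded word')
    charset, encoding, data = word[2:-2].split('?')
    return charset, encoding.upper(), data

def decode_payload(encoding, data):
    """Return the raw bytes carried by the data field."""
    if encoding == 'B':    # base64
        return base64.b64decode(data)
    if encoding == 'Q':    # quoted-printable variant used in headers
        return quopri.decodestring(data.encode('ascii'), header=True)
    raise ValueError('unknown encoding: ' + encoding)

charset, encoding, data = split_encoded_word('=?UTF-8?B?876Urg==?=')
print(charset, encoding, data)
print(decode_payload(encoding, data).hex())
```

The second line printed is the hex form of the raw payload bytes, before any UTF-8 decoding.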
#So, how does it work?
We know that somehow, 876Urg== means the icon 52E, but how?
If we base64 decode 876Urg==, we get 0xf3be94ae. This looks like the following in binary:
11110011 10111110 10010100 10101110
These bits are consistent with a 4-byte UTF-8 encoded character.
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
So the relevant bits are the following:
011 111110 010100 101110
Or when aligned:
00001111 11100101 00101110
In hexadecimal, these bytes are the following:
FE52E
As you can see, except for the FE prefix, which is presumably there to distinguish the goomoji icons from other UTF-8 characters, it matches the 52E in the icon URL. Some testing confirms that this holds true for other icons.
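The bit extraction above can be reproduced by hand in Python as a sanity check. This is only an illustrative sketch of the masking and shifting for a 4-byte UTF-8 sequence; the variable names are made up, and in practice `bytes.decode('utf-8')` does the same job.

```python
import base64

raw = base64.b64decode('876Urg==')

# A 4-byte UTF-8 sequence starts with the bit pattern 11110xxx.
assert len(raw) == 4 and raw[0] >> 3 == 0b11110

# Keep the 3 payload bits of the lead byte and the 6 payload bits of
# each continuation byte, then pack them into one integer.
code_point = (
    (raw[0] & 0b00000111) << 18
    | (raw[1] & 0b00111111) << 12
    | (raw[2] & 0b00111111) << 6
    | (raw[3] & 0b00111111)
)

print(format(code_point, 'X'))      # hex code point
print(format(code_point, 'X')[2:])  # without the 'FE' prefix

# Cross-check against Python's own UTF-8 decoder.
print(code_point == ord(raw.decode('utf-8')))
```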
#Sounds like a lot of work, is there a converter?:
This can of course be scripted. I created the following Python code for my testing. These functions can convert the base64 encoded string to and from the short hex string found in the URL. Note, this code is written for Python 3, and is not Python 2 compatible.
###Conversion functions:
import base64

def goomoji_decode(code):
    # Base64 decode.
    binary = base64.b64decode(code)
    # UTF-8 decode.
    decoded = binary.decode('utf8')
    # Get the Unicode code point of the decoded character.
    value = ord(decoded)
    # Hex encode, trim the 'FE' prefix, and uppercase.
    return format(value, 'x')[2:].upper()

def goomoji_encode(code):
    # Add the 'FE' prefix and parse the result as a hexadecimal number.
    value = int('FE' + code, 16)
    # Convert the code point to a character.
    encoded = chr(value)
    # Encode the character to UTF-8 bytes.
    binary = bytearray(encoded, 'utf8')
    # Base64 encode and return the result as a string.
    return base64.b64encode(binary).decode('utf-8')
###Examples:
print(goomoji_decode('876Urg=='))
print(goomoji_encode('52E'))
###Output:
52E
876Urg==
And, of course, finding an icon's URL simply requires creating a new draft in Gmail, inserting the icon you want, and using your browser's DOM inspector.
If you use the correct hex code point (e.g. fe4f4 for 'pile of poo') and it is correctly encoded within the subject line header, whether as base64 (see @AlexanderOMara) or as quoted-printable (=?utf-8?Q?=F3=BE=93=B4?=), then Gmail will automatically parse and replace it with the corresponding emoji.
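As a rough illustration of the two encodings, the following Python sketch builds the encoded word for a short hex code in either form. The helper name `goomoji_encoded_word` is hypothetical, and the 'FE' prefix is taken from the reverse-engineered format described in the other answer, not from any official documentation.

```python
import base64

def goomoji_encoded_word(code, encoding='B'):
    """Build an RFC 2047 encoded word for a short goomoji hex code.

    Assumes the code point is 'FE' followed by the short code.
    """
    raw = chr(int('FE' + code, 16)).encode('utf-8')
    if encoding == 'B':
        payload = base64.b64encode(raw).decode('ascii')
    elif encoding == 'Q':
        # Every byte here is non-ASCII, so each one is written as =XX.
        payload = ''.join('=%02X' % byte for byte in raw)
    else:
        raise ValueError('encoding must be "B" or "Q"')
    return '=?UTF-8?%s?%s?=' % (encoding, payload)

print(goomoji_encoded_word('4F4', 'Q'))
print(goomoji_encoded_word('52E', 'B'))
```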
Here's a Gmail emoji list for copying and pasting into subject lines - or email bodies. Animated emojis, which will grab even more attention in the inbox, are placed on a yellow background: