On 27 December 2017 Microsoft announced that support for EAI (Email Address Internationalization) in email had been implemented. This means that the Microsoft cloud now supports EAI email addresses in Outlook, mail flow, connectors and rules. For now there is no support for adding IDN domains to the platform itself. With this new functionality in Office 365, a new door opens for phishing scams.
What is Email Address Internationalization, and why is this a threat?
EAI is defined in a set of RFCs (RFC 6530, RFC 6531, RFC 6532, RFC 6533) that were published in 2012 to enable non-ASCII characters to be used in SMTP. Effectively this means that non-ASCII characters can now be used in both the local and the domain part of an email address. RFC 6530 describes email based on the UTF-8 encoding, which in turn allows all Unicode characters to be used. RFC 6531 provides a mechanism for SMTP servers to negotiate transmission of SMTPUTF8 content. This means, for example, that you can now use letters from the Greek alphabet in an email address.
Samples of Unicode letters that become available (you can use the Windows Character Map to see them all):
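To make the negotiation a bit more concrete: a server that supports RFC 6531 lists the keyword SMTPUTF8 in its reply to EHLO. The following is a minimal sketch that checks a reply for that keyword. The function name and the sample reply (including the host name mx.example.com) are made up for illustration and are not taken from a real server.

```python
def advertises_smtputf8(ehlo_reply: str) -> bool:
    """Return True if a multi-line EHLO reply lists the SMTPUTF8 keyword."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD params" or "250 KEYWORD" (last line).
        if len(line) < 4 or not line.startswith("250"):
            continue
        keyword = line[4:].split(" ", 1)[0]
        if keyword.upper() == "SMTPUTF8":
            return True
    return False


# Hypothetical EHLO reply, for illustration only.
SAMPLE_REPLY = (
    "250-mx.example.com Hello\r\n"
    "250-SIZE 36700160\r\n"
    "250-8BITMIME\r\n"
    "250 SMTPUTF8"
)

print(advertises_smtputf8(SAMPLE_REPLY))
```

If the keyword is missing, a sending server cannot hand over a message with non-ASCII addresses to that hop.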
“Ã ? Æ a O ʢ ʁ ˩ Ξ Ϣ տ ق ᴂ ᶻ Ợ ἥ ▒ ╦ Ꞧ ꝿ ﺾ ﴿”
While this is a noble attempt to allow people with non-Latin letters in their name or domain to finally write it the proper way, you can imagine that it also opens up a whole set of new options for phishing. The majority of these letters are very distinct from Latin letters and are easily recognizable. The problems start with a small subset of characters that are almost identical to Latin letters. If these letters are substituted in a well-known domain, it becomes virtually impossible for a human reader to tell whether it is the real thing or a scam attempt. These days spammers make sure that their domains are validated with SPF, DKIM and DMARC, so these mails will not be blocked by a spam filter on the grounds of an invalid domain. The human reader will see no difference in the “mail from”, so the likelihood of successful data theft or credential theft increases drastically.
Samples of letters that are not recognizable as non-Latin:
Display letter | Unicode | Description |
---|---|---|
ս | “U+057D” | Armenian small letter Seh |
ſt | “U+FB05” | Latin small ligature long s t |
ꭇ | “U+AB47” | Latin small letter r without handle |
ᴍ | “U+1D0D” | Latin letter small capital M |
а | “U+0430” | Cyrillic small letter a |
Ͳ | “U+0372” | Greek capital letter archaic sampi |
Even an entire domain part can be replaced with Unicode characters (example of a Unicode-only domain: contact@Τесн-ѕеᴠᴠу.nl).
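Since the eye cannot tell these characters apart, a script can. Below is a minimal sketch, using only Python's standard unicodedata module, that lists every non-ASCII character in a string with its code point and Unicode name. The "script family" check is a rough heuristic of my own (the first word of the Unicode character name), not an official Unicode script lookup, and the spoofed string is a made-up example.

```python
import unicodedata


def non_ascii_report(text):
    """List (character, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]


def script_families(text):
    """Rough heuristic: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split(" ", 1)[0]
        for ch in text
        if ch.isalpha()
    }


# "paypal" with the first "a" swapped for Cyrillic U+0430 (hypothetical example).
spoofed = "p\u0430ypal"
print(non_ascii_report(spoofed))
print(script_families(spoofed))
```

A label that mixes more than one family, such as Latin and Cyrillic, is a good reason to take a closer look.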
How are these new formats being processed by DNS and MTA?
The technology behind the domain part of these addresses is called punycode. Punycode encodes domain names that contain non-Latin characters into a special ASCII-only format; this only works if all servers and clients that process the mail support punycode. These days most modern clients, like Outlook, do support punycode. A domain name that uses punycode is also called an IDN (Internationalized Domain Name). Not all TLDs support the IDN format. For example, the TLD “.com” does support it where “.nl” does not. A list of TLDs that support IDN can be found here. If we take the sample “contact@Τесн-ѕеᴠᴠу.nl” used above and convert it to punycode, we get “contact@xn----1mb35ab2bxar6l239ua.nl” as the real domain name that will be used for SPF, DKIM, and DMARC validation. This will of course pass all tests, as the owner of tech-savvy.nl never thought of registering that domain and publishing SPF “-all” on it, as I mentioned in the SPF best practices.
The following screenshot shows how it looks in Outlook when sent to my Microsoft account in Office 365. In this sample all characters have been replaced, which makes it a bit more obvious, but it drives the point home. Notice that only the mail headers reveal the true address, so even the “press reply test” won’t help in identifying this as a phishing mail. Also, pay attention to how closely the “e” after the @ resembles the “e” before the @. In modern versions of Outlook a small extra line is inserted if an EAI address is detected; whether this is visible depends on the client.
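The conversion can be tried out with Python's built-in "punycode" codec. This is a minimal sketch: the helper names are my own, and it only lowercases each label before encoding, whereas full IDNA processing applies more mapping and validation steps, so the result for exotic input may differ from what a registrar or mail server produces.

```python
def to_ace(label: str) -> str:
    """Encode one domain label to its ASCII Compatible Encoding (ACE) form."""
    label = label.lower()
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return "xn--" + label.encode("punycode").decode("ascii")


def domain_to_ace(domain: str) -> str:
    """Encode every label of a domain name separately."""
    return ".".join(to_ace(part) for part in domain.split("."))


print(domain_to_ace("b\u00fccher.example"))  # xn--bcher-kva.example
print(domain_to_ace("tech-savvy.nl"))        # tech-savvy.nl
```

A label that contains a hyphen and otherwise only non-ASCII letters ends up with four hyphens after “xn”, because punycode first copies the ASCII characters (here just the hyphen) and then adds its own hyphen as a delimiter.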
How can I protect myself from this kind of attack?
At the time of writing there is no option to turn off this behavior. If EAI is enabled on your tenant, it is on and working.
But wait. Does this mean I am screwed right now?
There is hope, but it requires intervention from an admin. To prevent ordinary (non-internationalized) domain names containing hyphens from being accidentally interpreted as punycode, internationalized domain name punycode sequences have a so-called ASCII Compatible Encoding (ACE) prefix, “xn--”, prepended. EAI support is also implemented in transport rules, so we can create a regex transport rule to block all domains starting with the prefix “@xn--” and/or deliver them to quarantine.
An example of a rule you can create to block this (notice we use “@xn--”; this still allows EAI in the local part for users. If all EAI traffic is to be blocked, just use “xn--”):
After the rule is created we can test the deployment again and see that the mail now ends up in quarantine. Of course you can change the action to your own liking.
Enjoy your now safer, EAI spam attack protected O365 tenant, and don’t forget to implement a similar countermeasure on-premises if you still receive email there directly. This does of course require EAI support on your edge MTA.
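The matching logic of such a rule can be sketched in a few lines of Python. This is not the transport rule itself, only an illustration of what the two patterns do and do not catch; the function name, the block_all switch and the sample addresses are hypothetical, and it assumes the rule sees the sender address in its punycode form.

```python
import re

NARROW = re.compile(r"@xn--", re.IGNORECASE)  # ACE prefix directly after the @
BROAD = re.compile(r"xn--", re.IGNORECASE)    # ACE prefix anywhere in the address


def should_quarantine(sender: str, block_all: bool = False) -> bool:
    """Return True if the sender address matches the chosen pattern."""
    pattern = BROAD if block_all else NARROW
    return pattern.search(sender) is not None


print(should_quarantine("contact@xn--bcher-kva.example"))   # True
print(should_quarantine("contact@tech-savvy.nl"))           # False
```

Note that the narrow pattern only looks at the first label after the @, so a punycode label further to the right (for example in “contact@mail.xn--bcher-kva.example”) is only caught by the broad pattern.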
Martijn (Scriptkiddie) van Geffen