Is using 'dot' and 'at' in email addresses in public text still useful?
To understand this, we must understand how crawlers find the email. While steering away from the technicals, the basic idea is this (today's algorithms are, of course, smarter than that):
- Find @ in the page.
- Is there a dot within 255 characters after the @?
- Grab what's behind the @ until you reach a space or the beginning of the line.
- Grab the . and what's behind it until you reach the @.
- Grab what's after the . until you reach the end of the line or a space.
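The steps above can be sketched in a few lines of Python. This is only an illustration of the naive idea, not how any real harvester works; the function name, the whitespace delimiters and the stripping of trailing punctuation are my own assumptions.

```python
def naive_find_emails(text, max_gap=255):
    """Naive scan for plain addresses; hypothetical helper for illustration."""
    delimiters = " \t\r\n"
    found = []
    pos = text.find("@")
    while pos != -1:
        # Walk backwards from the @ to a space or the start of the line.
        start = pos
        while start > 0 and text[start - 1] not in delimiters:
            start -= 1
        # Walk forwards from the @ to a space or the end of the line.
        end = pos + 1
        while end < len(text) and text[end] not in delimiters:
            end += 1
        local = text[start:pos]
        # Assumption: drop sentence punctuation stuck to the end of the address.
        domain = text[pos + 1:end].rstrip(".,;:")
        # Require a dot after the @, within max_gap characters.
        if local and "." in domain[:max_gap]:
            found.append(local + "@" + domain)
        pos = text.find("@", end)
    return found
```

Calling it on a sentence that contains a plain address returns that address, while the same sentence written with 'at' and 'dot' returns nothing, which is exactly why the substitution trick worked at first.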
Now, an easy countermeasure would be to replace the @ with 'at' and the . with 'dot'. The most intuitive counter-countermeasure would be to teach the crawler that 'at' is actually @. Well, it's not that simple. Take the following text:
We climbed into the attic and found a dotted piece of wood. Please email us: adnan at gmail dot com.
Now let's run our new crawler on it. First it will find the 'at' in 'attic', then it will find the 'dot' in 'dotted'. The resulting email would be a bogus address stitched together from the words around them; only then will it find the second, real address from the last sentence. Then spammers started teaching crawlers to look for known domain names, to ignore spaces or take them into account, and so on.
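To see the false positive for yourself, here is a rough sketch of such a "taught" crawler. It blindly substitutes every 'at' and 'dot', then follows the same naive steps while ignoring spaces. The function name and the exact way the pieces are glued together are assumptions made for illustration.

```python
def naive_decode_and_scan(text, max_gap=255):
    """Hypothetical crawler that treats every 'at' as @ and every 'dot' as a dot."""
    # Blind substitution, even inside words such as "attic" and "dotted".
    decoded = text.replace("at", "@").replace("dot", ".")
    candidates = []
    pos = decoded.find("@")
    while pos != -1:
        # Look for a dot within max_gap characters after the @.
        dot = decoded.find(".", pos + 1, pos + 1 + max_gap)
        if dot != -1:
            # Local part: the last word before the @.
            before = decoded[:pos].split()
            local = before[-1] if before else ""
            # Domain: everything between the @ and the dot, spaces ignored.
            domain = "".join(decoded[pos + 1:dot].split())
            # Top-level part: the first word after the dot.
            after = decoded[dot + 1:].split()
            tld = after[0].rstrip(".,;:") if after else ""
            if local and domain and tld:
                candidates.append(f"{local}@{domain}.{tld}")
        pos = decoded.find("@", pos + 1)
    return candidates
```

Run on the sample sentence above, this sketch returns two candidates: a nonsense address built from 'the', 'attic' and 'dotted', followed by the intended one. The spammer now has to pay for filtering out the junk.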
Then we started using images, and spammers used OCR. We started using JavaScript tricks, inserting comments, URL-encoding, etc., and the spammers always found a way around them. It's a race.
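As a rough illustration of two of these tricks, and of how cheap they are to undo, here is a minimal sketch. All function names and the filler comment text are made up for the example, and it assumes the encoded address ends up somewhere the browser will decode it (HTML for the comments, a link target for the percent-encoding).

```python
import re
from urllib.parse import unquote

def percent_encode_all(address):
    # Encode every byte, including letters, as %XX.
    return "".join(f"%{byte:02X}" for byte in address.encode("utf-8"))

def insert_comments(address, filler="<!-- nospam -->"):
    # HTML comments are not rendered, so a human reader still sees the address.
    return address.replace("@", f"{filler}@{filler}").replace(".", f".{filler}")

def strip_comments(markup):
    # What a slightly smarter crawler does before scanning the page.
    return re.sub(r"<!--.*?-->", "", markup)
```

Both transformations are reversed with one standard call each (`unquote` and a single regular expression), which is the race in miniature: every trick costs the crawler author a line or two.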
That said, the most basic techniques usually give good enough results (apparently, in some places in the world, that link is NSFW; personally, I disagree), and the more you obfuscate, the better the results you get.
So, to directly answer your question: Is using 'dot' and 'at' in email addresses in public text still useful? Yes, I think so, at least to some degree. But this solution has been around long enough for us to assume that some crawlers have already found a way around it.
My advice? Either use some fancy advanced munger, or simply use images.
In my humble opinion, email obfuscation (of any sort) is one of the worst ideas ever invented.
The foremost concern for any user interface, web based or any other, is convenience and safety of its users. Spam bots are not users, thus they are not worth any consideration or effort.
The logic goes as follows:
1. E-mail obfuscation is a nuisance for legitimate users. Rather than simply clicking the mailto link, the user is forced to type the e-mail address manually into their mail client's address field.
1.a. Even this by itself may deter the user from contacting the intended address: they will go elsewhere simply to avoid the tedious interaction.
1.b. The chance of entering a similar but wrong address in the process, and thus sending possibly important mail to some typo-squatting mailbox, is very high.
2. Most legitimate e-mail addresses in existence are already known to spammers. Every mailbox I've encountered to date (and that is a rather large number of mailboxes) was receiving some volume of spam on a regular basis. This is why all contemporary mail servers and clients come with spam filter integration, which, in most cases, is very effective.
In short, just use plain and normal "mailto:" links and don't annoy your users unnecessarily.
I have never understood this paradigm, not since its conception. We are simply depriving spam-fighting software of the data it needs. As mentioned before, adding "at" and "dot" to the parser is trivial too.
I would actually urge the opposite. Let all hell loose. Use your email, and any email for that matter. I even wrote a bot 10 years ago or so that produced infinite random emails, page after page. If a crawler hit it, it would crawl non-existent emails forever.
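I no longer have that bot, but the idea can be sketched roughly like this. Everything here is an assumption for illustration: the function names, the /page/N URL scheme, the page size, and the use of the reserved .invalid top-level domain so that no generated address can ever belong to a real person.

```python
import random
import string

def random_email(rng):
    # Random lowercase local part and domain label under the reserved .invalid TLD.
    local = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(5, 10)))
    domain = "".join(rng.choices(string.ascii_lowercase, k=rng.randint(5, 10)))
    return f"{local}@{domain}.invalid"

def render_page(page_number, per_page=20):
    # Seed with the page number so the same page always shows the same content.
    rng = random.Random(page_number)
    emails = [random_email(rng) for _ in range(per_page)]
    items = "\n".join(f'<li><a href="mailto:{e}">{e}</a></li>' for e in emails)
    # Every page links to the next one, so a crawler never runs out of pages.
    return f'<ul>\n{items}\n</ul>\n<a href="/page/{page_number + 1}">next</a>'
```

Hooking `render_page` up to any web framework's route for /page/N is all it takes; each request costs the server almost nothing while the crawler collects addresses that lead nowhere.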
We should not reduce the number of emails spam bots have to process. We should increase it, so that the resource requirements, and hence the cost of running a spam operation, go up and spam becomes less feasible economically.
We should take the quality of spam filters into account when choosing a mail service, so that services with good filters benefit economically while spam keeps hurting the rest.
We have many instruments in place today that did not exist a decade ago: DKIM, SPF, reverse DNS (PTR) checks, blacklists and whatnot. Spam is getting less and less attractive. We should push that forward. Let these systems handle the load, not us.