Tuesday, May 27, 2008

Self-Spam

A small technical exercise to test how well gmail combats spam. Their site promises "Fast, searchable email with less spam".

This is mostly true. However, there is a downside to gmail's spam filters: false positives, i.e. the filter mistakenly flags a good mail as spam.
In my experience, these numbers are particularly high in the case of gmail, compared to, say, using SpamAssassin or Akismet as your spam filter. I've had a fair number of good mails (not so useful, but not spam either) flagged as spam.
This is unacceptable. In my case, I have the patience, and also a weird interest, to go through my spam messages every day. But busy people would consider this a "costly nuisance".
The nuisance is obvious. It's also costly, as the flagged message might contain critical business information.
Most people, however, do not worry much about this, because they trust their spam filters to weed out only the bad guys. As Matt Cutts of Google says so elegantly about Google Search, "Sure, I could stop all the spam in the world if I didn’t have to return any search results." :) [source]

What people don't like, however, is the appearance of { stimulating / enlarging / Nigerian King bequeathing his $100 million / cheap Adobe software or Rolexes / mortgage loans at unbelievable rates } mails in their inbox. This is totally irritating, and it does more damage to the lay user than the fairly rare false-positive case mentioned earlier.
Spam filters involve many techniques, which are constantly under development. These chiefly include heuristics [learning from previous experience], Bayesian filters, simple database matching, matching IPs against known spammers, and lots more involving complex probability/statistical models beyond my wildest dreams.
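To make the Bayesian bit a little more concrete, here is a minimal sketch of a naive Bayes filter in Python. The training messages, the function names (`tokenize`, `train`, `spam_probability`) and the equal priors are all made up for illustration; real filters such as SpamAssassin combine many more signals than word counts.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$]+", text.lower())

def train(messages):
    """Count word frequencies per class from (text, is_spam) pairs."""
    counts = {True: Counter(), False: Counter()}
    for text, is_spam in messages:
        counts[is_spam].update(tokenize(text))
    return counts

def spam_probability(text, counts):
    """Estimate P(spam | text) with naive Bayes and add-one smoothing."""
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(counts[label].values()) for label in counts}
    log_odds = 0.0  # starts at zero because equal priors are assumed
    for word in tokenize(text):
        p_spam = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_ham = (counts[False][word] + 1) / (totals[False] + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Toy training data, invented for this example.
TRAINING = [
    ("cheap rolex watches at unbelievable rates", True),
    ("nigerian king bequeaths 100 million $ to you", True),
    ("cheap adobe software and mortgage loans", True),
    ("lunch meeting moved to tuesday", False),
    ("please review the project report", False),
    ("notes from the weekly meeting", False),
]

model = train(TRAINING)
spammy = spam_probability("cheap mortgage loans and rolex", model)
hammy = spam_probability("project meeting on tuesday", model)
```

With this toy data, `spammy` lands above 0.5 and `hammy` below it. Where you put the cut-off between the two is exactly the false-positive trade-off discussed above.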
So, being the evil chap that I am, I did this nonsensical thing. I sent a spam mail to myself.


Self Spam
Granted, this is stupid. Gmail should place its trust in the sender [me] and the sending server [gmail.com], and hence classify this as legitimate mail.
So I repeated this exercise with a well-known email spoofer, www.pranketh.com. What this nifty project [written by two brilliant chaps from the University of Waterloo] does is pretty simple.
As the site says: "It allows you to send an email, that looks like its sent from someone else." In simple terms,
I can send an email from ids like mukeshambani[at]reliance[dot]in, or soniagandhirocks[at]congressrocks[dot]gov[dot]in.
[Note the delicate usage of "rocks". Blogger belongs to Google. I don't want no risks.] A screenshot of pranketh's page:

Pranketh Page
Yes, I know your doubt. If it's so easy, why do nefarious miscreants use stupid yahoo/gmail ids to threaten people, or to send smokescreen bomb footage to the extremely ridiculous TV channel Aaj Tak [self-proclaimed to be "Sarv-Shresht"]?
The point is the email server, folks. Your id is spoofed, but the SMTP server name? No, no. I couldn't dream in my distant dreams of getting hold of a gov.in SMTP server [since I don't happen to be a Chinese (govt-sponsored) hacker :( ]. As a test case, try sending a pranketh mail to your own id, and check the "Show original" option in gmail. The Received headers there show which server actually sent the mail, whatever the From line claims.
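What to look for in that raw view can be sketched with Python's standard `email` module. The message below is invented for illustration (the host names and the IP address are placeholders, not pranketh's real servers), and the helper names `claimed_domain` and `relay_hosts` are mine. The From line claims one domain, while the Received header, which the receiving server adds, names the machine that handed the mail over.

```python
from email import message_from_string
from email.utils import parseaddr

# A made-up raw message; hosts and addresses are placeholders.
RAW = """\
Received: from mail.spoofer.example (mail.spoofer.example [203.0.113.7])
\tby mx.receiver.example with ESMTP; Tue, 27 May 2008 10:00:00 +0530
From: Big Boss <boss@reliance.example>
To: me@receiver.example
Subject: Hello

This mail only looks like it came from the boss.
"""

def claimed_domain(msg):
    """Domain in the From header, which the sender can set to anything."""
    _, address = parseaddr(msg["From"])
    return address.rpartition("@")[2].lower()

def relay_hosts(msg):
    """Host named after 'from' in each Received header, newest first."""
    hosts = []
    for header in msg.get_all("Received", []):
        words = header.split()
        if len(words) > 1 and words[0].lower() == "from":
            hosts.append(words[1].lower())
    return hosts

msg = message_from_string(RAW)
domain = claimed_domain(msg)
hosts = relay_hosts(msg)
# A crude check: does any relay host belong to the claimed domain?
looks_spoofed = not any(h == domain or h.endswith("." + domain) for h in hosts)
```

This is only a crude comparison. The host name after "from" is itself reported by the sending machine, so the bracketed IP address recorded by the receiver is the more trustworthy part, and plenty of legitimate mail is relayed through servers outside the sender's own domain.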
Ah, the post digresses from its core issue. Let's come back to the main point, shall we?
An important reason for gmail not blocking my self-spam was that it trusts my id and its own servers.
I tried sending the same text through pranketh. Guess what? It thrashes the mail left and right, before it even leaves their servers. Why? They use Akismet.

Pranketh Spam

My advice to gmail [in the remote probability that Matt Cutts is reading this] and other mail providers is this.
Google's servers are checking all our mail contents to generate their automated ads and stuff anyway, so there is no illusion of privacy. So the next time I'm sending a mail, check the contents beforehand. Warn me if it's spammish. Set the thresholds appropriately, so that I'm not regularly annoyed by these warnings.
If all SMTP servers start this routine, we can see at least some major changes.
  1. All the world's emails would take longer to reach their destination. [There's got to be some catch. This is it.]
  2. Say I send one mail to a big bunch of people; it'd be scanned for spam behaviour only once. Then some certification can be piggy-backed along, saying it's reliable and not spam. The experts can handle that bit. Not too difficult.
  3. Botnet prevention. Say some dumbo privately runs an SMTP server that has been subjected to a backdoor/trojan attack, and it is currently acting as a zombie, sending out bunches of viagrish mails to innocent people who've left their email ids lying out in the open. An outbound check would, in principle, stop these at the source.
I'm not saying you give up your earlier approach. That'd be foolish. But if it's absolutely obvious that a mail is spammy [self-spam, for example], block it before it leaves your grounds.
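The "check it before it leaves" idea might look something like this minimal Python sketch. The keyword weights, the `WARN_THRESHOLD` and `BLOCK_THRESHOLD` values and the function names are all assumptions picked for illustration; a real provider would plug its actual filter score in place of `spam_score`.

```python
import re

# Hypothetical keyword weights; a real server would use its own spam score.
SPAM_WEIGHTS = {
    "viagra": 3.0,
    "rolex": 2.0,
    "mortgage": 1.5,
    "million": 1.5,
    "cheap": 1.0,
    "unbelievable": 1.0,
}
WARN_THRESHOLD = 2.0   # at or above this, warn the sender before sending
BLOCK_THRESHOLD = 5.0  # at or above this, refuse to send at all

def spam_score(body):
    """Sum the weights of the distinct spammy words found in the body."""
    words = set(re.findall(r"[a-z]+", body.lower()))
    return sum(weight for word, weight in SPAM_WEIGHTS.items() if word in words)

def outbound_decision(body):
    """Return 'send', 'warn' or 'block' for an outgoing mail."""
    score = spam_score(body)
    if score >= BLOCK_THRESHOLD:
        return "block"
    if score >= WARN_THRESHOLD:
        return "warn"
    return "send"

normal = outbound_decision("Are we still on for lunch on Tuesday?")
borderline = outbound_decision("Found a cheap mortgage broker, want the number?")
self_spam = outbound_decision("Cheap viagra and rolex at unbelievable rates")
```

Here the lunch mail goes straight out, the mortgage mail earns a warning, and the self-spam is blocked. Keeping two thresholds apart is what lets the obvious cases be stopped without nagging people about every borderline mail.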

Now, a general warning to all those who think spam is obnoxious. Your bigdaddydog@rediffmail.com might be the prettiest email id around, but don't leave it on some random website for all the world to see. One syntax-based text crawler, and the spammers get thousands of them.
Believe me, some of these spammers are millionaires [not the Nigerian kind]. They run their business professionally, and have awesome technical expertise too.
If anything, don't make their job easier. Let them just fight it out with the big guys [Yahoo, Google, MSFT et al.].
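To see how little effort the harvesting takes, here is a rough Python sketch of the "syntax-based" part. The page text, the `harvest` function and the deliberately simple regular expression are assumptions for illustration; a real crawler would fetch pages in bulk and use more forgiving patterns.

```python
import re

# A deliberately simple pattern; it does not cover every valid address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

# Made-up page content; the .example addresses are placeholders.
PAGE = """
Contact the webmaster: bigdaddydog@rediffmail.com
Sales: sales@shop.example, support@shop.example
Obfuscated: someone [at] shop [dot] example
"""

def harvest(text):
    """Return the distinct addresses found in the text, in order of appearance."""
    found = []
    for address in EMAIL_PATTERN.findall(text):
        if address not in found:
            found.append(address)
    return found

addresses = harvest(PAGE)
```

The three plain addresses are picked up, while the [at]/[dot] form slips past this naive pattern. A slightly smarter pattern would catch that too, which is why it is only a speed bump.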

I reiterate. If you desperately want to put your email id on the net, use images like these.

email id
And a word of caution: even this is not safe. Within 1-2 years, Google image search is going to search the contents of images. And character recognition? Piece of cake.
So, what do you do next? Do not fear, I have the ideas.
  1. When you must, put your email ids behind reCAPTCHA. I'd written a post about this some time back. My email id through this scheme would be abhi...@gmail.com. Go to their website and register for their free service. The only reason this idea is safe, however, is that spammers, like all of us, are average-Ramesh hard-working people. They do not have time to fill in your captchas, whereas your friends, and people who so desperately want to see your email id, do.
  2. Use images, but this time, write them with 3-D blocks. Even by extreme image-processing hacking standards, this is nearly safe for 5-6 years.
  3. Do not put email id's on the internet.

This is probably the first in a series of spam-related posts to come.

P.S: I kinda remembered the first word of my blog's title, and how I hadn't paid any attention to it, for the past few months.
And, let me clarify. I love gmail.

2 comments:

  1. dude seriously are u so jobless that u spam urself with viagra mails ? thats very disturbing...but the post made for a very nice read...be careful in the future though..publishinkg such posts might attract the attention of the govt [US] and they might deny u a visa considering u to be some kind of a pervert or something..so watch out...

  2. The point was never about vi-agra. I was just using typical keywords of an average-spammer's vocabulary.
    About Pervert-itude, ya. Probably got to refine it a bit.
