Coping With Ridiculous Spam

My email addresses are all over the Internet.

I was webmaster for a pretty early web server, I worked on a number of open source projects, and I had a fairly visible role at Apple for a while.  As a result, whatever email address I used for each project ended up online.  This is good, because it enables people who need to reach me via email to do so.

But, of course, I get an awful lot of spam.

For a long time, a few simple tools kept this manageable, but spam has become harder to filter out of my inbox over time, and things got a bit out of control about a year ago.

Trying to run your own spam filtering tools is a major pain. They are hard to set up and tricky to train. The software I used ended up being a huge consumer of CPU cycles, which was a problem for my kind hosts at Red-Bean. Many of the other users of Red-Bean had already decided to outsource the problem to Google.

Outsourcing the Problem to Google

Google, of course, has a hugely popular email service, and a lot more time and hardware to spend on filtering spam out.  So an easy strategy is to forward all of your email to Gmail.com and read it there.  However, some of us still want our (non-spam) mail to end up on our server so we can do things like set up server-side Procmail rules and the like.

A solution to that is to set up your Gmail account to forward to your server, and then set up your server to forward mail to Gmail, unless it’s already been forwarded by Gmail. The former is easy enough to do in Gmail’s settings, but the latter requires some magic on your server.

Red-Bean uses Exim, an email server (a good one, it seems, but unfortunately GPL’ed, so I’d use Postfix if I were setting it up myself) which includes a built-in filtering engine. To get the desired forwarding to Gmail, you can set up your .forward file like so:

# Exim filter ** DO NOT EDIT OR REMOVE THIS LINE **
if $h_X-Forwarded-For: contains my_gmail_id@gmail.com
then
  deliver my_local_id
else
  deliver my_gmail_id@gmail.com
endif

If you want to use Procmail, you’ll need to change “deliver my_local_id” to “pipe /usr/bin/procmail”.
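
For reference, here is a sketch of what the complete .forward file might look like with that Procmail change applied. The address my_gmail_id@gmail.com is a placeholder, as above, and the path /usr/bin/procmail is an assumption about where Procmail lives on your server:

```
# Exim filter ** DO NOT EDIT OR REMOVE THIS LINE **

# Mail that Gmail has already filtered and forwarded back
# is handed to Procmail for local delivery.
if $h_X-Forwarded-For: contains my_gmail_id@gmail.com
then
  pipe "/usr/bin/procmail"
else
  # Everything else takes a trip through Gmail first.
  deliver my_gmail_id@gmail.com
endif
```

The X-Forwarded-For check is what keeps the two forwarding rules from bouncing the same message back and forth forever: a message only goes out to Gmail if it has not already come back from there.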

OK, But What If I Don’t Trust Google?

There is one problem left here for me: I’m not excited about Google having access to all of my email. Perhaps you think that Google will do no wrong as long as Larry, Sergey, and Eric are in charge. You might be right. But you might be wrong.

I don’t mind them getting my mailing list traffic (which is often archived on the web anyway), so I just want to keep my personal mail from leaving my server. Fortunately, Exim has an easy way to filter personal mail:

# Exim filter ** DO NOT EDIT OR REMOVE THIS LINE **

##
# Mail sent to one of my personal addresses can go straight
# to procmail.
##
if personal alias my_local_id@example.com
            alias my_other_email@other_domain.net
            alias another@another_domain.org
then
  pipe "/usr/bin/procmail"
  finish
endif

##
# Other mail gets forwarded to Google, which forwards what
# it thinks are non-spam messages back to here, which we
# then send to procmail.
##
if $h_X-Forwarded-For: contains my_gmail_id@gmail.com
then
  pipe "/usr/bin/procmail"
else
  deliver my_gmail_id@gmail.com
endif

This seems to work pretty well for me. The documentation on Exim filter files describes many more filtering features built into Exim.
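
One of those features that may help while you are getting this working is logging. The sketch below adds Exim's logfile and logwrite filter commands to the simple Gmail filter so you can see which branch each message took. The log file name is made up for illustration, and your server's administrator may have disabled filter logging, so treat this as a starting point rather than something guaranteed to work on your setup:

```
# Exim filter ** DO NOT EDIT OR REMOVE THIS LINE **

# Hypothetical log location; any file you can write to will do.
logfile $home/exim-filter.log

if $h_X-Forwarded-For: contains my_gmail_id@gmail.com
then
  # Gmail has already seen this message and decided it is not spam.
  logwrite "$tod_log back from Gmail: $h_subject:"
  pipe "/usr/bin/procmail"
else
  # First pass: send it off to Gmail for spam filtering.
  logwrite "$tod_log sent to Gmail: $h_subject:"
  deliver my_gmail_id@gmail.com
endif
```

The same logwrite lines can be dropped into the branches of the longer filter above if you want to confirm that your personal mail really is staying on your server.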

Tuesday, November 3, 2009