
May 12, 2002

How to Good-bye spam

Spam fighting info from Mitch Wagner's 24-hour drive-thru blog.

HOW TO GOOD-BYE SPAM

Elsewhere in cyberspace, I've been participating in a discussion about measures to block spam, and I posted Uncle Mitch's Handy Spam-Fighting Tips. I thought you all might be interested, too, so here's a revised version.

I'll start by saying that I find most of the common wisdom for dealing with spam to be useless, more trouble than it's worth, or both. Most so-called spam experts advise keeping two e-mail addresses. One of them, a private one, would be guarded the way you would guard an unlisted phone number, given only to close friends and business associates with a need to know. The other one, a public e-mail address, would be used for mailing lists, buying stuff from Web storefronts, registering software, Usenet, and other activities that attract spam. I've never seen the sense in having two e-mail addresses, though. Yeah, sure, you might block spam from one of them, but you have to check two e-mail addresses, so where's the reduction in work in that? Moreover, I have no evidence to support this, but I'm convinced that eventually your private e-mail address will get out, and once you get on one spam list you'll be on all of them, and you'll be right back where you started.

So, while I have several e-mail addresses, they're all forwarded to my main address, mwagner@TheWorld.com.

Likewise, the experts say to block any e-mail address you receive spam from. The problem with that is that the spammers change e-mail addresses rapidly; you'll seldom get spam from the same address twice.

The measures I take to avoid spam are two-fold. First off, my two main e-mail providers, The World and sff.net, are very good at filtering out most spam. Second, for the remaining four to six spams I receive every day, I have a set of Eudora filters that route the traffic efficiently.

These filters are pretty complex--I have about 75 of them--but I've set them up one at a time over the course of several years. Each filter takes only about 15 seconds to put in place, and each filter is valuable by ITSELF. In other words, you don't have to take the trouble to set up 75 filters all at once; just one filter will help you out.

Here's how the filters work:

1) First, Eudora looks at the From address to see whether the e-mail is sent from a mailing list. If it's from a mailing list, the e-mail is sent to its own folder, and Eudora stops processing that message.

2) Then Eudora looks at the From address again to see whether the e-mail comes from an e-mail address that I've flagged as a source of high-priority mail. These are the e-mail addresses of my friends and family, colleagues, and some whole domains from small ISPs that I know are really good at keeping spammers off, such as sff.net, dm.net, panix.com and TheWorld.com. If the e-mail is from one of those addresses, Eudora changes the color of the e-mail to red, plays a sound - because I want to jump on those e-mails right away - and then stops processing that message. The message is not moved; it stays in the In box.

3) Then, Eudora once again looks at the From address to see whether the e-mail comes from an address that I've flagged as a source of middle-priority mail--stuff that's important, but not as important as the previous group. Mostly, I set up this group of rules when I was working as a staff writer for computer trade pubs, and mostly these were the e-mail addresses of the companies I was responsible for covering and their PR agencies. Eudora changes the color of the e-mail to blue, doesn't play a sound, leaves the message in the In box, and stops processing that message.

4) Then comes the fun part: the spam trap. Eudora searches the remaining messages for the following keywords: "make money", "unsubscribe", "to be removed", "to remove", "webcam", "shoes", and "china". It also checks the From address to see whether the e-mail comes from known spam sources: the top-level domains .tw, .mx, .ro, .it, and .cn, plus Hotmail accounts. I do get some legitimate e-mail from Hotmail accounts, but that's handled by the previous filters. Eudora will also filter for e-mail formatted in HTML - mostly, HTML-formatted e-mail is spam. Some legitimate e-mail also comes formatted in HTML, but, again, those e-mails are handled by the previous filters.

All of the e-mail caught by this level of filters gets sent to a folder called "junk." Some people have their spam filters set up to automatically delete junk mail, but I don't do that--every few weeks, my spam filter will accidentally trap a piece of legitimate e-mail.

5) Now the NEXT spam trap: The remaining e-mail gets looked at to see if my e-mail addresses appear in the To: or CC: line. Much spam gets sent to BCC'd or undisclosed recipients. If Eudora doesn't see me when it looks at the e-mail headers, then into the junk folder it goes.
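For readers who don't use Eudora, the five steps above can be sketched in Python as an ordered chain of rules where the first match wins. This is only a rough illustration, not Eudora's actual filter format: the `Message` and `Verdict` types, the `classify` function, and every address and folder name below are made-up placeholders, and it assumes the keywords are matched against the subject and body.

```python
from dataclasses import dataclass
from typing import List, Optional

# Every address, domain, and folder name here is an illustrative placeholder.
MAILING_LISTS = {"list@example.org": "lists/example"}   # sender -> folder
HIGH_PRIORITY = {"friend@example.com", "sff.net", "dm.net", "panix.com", "theworld.com"}
MEDIUM_PRIORITY = {"pr@vendor.example"}
SPAM_KEYWORDS = ("make money", "unsubscribe", "to be removed", "to remove",
                 "webcam", "shoes", "china")
SPAM_TLDS = (".tw", ".mx", ".ro", ".it", ".cn")
SPAM_DOMAINS = {"hotmail.com"}
MY_ADDRESSES = {"me@example.com"}


@dataclass
class Message:
    sender: str
    to: List[str]
    cc: List[str]
    subject: str
    body: str
    is_html: bool = False


@dataclass
class Verdict:
    folder: str = "In"            # where the message ends up
    color: Optional[str] = None   # label color, if any
    play_sound: bool = False


def sender_matches(sender: str, entries: set) -> bool:
    """True if the full address or its domain is listed in entries."""
    domain = sender.rpartition("@")[2]
    return sender in entries or domain in entries


def classify(msg: Message) -> Verdict:
    """Apply the rules in order; the first rule that matches stops processing."""
    sender = msg.sender.lower()
    domain = sender.rpartition("@")[2]

    # 1) Mailing lists go to their own folder.
    if sender in MAILING_LISTS:
        return Verdict(folder=MAILING_LISTS[sender])

    # 2) High-priority senders: red, with a sound, left in the In box.
    if sender_matches(sender, HIGH_PRIORITY):
        return Verdict(color="red", play_sound=True)

    # 3) Middle-priority senders: blue, no sound, left in the In box.
    if sender_matches(sender, MEDIUM_PRIORITY):
        return Verdict(color="blue")

    # 4) Spam trap: keywords, suspect sender domains, or HTML formatting.
    text = f"{msg.subject} {msg.body}".lower()
    if (any(keyword in text for keyword in SPAM_KEYWORDS)
            or domain.endswith(SPAM_TLDS)
            or domain in SPAM_DOMAINS
            or msg.is_html):
        return Verdict(folder="junk")

    # 5) Second spam trap: none of my addresses appears in To or CC.
    recipients = {addr.lower() for addr in msg.to + msg.cc}
    if not recipients & MY_ADDRESSES:
        return Verdict(folder="junk")

    # Nothing matched: leave it unmarked in the In box.
    return Verdict()
```

The ordering is the whole trick: because the priority rules run before the spam traps, a friend who writes from a Hotmail account or sends HTML mail never reaches step 4.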

Like I said, this is hairy, but I've been working on it for years. And it's not as hairy as it looks--about a year ago, I switched e-mail clients, and I was able to convert all my filters over in a couple of hours. And your spam filters don't have to be as hairy as that; you'll start getting value from the first filter you put in.

The end result is that, as I sit at my computer during the day, Eudora runs in the background the whole time checking e-mail. When an e-mail comes from an important person, Eudora makes a sound and I know to check it. I also check a couple of times a day to see if any OTHER e-mail has come in. I check mailing lists once or twice a day.

If I've been away from e-mail for a day or more, I can glance at my in-box, look at the e-mail marked red, and tackle that first. Then I tackle the blue e-mail, then the stuff that's not marked in any color. Later on, when I'm good and ready, I tackle the junk mailbox.
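That catch-up order amounts to a stable sort on the color label. Here is a tiny Python sketch of it, assuming each in-box entry is a hypothetical `(color, subject)` pair; the `triage` function and `TRIAGE_ORDER` table are names made up for this example.

```python
# Lower number = read sooner. None stands for mail with no color label.
TRIAGE_ORDER = {"red": 0, "blue": 1, None: 2}


def triage(inbox):
    """Return (color, subject) pairs with red first, then blue, then unmarked.

    sorted() is stable, so messages with the same color keep their arrival order.
    """
    return sorted(inbox, key=lambda item: TRIAGE_ORDER[item[0]])
```

The junk folder isn't in the table because it is a separate mailbox, checked last and at leisure.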

Probably this system is too complicated to be worthwhile for most people, but I used to get literally HUNDREDS of e-mails every day when I was a staff writer at computer magazines during the dotcom boom. Now I get a lot less, but these rules still come in handy for me.

[24-hour drive-thru]