Minh’s Notes

Human-readable chicken scratch

Minh Nguyễn
June 3rd, 2006


Going your own way with spam

Peter’s starting to have some trouble with spam. It would’ve happened a long time ago, but he wrote his own blogging software, which eluded the solicitors for a while. Since his software works differently from the more popular blogging packages, he escaped the algorithms that cover larger swaths of the Internet. But not anymore. His situation is a good example of the pains that spammers will go through to make sure that they are heard, if not by us, then by the likes of Google.

Short of requiring registration or going through the trouble of implementing a CAPTCHA system (which has serious accessibility problems), the best way to avoid a deluge of unwanted comments is to stay under the radar, by having your blog work ever so slightly differently. These ideas only work for more configurable blogging systems, such as your own homegrown blogging software, and some work much more effectively than others:

  1. Ask a challenge question. Require the user to answer a randomly changing question that is easy for humans to understand, but hard for robots to predict the answer to. This solution presents a number of problems. Avoid asking math questions, since computers are quicker at figuring out simple arithmetic than humans are, and more advanced math will turn away even intelligent visitors who don’t want to do extra homework on a Saturday morning. Asking any question that requires a textual answer may require you to parse the answer, since the visitor may give you a complete sentence, or capitalize in a weird way, or write in Scottish English, ad nauseam. If you, for example, ask what the capital of the United States is, you might turn off commenters from other countries, even if they know the answer – it’s easy for Americans to appear Americentric.
  2. Give your blog original permalinks. When bots finally figure out how to submit a comment successfully, if they find out that your blog entries’ permalinks are only differentiated by a number tacked onto the end, you’ve got a problem: a bot can simply cycle through the numbers and spam every single entry you have. On the other hand, if the permalinks include information like the post date and a post slug, then the bot has to go through the trouble of spidering your site.
  3. Switch markup schemes. Much of the spam these days depends on being able to include HTML links in the comment. If your blog uses a language like Markdown instead, however, you can prevent spammers from accomplishing their goal. They’ll still come, but now you can more easily distinguish the human posts from the bot-generated ones, just by looking at which markup language the author uses. Markdown is easier to type than HTML is, so a simple guide to your preferred markup language above the comment box should typically suffice.
  4. Turn your comment form into a contact form. This is the solution I ultimately suggested to Peter. Add a question “Do you want to make this comment public?” and set it to “No” by default. Any “private” comments are directed to your inbox, where a decent mail client such as Thunderbird or a decent mail service such as Gmail will happily filter out all the spam. This way, your comment form doubles as an easily discoverable way for your visitors to get in touch with you, without having to make all their thoughts known. On the other hand, if they do want their thoughts known, they have only to check the box, which you can make big and bold to remind them.
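
For anyone rolling their own software, the first three ideas each come down to a few lines. What follows is only a rough sketch in Python, not anything Peter's blog actually runs: the function names, the URL layout, and the choice to accept an answer anywhere inside a longer reply are all assumptions made for illustration.

```python
import re
import unicodedata
from datetime import date


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.findall(r"[a-z0-9]+", no_marks.casefold()))


def answer_matches(reply: str, accepted: set[str]) -> bool:
    """Idea 1: accept full sentences and odd capitalization.

    True if any accepted answer appears as a whole-word phrase in the reply.
    """
    padded = f" {normalize(reply)} "
    wanted = {normalize(a) for a in accepted} - {""}
    return any(f" {w} " in padded for w in wanted)


def make_permalink(posted: date, title: str) -> str:
    """Idea 2: date plus slug, so URLs can't be guessed by counting upward.

    The /YYYY/MM/DD/slug/ layout is a hypothetical choice.
    """
    slug = normalize(title).replace(" ", "-")
    return f"/{posted:%Y/%m/%d}/{slug}/"


# Idea 3: on a Markdown-only blog, a raw HTML anchor is a strong hint of a bot.
HTML_LINK = re.compile(r"<a\s[^>]*href", re.IGNORECASE)


def uses_html_links(body: str) -> bool:
    """Flag comments that try to link with HTML instead of Markdown."""
    return bool(HTML_LINK.search(body))
```

Matching whole words rather than substrings is a deliberate choice here: it lets "I think it's Washington, D.C.!" pass while keeping near misses out. A flagged comment from `uses_html_links` is probably better held for review than deleted outright, since a human who missed the markup guide could trip it too.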

The last idea seems the most elegant for a smaller blog for which a feedback form might make sense, though it has the potential to channel discussion away from your blog and into your inbox – certainly not a desirable outcome if you intend to improve the signal-to-noise ratio in your comment section. If this idea is implemented, you have to make sure that your blog targets an audience that is comfortable with making public comments in the first place.
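
The routing behind that last idea is simple enough to sketch. This is a minimal Python illustration under a few assumptions: the `make_public` field name is made up, and plain lists stand in for the published comment thread and for the mail that would really be sent to your inbox. The one detail that makes "No" a natural default is that browsers leave unchecked checkboxes out of the submitted form entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    body: str
    make_public: bool = False  # "No" by default


def comment_from_form(form: dict) -> Comment:
    """Build a Comment from submitted form fields (hypothetical field names).

    An unchecked checkbox is absent from the form, so a missing
    "make_public" key means the comment stays private.
    """
    return Comment(
        author=form.get("author", "Anonymous"),
        body=form.get("body", ""),
        make_public="make_public" in form,
    )


@dataclass
class Blog:
    published: list = field(default_factory=list)  # shown under the post
    inbox: list = field(default_factory=list)      # stand-in for sending mail

    def submit(self, comment: Comment) -> str:
        """Publish only when the visitor opted in; otherwise route to the inbox."""
        if comment.make_public:
            self.published.append(comment)
            return "published"
        self.inbox.append(comment)
        return "emailed"
```

A bot that fills in only the text fields never sets the checkbox, so its output lands in the inbox, where the mail filter deals with it.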

Of course, if your blog runs on a more managed system, such as Movable Type or WordPress, the existing comment moderation features render such an idea less useful. Instead, you might consider installing plugins that address the other ideas once spam becomes a problem.

Regardless of which idea suits your site best, it’s important to remember that the success of a lesser-known blog depends primarily on how original you are. There are so many millions of blogs out there that you need to consistently deliver an original style, if not original ideas and content. Expect all of that from this blog this summer.


  1. Hundreds of words of pointless drivel, seasoned generously with outbound hyperlinks. Thank the spambots.


  1. First of all, I hope my comment makes it through your rigorous filter... and second of all, I'm excited to see the supposedly unique ideas and content this summer :)