Posted January 14, 2007 5:27 pm by with 13 comments

Marshall Kirkpatrick’s post about the amount of spam getting thru his Gmail filter – 19 of 26 emails were spam – has a comment from blogging buddy, Jeremiah, that reminded me of an issue I’ve noticed with Akismet.

While Jeremiah’s finding Akismet to be reliable, I’m finding it to be less so. With the number of comments increasing on Marketing Pilgrim, I’m thankful for the spam filtered out by Akismet, but I’m having to spend too much time “de-spamming” false positives.

This was magnified with the Digging of a recent article, resulting in many comments. I found many legitimate comments appearing in Akismet – including some from people who had commented before – and I had to manually de-spam them, often the same person multiple times.

It’s become impossible to simply let Akismet do its thing and trust that it’s not falsely flagging comments as spam.

If anyone knows of a better solution, I’m all ears. In the meantime, here’s what Akismet needs (especially the WordPress plugin).

  • Manually enter trusted IPs or email addresses.
  • Include my own list of trigger words to catch spam that still gets past Akismet.
  • Sort and filter options for viewing flagged spam. Let me sort by email, name, IP etc, so I can find “friendly” comments a lot easier.
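
As a rough illustration of what such a local override layer could look like, here is a minimal Python sketch. Everything in it is hypothetical: the `Comment` fields, the `LocalRules` class, and the `remote_says_spam` flag (standing in for whatever verdict Akismet returns) are assumptions made for this example, not part of the Akismet plugin or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    # Hypothetical comment record; field names are illustrative only.
    author: str
    email: str
    ip: str
    body: str


@dataclass
class LocalRules:
    # Locally maintained lists; trusted_emails is expected in lowercase.
    trusted_ips: set = field(default_factory=set)
    trusted_emails: set = field(default_factory=set)
    trigger_words: set = field(default_factory=set)

    def is_spam(self, comment, remote_says_spam):
        """Combine local rules with the remote filter's verdict."""
        # 1. Trusted senders always pass, whatever the remote filter says.
        if comment.ip in self.trusted_ips:
            return False
        if comment.email.lower() in self.trusted_emails:
            return False
        # 2. Local trigger words catch spam the remote filter missed.
        body = comment.body.lower()
        if any(word in body for word in self.trigger_words):
            return True
        # 3. Otherwise defer to the remote verdict.
        return remote_says_spam


def sort_flagged(comments, key="email"):
    """Sort flagged comments by a text field such as 'email', 'author' or 'ip'."""
    return sorted(comments, key=lambda c: getattr(c, key).lower())
```

Checking the trusted lists first is a deliberate choice in this sketch: it means a known commenter is never held back, even when their comment happens to contain a trigger word.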

So while Marshall is having problems with Gmail – it’s doing fine for me, although 3 of the past 12 emails were spam – and Jeremiah is happy with Akismet, I’m certainly finding it to be straining at the seams.

Anyone else finding an increase in false positives, or more spam getting through? Any other features you’d like to see added? Perhaps you have a better solution for spam than Akismet – I’m all ears.

  • I’ve noticed the same trends with both Akismet and Gmail.

    I’ve heard that Spam Karma 2 does a good job with WordPress comments. If Akismet doesn’t start cleaning up its act soon, I am going to give it a shot.

  • I’ve been having the same issue with Akismet. I’ve had about 1 comment a day or every other day in there, which forces me to go through them. It’s better than getting a moderation email with each one, but not as reliable as it was a month or more ago.

    Gmail becomes more disappointing with junk mail. I have it combined with Mail (Mac OS X), which is a pretty strong combination, but I still get these stock alert emails and RSET or QUIT emails that always make it through.

    Thanks for the blog Andy!

  • I am very happy with SK2 – only one major problem when someone managed to get their IP banned by commenting so fast they resembled a bot.

    Most first time posters have no problem at all if commenting on new content. On older content they sometimes have to fill out the captcha.

    As for Gmail, it handled the Christmas spam session fairly well. The only annoying thing was when my contact form emails started appearing in the spam folder, though I have that cleared up.

  • rick gregory

    Then there’s me… GMail misses maybe 1 in 50 spams and rarely does false positives. I get about 50 pieces of spam a day in GMail – 1-3 make it through. Maybe one or 2 false positives a week. I can’t complain about that.

    The thing about GMail and Akismet is that you have to train them – and keep doing it. If you simply delete the spam or move the false positives back manually you’re not populating the filter’s database correctly.

    Now, if someone is doing this and still seeing a lot of false positives or missed spam, I’m a bit at a loss.

  • Rick:

    While that is true, Akismet uses a centralized database with a neural net of some sort. People would have to radically change their blacklisting/despamming behaviour ‘en masse’ in order to affect Akismet to that extent. Same case for Gmail, I believe.

  • rick gregory

    Hmm… true of course. The interesting question then is why GMail would show such radically different results for people (I can’t speak in detail to Akismet since I don’t blog). The question is whether they ONLY use a central database or whether filtering also factors in individual training, as do some belief network based systems. After all, while much of our email may look the same, I doubt my email looks much like that of a 14-year-old girl (I’m 48 and male).

  • I would think that Gmail uses a centralized database/neural net for spam. I think some of the individual differences in the amount of false positives coming through may just be a function of the relative amounts of spam coming into an account.

    For example, an account which normally gets 200 spam per day may get a single false positive per day while an account that gets 20 spam per day may only get a false positive every two weeks.

    Although your email may look different from that of a 14 year old girl, you still are, in all likelihood, receiving the same spam, which I am sure is tripping the same filters and getting the same handling.
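
To make the volume argument in this comment concrete, here is a small Python sketch. It assumes the filter makes mistakes at a constant per-message rate; the 1-in-200 figure is an illustrative assumption chosen to match the comment's first example, not a measured Gmail number.

```python
def days_between_errors(spam_per_day, error_rate):
    """Expected days between filter mistakes at a constant per-message error rate."""
    return 1 / (spam_per_day * error_rate)


# Assumed rate: one mistake per 200 spam messages (illustrative only).
rate = 1 / 200

for volume in (200, 20):
    days = days_between_errors(volume, rate)
    print(f"{volume} spam/day: one mistake about every {days:.0f} day(s)")
```

Under that assumption, the account receiving a tenth of the spam sees a tenth as many mistakes, so two people can report very different experiences from a filter that behaves identically for both.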

  • A few issues.

    First, obviously we need to go after the originators of spam – follow the money, as someone is getting rich or else it wouldn’t continue (lately, it’s gotten much worse).

    Second, speaking generally, while some people might actually leave legitimate comments on one site, at other times, they’re spamming other sites from the same machine. Hence, they get snagged by the spam filter.

    I had a client once who tried to run a small email campaign for his home business from his home computer. He was a Comcast cable subscriber, so any originating email from his personal machine quickly got his IP address blacklisted on SORBS and many others.

    A reality check for us to see how good we actually have it is to turn off Akismet for one day and watch the spam sludge ooze back in.

  • Hi Michael, I agree with your thoughts. I couldn’t operate without some kind of spam filter, and a person leaving a legitimate comment on my blog could be spamming dozens of others.

    Seeing as Akismet is a plugin, surely it could be modified to let me override what it thinks is spam, on a local level only.

  • A HUGE vote here for Spam Karma 2. I couldn’t tell you the last time I had a false positive or negative, and my blogs get thousands of spams every day.

  • SK2 is definitely the way to go if you are having problems with Akismet. It has a lot more control than I have found with Akismet.

    However, what I really suggest is that you set up an additional line of defence in the form of the math plugin for WordPress or a captcha plugin. It prevents a lot of bot-generated comments, so SK2 or Akismet don’t even need to process them.

    As for Gmail, it’s starting to irritate me because the comments I make on my blog are landing up in the spam bin!

  • I’ve noticed a lot of false positives since upgrading from WP 2.0.4 to 2.1.3 and moving to a new host last week. I’d say about 80% of my legit comments in the last several days have been flagged by Akismet as spam, even though most of them were left by frequent users who had never been flagged before. It could be a fluke from the entries I’ve written, or it could be that the upgraded Akismet is far more aggressive than it needs to be.

    I agree that Akismet desperately needs a “whitelist” function so you can skip checks on comments from specific IP addresses or any other set of criteria. It also needs to give you a “blacklist” threshold so that repeat offenders can be given a 403 error to shut them out, possibly in conjunction with Bad Behavior or http:BL.
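
The whitelist plus blacklist-threshold idea in this comment could be sketched roughly as follows in Python. The `RepeatOffenderGate` class and its method names are hypothetical, made up for illustration; this is not how Akismet, Bad Behavior or http:BL work internally.

```python
from collections import Counter


class RepeatOffenderGate:
    """Hypothetical gate: skip checks for whitelisted IPs, block repeat spammers."""

    def __init__(self, threshold=3, whitelist=()):
        self.threshold = threshold        # spam verdicts allowed before blocking
        self.whitelist = set(whitelist)   # IPs that are never counted or blocked
        self.strikes = Counter()          # spam verdicts seen per IP

    def record_spam(self, ip):
        """Call whenever a comment from this IP is judged to be spam."""
        if ip not in self.whitelist:
            self.strikes[ip] += 1

    def status_for(self, ip):
        """Return the HTTP status to answer with: 403 for repeat offenders, else 200."""
        if ip in self.whitelist:
            return 200
        return 403 if self.strikes[ip] >= self.threshold else 200
```

A real deployment would also need the strike counts to expire over time, since the earlier comments point out that IP addresses get shared and reassigned.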

  • Very informative about spam emails thru Gmail filter.

    However, email spamming in general is definitely a big dollar consumer (billions) and bandwidth eater. A free software, very simple to install at your server; edit spam words file and spam email addresses to delete spam emails; can be downloaded FREE from our website to delete spam emails at your server itself.

    Satya (a.k.a