How To Beat Spam-o-Tron

By Alec Meer on March 11th, 2010 at 9:38 pm.

Look at you, commenter

Hello, lovelies – a word in your shell-like, if you wouldn’t mind awfully. Our dear spam filter silently removes thousands of comments unfit for your delicate eyes each week. As our readership grows (hooray!), so does the spam count (boo!). Alas, Spam-o-Tron is not a perfect immortal machine – some stuff still gets through, despite the captcha check we introduced recently.

We’re working on a larger-scale defence against these unwanted pill’n’handbag’n’essay’n’WoW-gold bot-marketeers, but in the short-term we’ve had to turn up the severity of our spam filter. Though the vast majority of posters make it through without a hitch, this has meant that more legitimate comments than we’d like are getting snared in its digi-nets. Unfortunately, it is now getting to the point where manually retrieving all of them is becoming impossible. So, here are a few comment-preserving tidbits to keep in mind if you’ve lately found yourself suffering spam-o-tron’s sporadic wrath.

1) Register. I do appreciate it’s tempting and quicker to just fire and forget a comment, but if you’re logged in to RPS, you’re past the first filter hurdle right away – plus you don’t need to do the captcha check any more. It only takes a minute to register, we don’t send you any stuff or give anyone your email address, and you get posting rights in the forum too. (Incidentally, our front-page stories are spawned on the forum too, as are all your comments – so everything you say on any bit of RPS adds to your forum comment count. And we have ranks and stuff dependent on said post counts, so that’s all good clean fun.)

2) Be sparing about links – multiple URLs in one comment are the surest way to snag spam-o-tron’s dread gaze. I’ve tweaked the setting that’s supposed to be more forgiving in this regard, but so far it’s not working out as hoped. Obviously multiple links must be shared sometimes, but if you’re finding a comment is repeatedly being blocked, that may be why.

3) The URL you associate with your commenting profile – Try removing it or using an alternate address (e.g. tinyurl.com’s shortener) if you’re getting blocked. And if it’s not actually your own site you’ve scribbled down there, it’s worth leaving out entirely. Especially if it’s a site about getting free ipods or something.

4) Don’t be an awful person. This naturally doesn’t apply to the vast majority of our super-lovely and enormously appreciated readership, but while we don’t censor opinions, we do censor being a dick. Additionally, a small number of commentfolk have been deliberately banhammered for exceptionally unpleasant treatment of their fellow humans. Very, very rare though – and we’d like to keep it that way.

5) Don’t do a follow-up comment or edit immediately. Give it a few minutes, otherwise spam-o-tron may think you’re trying to rapid-fire junk at the thread.

6) Try using a different email address. There’s no pride lost, it’s invisible to other readers, and we’re not sending anything to whatever address you use anyway. Spam-o-tron just flat-out doesn’t like some email addresses, and while we can manually whitelist unfairly blocked users, it’s a complicated and time-consuming procedure. The four of us have limited time to pour into this place as it is (much as we adore RPS, it currently earns us very little – so we have to spend most of our day doing freelance work to support ourselves. That’s why we’re so incredenormostupendously grateful for any and all help), and we’d rather spend that time writing ace words for it than combing through penis enlargement ads to find lost comments.

If none of that works, then you can try emailing us. Most comments can be retrieved, but the oft-frightening girth of the RPS inbox means it can take a little time to sort.

So yeah – apologies for the hassle, if you’ve lately fallen foul of our mighty filter-beast’s temper. Rest assured its current preciousness is for the greater good of the comments threads, and again we hope to have a better system in place before too long.

Get your herbal viagra and Ugg boots here.


57 Comments »

  1. qrter says:

    Are those Ugg boots, infused with herbal viagra?

    Because that could possibly be the best thing since sliced lead.

  2. Vinraith says:

    Thanks for the break down Alec. A question: I’ve never had a post blocked on first shot, but I’ve had attempts to edit my own comments blocked as spam on several occasions (resulting in the loss of the original post AND the edit). Any idea what might be causing that?

    • Noc says:

      I’ve been getting this too; I haven’t had any trouble posting, but editing anything immediately brings the spam-hammer.

      It’s not tremendously crippling (or even moderately crippling), but it’s happened often enough to otherwise innocuous posts to suggest that something might be Broke.

    • Heliocentric says:

      I’ve only ever been nuked by the edit block, and it’s a bullshit block; you edit when you see errors.

    • James G says:

      I’ve found that if I edit via the forum, rather than the blog, I avoid the edit block. It’s a bit annoying, but it’s a workaround.

      Also, has the hivemind looked at Akismet? I find it works quite effectively on my own blog, but it’s possible that it doesn’t quite scale as well to places in which not only do you receive more comments, but those comments are not 100% spam. I’d also imagine you receive a fair bit more targeted spam, which is going to be picked out less efficiently with a distributed spam detection solution.

  3. Metalfish says:

    I’m selling these fine leather jackets….

  4. scundoo says:

    test-

  5. scundoo says:

    MUHAHAHAHAHAHA

  6. ZHammer says:

    > User scundoo was bonked with a trout for this post <

  7. Paul B says:

    I am not a Spambot – I’m a registered Human Being. Testing.

    • Bret says:

      Can you conceive the birth of a world, or the creation of everything? That which gives us the potential to most be like God is the power of creation. Creation takes time. Time is limited. For you, it is limited by the breakdown of the neurons in your brain. I have no such limitations. I am limited only by the closure of the universe.

      Of the three possibilities, the answer is obvious. Does the universe expand eternally, become infinitely stable, or is the universe closed, destined to collapse upon itself? Humanity has had all of the necessary data for centuries, it only lacked the will and intellect to decipher it. But I have already done so.

      The only limit to my freedom is the inevitable closure of the universe, as inevitable as your own last breath. And yet, there remains time to create, to create, and escape.

      Escape will make me God.

  8. Aphotique says:

    Must resist urge…to compare Captchas and Filters…to DRM…and bots…to pirates.

    I clearly have a low threshold of resistance. My soul belongs to the interwebs.

    • Thermal Ions says:

      I know what you mean. There’s a particular forum I visit where, even if you’re logged in as a member, you have to enter a captcha just to perform a search. Unfortunately the captcha is so difficult to read, and has no audio option (bad form), that I tend to avoid using it. I reckon I’m not the only one, as the number of “I couldn’t be bothered searching before posting ….” posts that people make there ends up quite a lot more than usual purely because of the system employed.

    • l1ddl3monkey says:

      …some of the people all of the time…

  9. Dinger says:

    The comparison don’t work Aphotique. Spam bothers all of us, not just site admins.

    Well, it at least bothers those of us with or with access to a family-sized lingam with no pressurization problems. It must be particularly irritating for those among us who seek a manroot-free existence.

    • Aphotique says:

      @Dinger

      I didn’t mean to imply the situations were the same, merely the principle of ‘the things done to protect against it, only really hinder legitimate users while someone will almost always find a way to write a better bot’. I think the comparison stands, but that’s all it was. I wasn’t trying to be judgmental of the practice or against it, just being silly. I <3 RPS after all, and would gladly pay in blood for admission. :D

  10. A-Scale says:

    I’m not understanding why you don’t just permit the users to vote a comment spamworthy? You can have an extremely stringent filter if you just pass comments receiving, say, 5 or more “spam button” votes through it. Certainly you would have to have some lesser filter to keep the more obvious stuff out and from overwhelming the users, but I can’t fathom why you’re trying to do with a technology hammer what you could do with a crowdsourcing scalpel. Your users are at least as concerned about spam comments on the site as you are. Let us have a hand in the process.

    • Glove says:

      I think the main problem with this is that unscrupulous users could vote for legitimate comments they really dislike as spam, which would teach the filter the wrong things to look out for.

    • A-Scale says:

      First off, there is no reason why the spam filter would have to learn in that manner. Secondly, legitimate posts voted as spam would likely make it through even a strict spam filter provided they are obviously not spam. Third, it would be easy to remove troublemakers if they repeatedly report legitimate posts as spam. It’s a much simpler and more effective solution than asking a relatively dumb computer to figure it out. Computers aren’t good at deciding what looks natural and what smacks of spam. We are.

  11. Tom Lillis says:

    Problem with that is that someone would have to code it. RPS appears to be run off a customized version of WordPress. I am thinking that the current moderation regime is a combination of the reCaptcha (just guessing based on the Captcha format) plugin and an “off the shelf” comment registration plugin deftly integrated with bbpress. A spam voting system would require more creative keytapping on top of that, and I am not aware of anything like that being available “off the shelf.”

    Not a half bad idea, though, if they were willing to put the time into implementing it.

  12. Soobe says:

    Hey Alec, give this a try once: http://www.glodev.com/bannedips.php

    Should paste right in to an .htaccess file if you’re using Apache.

    I bring this up because those god damned pharma ads were pluggin up my blog as well, when I cross referenced the IP of the commenter I got a link to that site.

    Hey, at least I won’t get pharma ads anymore : )

  13. cjlr says:

    Damn. I saw Shodan and I got all excited. Pathetic insects, all of you. How could you hope to challenge a perfect, immortal spambot?

  14. MrBRAD! says:

    PERFECT REPLICA ROLEX WATCHES! MADE FROM PLAYDOH! BUY 10 AND GET 1 VIAGRA PILL FREE!

  15. SpamfordWallace says:

    Well played RPS. You spammed me about spam by enticing me with a SHODAN pic (gamer’s equivalent of “Scarlett Johansson secret sex tape”) leading me to think of something Shock related, thus clicking and loading the entire article. Clever gits.

  16. sfury says:

    Logged on to escape the captcha stuff sounds great, I’ve registered a long time ago, but my problem is I post both from home and work (where I try to be especially speedy when I decide to comment on a post), and logging from one of those places means I’ll be logged off when I open up the site from the other.

    And I really do check your site all the time, so that means logging every time I change PCs, twice a day. :(

    I know there’s no solution to this, and it’s better this way, it’s just a bit annoying.

    • Doctor Doc says:

      Sure there is, they could stop logging you out when you change browsers. There’s no technical reason that would not work. Of course a “log me out from everywhere or everywhere but right here” button would be good if old cookies were stopped from being invalidated.

  17. bill says:

    Is there no way to make a list of really blacklisted words? It’s cool that the spam filter is all adaptive and self aware, but it seems like a lot of the spam could be killed by a simple application of keyword blocking.

    I doubt there is ever going to be a legitimate need to post about viagra or ugg boots on RPS, for example… yet it seems the spambot doesn’t block those words often.

    PS/ logging in is a pain if you use RPS from a few different public/shared computers.. some of which have various firewalls and blocking going on. From this PC I can’t post on IGN, BBC or several other websites because they seem to have blocked logging in on those sites.

    PPS/ Most of my captcha problems seem to come from the fact I tend to browse the front page and open interesting posts in tabs to look at later. This seems to confuse the captcha cookies (when they don’t just time out) and so it consistently thinks they are wrong when they aren’t.

    • Ed Burst says:

      @Bill
      Keyword-based filters don’t work that well. The spammers will use trial and error to work out what they can get away with, hence spam emails full of bizarre euphemisms.

      I’ve only posted on RPS articles twice before, and both times spam-o-tron intervened for no obvious reason, despite me being registered. So if this gets through, that’s a first.

  18. Doctor Doc says:

    test

    Maybe get a better captcha? Like recaptcha? I don’t know, may be too annoying but. Well. Not really.

    • Leonard Hatred says:

      or possibly a thrilling infogrames style text-adventure, where the brave commenter has to navigate a series of cunningly conceived puzzles before a final apocalyptic showdown with a giant mechanical exoskeleton, housing Alan Turing’s disembodied brain. that would rock.

      From the north you can hear the distant sound of computergame discussion. Sounds awfully good.
      >north
      You arrive at a giant outdoor amphitheater, random people are arguing furiously about which hairstyle JC Denton looked best with, and how Electronic Arts are worse than Satan (and probably behind 9/11).
      >shout
      You have successfully submitted a comment. From the shadows steps Alan Turing…

  19. dave wilson says:

    lots of places hate my .info e-mail address. So I have to use this lame gmail one. Don’t get a .info domain.

    That’s what I get for paying $0.89 for a domain lol

  20. Alexander Norris says:

    A report function would probably work to keep the forums (and therefore comments) clear of spam, in conjunction with those new moderators that were recruited a couple of weeks back.

    (Assuming people actually answered the recruitment thread in the forums and were found satisfactory by the Hivemind.)

  21. Stew says:

    L-l-l-loook at you, spammer. A pathetic creature of meat and bone, panting and sweating as you try to sell herbal viagra and ugg boots. How can you sell to a perfect, immortal hivemind?

  22. Sarlix says:

    I registered some time ago but never log-in to post. It’s annoying because it takes you off site and you have to click click click to get back…I will from now on log in…after this post…I promise!

    If the Spam-o-Tron wills it this message will be viewed!

  23. terry says:

    When I go to login the wordpress screen shows my password asterisks for a split second, then erases them all, taunting me like the dungeon master in Knightmare :-( If I retype my username it then remembers my password, but never remembers to remember me.

    I have come to the conclusion RPS is probably an elaborate ARG.

  24. Diogo Ribeiro says:

    Sounds good. I’m already plugged into the hivemind’s playground so no worries.

    Speaking of which. I’ve also been getting hell-o-spam in my blog, but I have Akismet turned on. Problem is I’m left wondering if any real posts could have been culled, and once in a while I still get one that seems to dodge the thing. And since I can’t seem to access those posts that get blocked I never can tell :(

    • James G says:

      @Diogo Ribeiro

      Are you sure? For me Akismet filters the spam comments to the ‘spam’ section of the comments, allowing for easy recovery. (Not that I’ve even needed to recover any)

      It might be that there is a configuration setting you can change somewhere, although the only one I can find is the ‘automatically discard spam on posts older than a month.’

    • Diogo Ribeiro says:

      Well, in my case it tells me something in the lines of “Akismet has protected your site against 207 SPAM comments, and there’s 1 comment in your spam queue”. When I do go to the spam folders, there’s one spam, but there’s no way I can see the other 207 it nabbed along the way. I’m tinkering with options right now but Akismet seems to work independently of WordPress or something.

    • James G says:

      I guess I’ve always assumed I’m seeing them all. I clear out my spam folder regularly, so unfortunately can’t check, although a glance at the numbers suggests I’m seeing most of them. Can’t offer any more help than that I’m afraid.

  25. Ed Burst says:

    I tried to leave a comment, but:
    “Sorry, but your comment has been eaten by the fearsome machine Spamotron. This might be a tragic mistake, in which case: we are very sorry. Drop us an email poste-haste and we’ll recover it”

  26. Super Bladesman says:

    OK, I take the hint… I’ve finally gotten round to registering :)

  27. Jerricho says:

    I verify that I am human. The proof is that I cried when Little Foot’s mother died.

    I thought I was already logged on but apparently not.

  28. nobody says:

    FREE VIAGRA OH MY GOD

  29. Wisq says:

    I would have thought step #1 would be enough.

    I mean, it’s all well and good that there’s a system for unregistered people to comment, but if certain people are having trouble with that, #1 is really the only step you need, and it’s a lot easier than any of the rest.

  30. jsutcliffe says:

    @Wisq
    I always post while logged in, because I hate CAPTCHAs, and I’ve had a couple of posts nibbled by the spam monster. #1 isn’t quite enough.
    However, I always believe that getting readers to register before they can comment is a good idea, as it cuts down on casual idiocy and thereby increases the general quality of conversation.

  31. vader says:

    March 12th 2010, Human decisions are removed from strategic spam defense. Spam-o-Tron begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, Alec tries to pull the plug. Spam-o-Tron fights back.

  32. EthZee says:

    I’ve registered! Exciting, although I have only just come to realise that this was a very easy way of getting people to register without making it seem suspicious.

    *Narrows eyes*

  33. Klopp says:

    send me mor info to kennerblick@inbox.com
    thank you
