Spammers, the never-ending war
A while ago, we enabled reCAPTCHA on account creation to reduce our moderate spam levels.
It worked fairly well.
Last year, reCAPTCHA was completely broken.
Spam levels began ramping up.
Last fall they were at "annoying" levels. This month numbers spiked sharply.
Number of spammer accounts blocked, by month. Each spammer may have been responsible for hundreds of posts in some cases, though usually it was just a handful before they were spotted.
+---------+----------+
| Month   | Spammers |
+---------+----------+
| 2011-05 | 9        |
| 2011-06 | 13       |
| 2011-07 | 20       |
| 2011-08 | 19       |
| 2011-09 | 20       |
| 2011-10 | 13       |
| 2011-11 | 24       |
| 2011-12 | 21       |
| 2012-01 | 13       |
| 2012-02 | 22       |
| 2012-03 | 35       |
| 2012-04 | 27       |
| 2012-05 | 45       |
| 2012-06 | 29       |
| 2012-07 | 38       |
| 2012-08 | 48       |
| 2012-09 | 41       |
| 2012-10 | 60       |
| 2012-11 | 174      |
| 2012-12 | 104      |
| 2013-01 | 132      |
| 2013-02 | 107      |
| 2013-03 | 186      |
| 2013-04 | 1300     |
+---------+----------+
Half of all spam accounts, ever, were created in the last month.
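For anyone who wants to check the "half of all spam accounts" figure, here is a minimal sketch in Python that sums the monthly counts listed above. The variable names are just illustrative, and it assumes the monthly list covers every spam account ever blocked.

```python
# Monthly counts of blocked spammer accounts, copied from the list above
# (2011-05 through 2013-04). Assumption: this list is the complete history.
monthly_blocked = [
    9, 13, 20, 19, 20, 13, 24, 21,                      # 2011-05 .. 2011-12
    13, 22, 35, 27, 45, 29, 38, 48, 41, 60, 174, 104,   # 2012-01 .. 2012-12
    132, 107, 186, 1300,                                # 2013-01 .. 2013-04
]

total = sum(monthly_blocked)
last_month = monthly_blocked[-1]
share = last_month / total

print(f"total blocked: {total}")
print(f"last month: {last_month} ({share:.0%} of all time)")
```

With these numbers the last month comes out at a bit over half of the all-time total.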
It was becoming exhausting and distracting dealing with all the spam, even w/ a little automation added in to speed cleanup.
About two hours ago, I switched back to an ordinary CAPTCHA from reCAPTCHA. My hope is that humans will still be able to figure it out, but spammers won't be able to. Or at least, won't be interested in putting in the effort on hedgewars.org that they would on more valuable prizes.
Since the switch, one new spam account was created.
By comparison, in the two hours before the switch, 8 definite spam accounts were created, plus 2 possible ones that might spam later.
I'm going to try tweaking the CAPTCHA settings based on spam discovered, making it harder or easier as needed.
Apologies in advance to anyone who finds the new CAPTCHA rough for account creation.
If anyone is having any difficulty, feel free to stop by live chat and let us know.
Hopefully the forums will be a bit less noisy from now on, once the backlog of accounts the spammers managed to create dies down, and assuming the new CAPTCHA holds up.
A bit more in stats.
Top 10 spam domains of all time
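+-------------------+---------------+
| Domain            | Spam accounts |
+-------------------+---------------+
| gmail.com         | 632           |
| hotmail.com       | 327           |
| yahoo.com         | 187           |
| yahoo.co.uk       | 56            |
| marketeepoint.com | 46            |
| massdazzle.com    | 44            |
| emailhulk.com     | 43            |
| expertous.com     | 43            |
| emailbeetle.com   | 41            |
| drytor.com        | 41            |
+-------------------+---------------+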
Top 10 spam domains in past 30 days
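+-------------------+---------------+
| Domain            | Spam accounts |
+-------------------+---------------+
| gmail.com         | 157           |
| hotmail.com       | 53            |
| marketeepoint.com | 44            |
| expertous.com     | 42            |
| massdazzle.com    | 42            |
| puffify.com       | 40            |
| emailhulk.com     | 39            |
| emailbeetle.com   | 39            |
| drytor.com        | 39            |
| downpulse.com     | 38            |
+-------------------+---------------+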
Domains with a high percentage of spammy users. This does not include many domains which appeared to be used exclusively for spam.
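+-------------+--------+---------------+-------------+
| Domain      | Spam % | Spam accounts | Total users |
+-------------+--------+---------------+-------------+
| 126.com     | 20.5%  | 8             | 39          |
| aol.co.uk   | 23.1%  | 3             | 13          |
| gmx.com     | 25.6%  | 11            | 43          |
| outlook.com | 31.3%  | 5             | 16          |
| qq.com      | 32.7%  | 34            | 104         |
| yahoo.co.uk | 35.0%  | 56            | 160         |
| yahoo.cn    | 55.6%  | 10            | 18          |
+-------------+--------+---------------+-------------+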
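A rough sketch of how a list like this could be produced, in Python. It assumes the two numbers after each percentage are the spam accounts and the total users for that domain, which fits the percentages (8 / 39 is about 20.5%). The 20% cutoff and all names here are hypothetical and not taken from the actual forum tooling.

```python
# (spam accounts, total users) per domain, copied from the list above.
# Assumption: that is what the two trailing numbers mean.
domain_counts = {
    "126.com": (8, 39),
    "aol.co.uk": (3, 13),
    "gmx.com": (11, 43),
    "outlook.com": (5, 16),
    "qq.com": (34, 104),
    "yahoo.co.uk": (56, 160),
    "yahoo.cn": (10, 18),
}

# Hypothetical cutoff for calling a domain "spammy".
THRESHOLD = 0.20


def spam_fraction(spam, total):
    """Fraction of a domain's users that were spam accounts."""
    return spam / total if total else 0.0


# Domains above the cutoff, least spammy first.
flagged = sorted(
    (d for d, (s, t) in domain_counts.items() if spam_fraction(s, t) > THRESHOLD),
    key=lambda d: spam_fraction(*domain_counts[d]),
)

for domain in flagged:
    spam, total = domain_counts[domain]
    print(f"{domain:<12} {spam_fraction(spam, total):6.1%} {spam:>4} {total:>5}")
```

Domains used exclusively for spam would need separate handling, since they were left out of the list.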
Domains with more than 50 users and no spam
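+----------------+-------+
| Domain         | Users |
+----------------+-------+
| web.de         | 623   |
| hotmail.fr     | 551   |
| gmx.de         | 424   |
| hotmail.co.uk  | 418   |
| hotmail.it     | 324   |
| hotmail.de     | 254   |
| yahoo.de       | 179   |
| gmx.net        | 139   |
| live.fr        | 121   |
| goood-mail.com | 113   |
| yahoo.fr       | 100   |
| comcast.net    | 98    |
| hotmail.es     | 92    |
| live.it        | 90    |
| aim.com        | 78    |
| live.de        | 78    |
| live.co.uk     | 74    |
| gmx.at         | 71    |
| libero.it      | 68    |
| seznam.cz      | 59    |
| live.nl        | 58    |
| yahoo.co.id    | 58    |
| yahoo.com.br   | 58    |
| yahoo.it       | 58    |
| live.se        | 57    |
| yahoo.com.tw   | 55    |
| bk.ru          | 53    |
| freenet.de     | 53    |
| free.fr        | 52    |
| ya.ru          | 52    |
+----------------+-------+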
--
Oh, what the heck. 1PLXzL1CBUD1kdEWqMrwNUfGrGiirV1WpH <= tip a hedgewars dev
New spam accounts created, each day.
Lemme fix that table for you nemo...
+-------+----------+
| April | Spammers |
+-------+----------+
| 1 | 17 |
| 2 | 42 |
| 3 | 89 |
| 4 | 76 |
| 5 | 74 |
| 6 | 60 |
| 7 | 67 |
| 8 | 92 |
| 9 | 77 |
| 10 | 62 |
| 11 | 67 |
| 12 | 58 |
| 13 | 29 |
| 14 | 33 |
| 15 | 47 |
| 16 | 54 |
| 17 | 38 |
| 18 | 60 |
| 19 | 56 |
| 20 | 54 |
| 21 | 55 |
| 22 | 55 |
| 23 | 53 |
| 24 | 18 |
| 25 | 8 |
| 26 | 1 |
| 27 | 1 |
+-------+----------+
Spammers post a repeated message in the forums and get banned. Noob posts a message that would make some spambots weep in multiple threads and is allowed his continued right to post garbage when he wants. The way I see it the spambots have as much a right to be here as Noob does.
Proud member of Death's Angels for 3 billion years.
Funny, I read your message just after I got sufficiently irritated at that last post of his to set his account to blocked.
Good decision to drop reCAPTCHA. I hate reCAPTCHA. Yeah, basically this is because I hate all CAPTCHAs. But I hate reCAPTCHA even more because it belongs to Google and Google spies on its users.
You may not have known that, but with reCAPTCHA, you are (probably unwillingly, too) helping to digitize books. “What’s so bad about digitizing books?”, I hear you ask. The answer is that YOU (yes, YOU!) helped digitize a huge part of the New York Times archive, but that archive is only partly public. They charge for access to a significant part of it.
See also:
So yeah, thanks for dropping reCAPTCHA, although you did it for another reason. :)
Hi, I am a Hedgewars developer. :)
Ah. Actually, I was a fan of the book digitisation before Google even bought reCAPTCHA.
I wasn't really opposed to it even after. Since CAPTCHAs are basically a waste of human CPU power, at least it was being used for something.
Still not terribly bothered by that something even if it is for a private archive. Better than going to waste.
But, spying, that does play more w/ me. That *is* Google's stock in trade, from its browser to its search to its font and JavaScript hosting.
So. Quite likely it is using reCAPTCHA for tracking purposes.
I hadn't thought of that.
Interesting, because if you're helping digitise books, then how does it know you're right? :/
{} {}
\___/ - Happy
http://en.wikipedia.org/wiki/Recaptcha#Operation