Spammers, the never-ending war

8 replies
nemo
Joined: 2009-01-28
Posts: 1861

A while ago, we enabled reCAPTCHA on account creation to reduce our moderate spam levels.
It worked fairly well.

Last year, reCAPTCHA was completely broken.
Spam levels began ramping up.
Last fall they were at "annoying" levels. This month the numbers spiked sharply.
Number of spammer accounts blocked, by month. Some spammers were responsible for hundreds of posts, though usually they managed just a handful before they were spotted.

| 2011-05 |        9 |
| 2011-06 |       13 |
| 2011-07 |       20 |
| 2011-08 |       19 |
| 2011-09 |       20 |
| 2011-10 |       13 |
| 2011-11 |       24 |
| 2011-12 |       21 |
| 2012-01 |       13 |
| 2012-02 |       22 |
| 2012-03 |       35 |
| 2012-04 |       27 |
| 2012-05 |       45 |
| 2012-06 |       29 |
| 2012-07 |       38 |
| 2012-08 |       48 |
| 2012-09 |       41 |
| 2012-10 |       60 |
| 2012-11 |      174 |
| 2012-12 |      104 |
| 2013-01 |      132 |
| 2013-02 |      107 |
| 2013-03 |      186 |
| 2013-04 |     1300 |

Half of all spam accounts, ever, were created in the last month.
Dealing with all the spam was becoming exhausting and distracting, even with a little automation added in to speed cleanup.
About two hours ago, I switched back from reCAPTCHA to an ordinary CAPTCHA. My hope is that humans will still be able to figure it out, but spammers won't. Or at least, that they won't be interested in putting in the effort on hedgewars.org that they would for more valuable prizes.
Since the switch, one new spam account was created.
By comparison, in the two hours before the switch, 8 definite spam accounts were created, plus 2 possible ones that might spam later.
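
The "half of all spam accounts" figure can be checked against the monthly table. A minimal sketch, assuming the table (which starts at 2011-05) covers every blocked spammer account; the function name share_of_total is made up for illustration:

```python
# Monthly counts of blocked spammer accounts, copied from the table above.
# Assumption: the table is the complete history of blocked spammers.
monthly_blocked = {
    "2011-05": 9, "2011-06": 13, "2011-07": 20, "2011-08": 19,
    "2011-09": 20, "2011-10": 13, "2011-11": 24, "2011-12": 21,
    "2012-01": 13, "2012-02": 22, "2012-03": 35, "2012-04": 27,
    "2012-05": 45, "2012-06": 29, "2012-07": 38, "2012-08": 48,
    "2012-09": 41, "2012-10": 60, "2012-11": 174, "2012-12": 104,
    "2013-01": 132, "2013-02": 107, "2013-03": 186, "2013-04": 1300,
}


def share_of_total(counts, month):
    """Fraction of all blocked accounts that fall in the given month."""
    return counts[month] / sum(counts.values())


total = sum(monthly_blocked.values())
april_share = share_of_total(monthly_blocked, "2013-04")
print(f"total blocked: {total}, April 2013 share: {april_share:.0%}")
```

With these numbers the last month accounts for slightly more than half of the total.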

I'm going to try tweaking the CAPTCHA settings based on spam discovered, making it harder or easier as needed.

Apologies in advance to anyone who finds the new CAPTCHA rough for account creation.
If anyone is having any difficulty, feel free to stop by live chat and let us know.

Hopefully the forums will be a bit less noisy from now on, once the backlog of accounts the spammers managed to create dies down, and assuming the new CAPTCHA holds up.

A few more stats.
Top 10 spam domains of all time

gmail.com               632
hotmail.com             327
yahoo.com               187
yahoo.co.uk             56
marketeepoint.com       46
massdazzle.com          44
emailhulk.com           43
expertous.com           43
emailbeetle.com         41
drytor.com              41

Top 10 spam domains in past 30 days

gmail.com               157
hotmail.com             53
marketeepoint.com       44
expertous.com           42
massdazzle.com          42
puffify.com             40
emailhulk.com           39
emailbeetle.com         39
drytor.com              39
downpulse.com           38

Domains with a high percentage of spammy users. Columns: domain, percentage of accounts that were spam, spam accounts, total accounts. This does not include many domains which appeared to be used exclusively for spam.

126.com                 20.5%   8       39
aol.co.uk               23.1%   3       13
gmx.com                 25.6%   11      43
outlook.com             31.3%   5       16
qq.com                  32.7%   34      104
yahoo.co.uk             35.0%   56      160
yahoo.cn                55.6%   10      18
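
A rough sketch of how a table like the one above could be produced. The input shape (pairs of email address and spam flag), the function name spam_rate_by_domain, and the min_users cutoff are all assumptions for illustration, not the query actually used on hedgewars.org. The sample addresses are synthetic and only reproduce the qq.com and yahoo.cn counts from the table.

```python
from collections import Counter


def spam_rate_by_domain(accounts, min_users=10):
    """Return (domain, spam percent, spam count, total count) rows.

    accounts: iterable of (email, is_spam) pairs (hypothetical shape).
    Domains with fewer than min_users accounts are skipped.
    """
    total = Counter()
    spam = Counter()
    for email, is_spam in accounts:
        # Everything after the last "@" is treated as the domain.
        domain = email.rsplit("@", 1)[-1].lower()
        total[domain] += 1
        if is_spam:
            spam[domain] += 1
    rows = [
        (domain, 100.0 * spam[domain] / n, spam[domain], n)
        for domain, n in total.items()
        if n >= min_users
    ]
    # Lowest spam percentage first, like the table above.
    return sorted(rows, key=lambda row: row[1])


# Synthetic accounts matching two rows of the table: 34 of 104 and 10 of 18.
accounts = (
    [(f"s{i}@qq.com", True) for i in range(34)]
    + [(f"u{i}@qq.com", False) for i in range(70)]
    + [(f"s{i}@yahoo.cn", True) for i in range(10)]
    + [(f"u{i}@yahoo.cn", False) for i in range(8)]
)

rows = spam_rate_by_domain(accounts)
for domain, pct, spam_count, user_count in rows:
    print(f"{domain} {pct:.1f}% {spam_count} {user_count}")
```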

Domains with more than 50 users and no spam

web.de                  623
hotmail.fr              551
gmx.de                  424
hotmail.co.uk           418
hotmail.it              324
hotmail.de              254
yahoo.de                179
gmx.net                 139
live.fr                 121
goood-mail.com          113
yahoo.fr                100
comcast.net             98
hotmail.es              92
live.it                 90
aim.com                 78
live.de                 78
live.co.uk              74
gmx.at                  71
libero.it               68
seznam.cz               59
live.nl                 58
yahoo.co.id             58
yahoo.com.br            58
yahoo.it                58
live.se                 57
yahoo.com.tw            55
bk.ru                   53
freenet.de              53
free.fr                 52
ya.ru                   52

--
Oh, what the heck. 1PLXzL1CBUD1kdEWqMrwNUfGrGiirV1WpH <= tip a hedgewars dev

nemo

New spam accounts created, each day.

+-------+----------+
| April | Spammers |
+-------+----------+
|     1 |       17 |
|     2 |       42 |
|     3 |       89 |
|     4 |       76 |
|     5 |       74 |
|     6 |       60 |
|     7 |       67 |
|     8 |       92 |
|     9 |       77 |
|    10 |       62 |
|    11 |       67 |
|    12 |       58 |
|    13 |       29 |
|    14 |       33 |
|    15 |       47 |
|    16 |       54 |
|    17 |       38 |
|    18 |       60 |
|    19 |       56 |
|    20 |       54 |
|    21 |       55 |
|    22 |       55 |
|    23 |       53 |
|    24 |       18 |
|    25 |        8 |
|    26 |        1 |
|    27 |        0 |
|    28 |        0 |
|    29 |        0 |
|    30 |        0 |
+-------+----------+
+-------+----------+
|  May  | Spammers |
+-------+----------+
|     1 |        0 |
|     2 |        0 |
|     3 |        0 |
|     4 |        0 |
|     5 |        0 |
|     6 |        0 |
|     7 |        1 |
|     8 |        0 |
|     9 |        0 |
|    10 |        0 |
|    11 |        0 |
|    12 |        0 |
|    13 |        0 |
|    14 |        0 |
|    15 |        0 |
|    16 |        1 |
|    17 |        0 |
+-------+----------+
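
To put a number on the drop, the daily averages before and after the CAPTCHA change can be compared. This is only a sketch: the exact switch date is not stated in the thread, so it assumes 1-23 April is "before", 27 April onward is "after", and the 24-26 April tail is left out as a transition period.

```python
# Daily counts of new spam accounts, copied from the two tables above.
april = [17, 42, 89, 76, 74, 60, 67, 92, 77, 62,
         67, 58, 29, 33, 47, 54, 38, 60, 56, 54,
         55, 55, 53, 18, 8, 1, 0, 0, 0, 0]
may = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

# Assumption: the switch took effect somewhere in the 24-26 April window.
before = april[:23]        # 1-23 April
after = april[26:] + may   # 27 April to 17 May


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


print(f"{mean(before):.1f} per day before, {mean(after):.2f} per day after")
```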


glum reaper
Joined: 2011-03-24
Posts: 56

Noob.com_123-321 allegedly wrote:

Polymascotfoamalate
Feed it to the babies
Polymascotfoamalate
Or as a topping on soured cream!
(Repeat approximately 27 more times)

Lemme fix that table for you nemo...

+-------+----------+
| April | Spammers |
+-------+----------+
|     1 |       17 |
|     2 |       42 |
|     3 |       89 |
|     4 |       76 |
|     5 |       74 |
|     6 |       60 |
|     7 |       67 |
|     8 |       92 |
|     9 |       77 |
|    10 |       62 |
|    11 |       67 |
|    12 |       58 |
|    13 |       29 |
|    14 |       33 |
|    15 |       47 |
|    16 |       54 |
|    17 |       38 |
|    18 |       60 |
|    19 |       56 |
|    20 |       54 |
|    21 |       55 |
|    22 |       55 |
|    23 |       53 |
|    24 |       18 |
|    25 |        8 |
|    26 |        1 |
|    27 |        1 |
+-------+----------+

Spammers post a repeated message in the forums and get banned. Noob posts a message in multiple threads that would make some spambots weep, and is allowed his continued right to post garbage whenever he wants. The way I see it, the spambots have as much right to be here as Noob does.

Proud member of Death's Angels for 3 billion years.

nemo

Funny, I read your message just after I got sufficiently irritated at that last post of his to set his account to blocked.


Wuzzy
Joined: 2012-06-20
Posts: 1271

Good decision to drop reCAPTCHA. I hate reCAPTCHA. Yeah, basically this is because I hate all CAPTCHAs. But I hate reCAPTCHA even more because it belongs to Google and Google spies on its users.

You may not have known this, but with reCAPTCHA you are (probably unwillingly, too) helping to digitize books. “What’s so bad about digitizing books?”, I hear you ask. The answer is that YOU (yes, YOU!) helped digitize a huge part of the New York Times archive, but that archive is only partly public. They charge for access to a significant part of it.

See also:

So yeah, thanks for dropping reCAPTCHA, although you did it for another reason. :)

Hi, I am a Hedgewars developer. :)

nemo
nemo's picture

Ah. Actually, I was a fan of the book digitisation before Google even bought reCAPTCHA.
I wasn't really opposed to it even after. Since CAPTCHAs are basically a waste of human CPU power, at least it was being used for something.
Still not terribly bothered by that something, even if it is for a private archive. Better than going to waste.

But spying, that does weigh more with me. That *is* Google's stock in trade, from its browser to its search to its font and JavaScript hosting.

So. Quite likely it is using reCAPTCHA for tracking purposes.

I hadn't thought of that.


oranebeast
Joined: 2013-06-10
Posts: 104

Interesting, because if you're helping digitise books, then how does it know you're right? :/

{} {}
\___/ - Happy

nemo

http://en.wikipedia.org/wiki/Recaptcha#Operation


Copyright © 2004-2021 Hedgewars Project. All rights reserved.