
Spam from multiple user agents, same IP


I moderate a forum that gets a lot of spam posts, and I can't quite figure out how they are getting in.

(1) The spammer seems to be getting through the CAPTCHA.

(2) I have logged the following user agents, all from the same IP (a Charter/Spectrum address, so I can't block the ASN):

  [
  {
    "userAgent": "Nokia7250/1.0 (3.14) Profile/MIDP-1.0 Configuration/CLDC-1.0"
  },
  {
    "userAgent": "Mozilla/5.0 (PlayBook; U; RIM Tablet OS 2.1.0; en-US) AppleWebKit/536.2+ (KHTML like Gecko) Version/7.2.1.0 Safari/536.2+"
  },
  {
    "userAgent": "P3P Validator"
  },
  {
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:40.0) Gecko/20100101 Firefox/40.0"
  },
  {
    "userAgent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0"
  },
  {
    "userAgent": "Bloglines/3.1 (http://www.bloglines.com)"
  },
  {
    "userAgent": "SonyEricssonK810i/R1KG Browser/NetFront/3.3 Profile/MIDP-2.0 Configuration/CLDC-1.1"
  },
  {
    "userAgent": "SonyEricssonT610/R201 Profile/MIDP-1.0 Configuration/CLDC-1.0"
  },
  {
    "userAgent": "Mozilla/5.0 (Linux; U; Android 1.5; de-de; Galaxy Build/CUPCAKE) AppleWebKit/528.5  (KHTML, like Gecko) Version/3.1.2 Mobile Safari/525.20.1"
  },
  {
    "userAgent": "Baiduspider ( http://www.baidu.com/search/spider.htm)"
  },
  {
    "userAgent": "Mozilla/5.0 (Windows NT 6.2; ARM; Trident/7.0; Touch; rv:11.0; WPDesktop; NOKIA; Lumia 920) like Geckoo"
  },
  {
    "userAgent": "Mozilla/5.0 (Linux; U; Android 1.5; de-de; Galaxy Build/CUPCAKE) AppleWebKit/528.5  (KHTML, like Gecko) Version/3.1.2 Mobile Safari/525.20.1"
  },
  {
    "userAgent": "SEC-SGHX210/1.0 UP.Link/6.3.1.13.0"
  },
  {
    "userAgent": "SEC-SGHX210/1.0 UP.Link/6.3.1.13.0"
  },
  {
    "userAgent": "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/534.14 (KHTML, like Gecko) Chrome/9.0.601.0 Safari/534.14"
  },
  {
    "userAgent": "Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko"
  },
  {
    "userAgent": "Gaisbot/3.0 ([email protected]; http://gais.cs.ccu.edu.tw/robot.php)"
  },
  {
    "userAgent": "Mozilla/5.0 (Maemo; Linux armv7l; rv:10.0.1) Gecko/20100101 Firefox/10.0.1 Fennec/10.0.1"
  },
  {
    "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:47.0) Gecko/20100101 Firefox/47.0"
  },
  {
    "userAgent": "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_3 like Mac OS X; de-de) AppleWebKit/533.17.9 (KHTML, like Gecko) Mobile/8F190"
  }
  ]

This is just one example, but the pattern is common: multiple UAs from the same IP over a period of time, and the IP is almost always tied to a common consumer ISP. Any thoughts on this?
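One way to surface this pattern from the logs is to count distinct user agents per IP within a time window. The following is only a minimal sketch in Python: the `(timestamp, ip, user_agent)` tuple shape, the function name `flag_rotating_ips`, and the thresholds are all assumptions for illustration, not the forum's actual log format.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_rotating_ips(log_entries, window=timedelta(hours=24), max_agents=3):
    """Return {ip: peak_count} for IPs that showed more than `max_agents`
    distinct user agents within any sliding `window` of time.

    `log_entries` is an iterable of (timestamp, ip, user_agent) tuples.
    This shape is an assumption made for the example.
    """
    by_ip = defaultdict(list)
    for ts, ip, ua in log_entries:
        by_ip[ip].append((ts, ua))

    flagged = {}
    for ip, hits in by_ip.items():
        hits.sort(key=lambda h: h[0])
        start = 0                    # index of the oldest hit still in the window
        counts = defaultdict(int)    # user agent -> occurrences inside the window
        peak = 0
        for ts, ua in hits:
            counts[ua] += 1
            # Drop hits that have fallen out of the window.
            while ts - hits[start][0] > window:
                old_ua = hits[start][1]
                counts[old_ua] -= 1
                if counts[old_ua] == 0:
                    del counts[old_ua]
                start += 1
            peak = max(peak, len(counts))
        if peak > max_agents:
            flagged[ip] = peak
    return flagged


# Example with made-up data (203.0.113.0/24 is a documentation range).
t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    (t0, "203.0.113.7", "P3P Validator"),
    (t0 + timedelta(minutes=5), "203.0.113.7", "SEC-SGHX210/1.0 UP.Link/6.3.1.13.0"),
    (t0 + timedelta(minutes=9), "203.0.113.7", "Bloglines/3.1 (http://www.bloglines.com)"),
    (t0 + timedelta(minutes=20), "203.0.113.7", "Nokia7250/1.0 (3.14) Profile/MIDP-1.0 Configuration/CLDC-1.0"),
    (t0, "203.0.113.8", "Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko"),
]
suspicious = flag_rotating_ips(sample)
```

A shared household or office IP can legitimately show a few different browsers, so the threshold would need tuning against real traffic before acting on it.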

Let me know if you can think of something else I could log that would also be useful. Thanks!


Solution

  • But then why wouldn't the spammer just use something super common like the current Safari iPhone UA?

    Spammers usually use dedicated tools such as XRumer, which can automatically rotate the user agent, register email accounts, solve CAPTCHAs, and so on.

    Anti-spam effectiveness comes down to resource cost. A moderator should need only a few seconds to remove a spam post, while the spammer should have to spend a few minutes to do his dirty work.
    Therefore, it is necessary to deprive the spammer of the ability to automate the process.

    1. Use a serious CAPTCHA: reCAPTCHA, hCaptcha, etc.

    2. Disable posting without registration.

    3. Block registration from disposable email services such as mailforspam.com.

    4. If you are dealing with a bot rather than a person, two form tricks can help. First, add invisible fields to the registration form: a person will not fill them in, but a bot will see them in the HTML code and fill them in.
      Second, replace the Submit button with an equivalent image. <input type='submit' value='Post'> stays in the HTML code, but it does not submit the form. The form is submitted by clicking on the image, which the person sees but the robot does not.
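The checks in items 3 and 4 could be sketched server-side roughly as follows. This is a minimal Python illustration, not tied to any particular forum engine: the honeypot field name `website`, the `email` field name, the function name `reject_reason`, and the one-entry blocklist are all hypothetical placeholders.

```python
# Hypothetical names: adjust to whatever the forum engine actually submits.
HONEYPOT_FIELD = "website"                 # invisible field humans never see
DISPOSABLE_DOMAINS = {"mailforspam.com"}   # extend with a maintained list


def reject_reason(form):
    """Return a reason string if the registration looks automated, else None.

    `form` is a plain dict mapping submitted field names to string values.
    """
    # A person never sees the hidden field, so any value in it suggests a bot.
    if form.get(HONEYPOT_FIELD, "").strip():
        return "honeypot field was filled in"

    email = form.get("email", "").strip().lower()
    domain = email.rpartition("@")[2]
    if "@" not in email or not domain:
        return "missing or malformed email"
    if domain in DISPOSABLE_DOMAINS:
        return "disposable email domain"
    return None


# Example calls with made-up submissions.
ok = reject_reason({"email": "alice@example.org", "website": ""})
bot = reject_reason({"email": "alice@example.org", "website": "http://spam.example"})
throwaway = reject_reason({"email": "bot@MailForSpam.com"})
```

For the honeypot to work, the field would typically be hidden with CSS rather than `type="hidden"`, on the assumption that simple bots fill in every ordinary-looking input they find.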

    There are many such tricks, but which ones you can use depends on the capabilities of the forum engine.
    To begin with, it would help to determine whether the spam is coming from a person or a robot.