How to handle bad crawlers, bad scripts, spam IP addresses etc.


Andreas2003

Posted

Hi there,

 

I've noticed that my site sometimes gets hits from

- bad crawlers

- bad scripts

- spam IP addresses

 

and now I'd like to know from you experts: how do you handle this?

 

For example, regarding bad scripts and spam IP addresses: I have a support ticket system on my site, and I sometimes receive tickets from a fake email address where someone has pasted code into the form, trying to use the form as a mail relay.

 

As another example, I had hits from IP addresses where the referer (the page they claim to come from) seems strange. I checked one IP address and it was listed by www.sorbs.net.

 

I don't know whether these hits are a problem or not. Perhaps you have some answers for me?

 

If it is a problem, how do you handle it?

 

Thanks in advance,

kind regards

Andreas

Posted

You can block IPs easily enough from your site; put this in your .htaccess file:

 

order allow,deny

deny from 1.2.3.4

deny from 123.45.67.

allow from all

 

Put each IP you want to block on a separate line.
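
For what it's worth, on Apache 2.4 and later the Order/Deny syntax above is only available through mod_access_compat, and the same block is normally written with Require directives. A minimal sketch, assuming Apache 2.4 with mod_authz_core and mod_authz_host loaded (the addresses are placeholders from the documentation ranges, not real offenders):

```apache
# Apache 2.4+ sketch: allow everyone except the listed addresses.
<RequireAll>
    Require all granted
    # Single address (placeholder)
    Require not ip 192.0.2.10
    # Partial address: blocks the whole 198.51.100.x range (placeholder)
    Require not ip 198.51.100
</RequireAll>
```

As with the older syntax, each blocked address or range goes on its own line.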

See my profile for our main OSCommerce site. My views are not necessarily the views of my employer.

Posted

I already know how to use .htaccess for that.

In this case it's not practical, because the hits came from several IPs.

 

Any idea what the rewrite rule should look like?

Posted

You can use a CAPTCHA image system, but that takes work and also puts people off filling in forms.

 

It's impossible to stop them using up bandwidth, but in extreme cases what I've done is rename the form and use a redirect:

 

Redirect /links_submit.php domain/This_button_is_there_to_fool_spammers_please_use_our_contact_page.html
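
A well-formed version of that redirect might look like the sketch below. The host name www.example.com and the target page name are placeholders, not a real site; mod_alias expects the target to be a full URL (newer 2.4 releases also accept a path starting with a slash).

```apache
# Send anything requesting the old form script to a plain contact page.
Redirect 302 /links_submit.php https://www.example.com/contact.html

# Alternative: tell clients the old script is gone for good (HTTP 410).
# Redirect gone /links_submit.php
```

Use one or the other for a given path, not both.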

 

Here's an example of an .htaccess file that can stop a fair few of them. If you use it, test it out and keep an eye on your stats.

 

<Files 403.shtml>

order allow,deny

allow from all

</Files>

 

 

RewriteEngine On

RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Extreme\ Picture\ Finder [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^JoBo [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^MSIECrawler [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^ninja [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^SiteCopy [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^SiteSucker [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^teleport [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebBandit [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebCopier [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Webdup [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebReaper [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebSnake [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebStripper [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebMiner [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^x-Tractor [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^mister\ pix [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^PICgrabber [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^psbot [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2.0\ \(compatible;\ NEWT\ ActiveX;\ Win32\) [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebCollector [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebPix [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^EmailMagnet [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^EmailReaper [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtractor [NC,OR]

RewriteCond %{REMOTE_ADDR} ^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$ [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^NPBot [NC,OR]

RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [NC,OR]

RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot [NC,OR]

RewriteCond %{REMOTE_ADDR} ^64\.140\.49\.6([6-9])$ [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^ClariaBot [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^Diamond [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^[a-z]+$ [NC,OR]

# User-Agents with no privileges (mostly spambots/spybots/offline downloaders that ignore robots.txt)

# see http://diveintomark.org/archives/2003/02/2...s_to_go_to_hell

RewriteCond %{REMOTE_ADDR} ^220\.181\.33\.225 [OR] #rude bot

RewriteCond %{REMOTE_ADDR} ^60\.28\.252\.77 [OR] #rude bot

RewriteCond %{REMOTE_ADDR} ^69\.31\.1\.154 [OR] #rude bot

RewriteCond %{REMOTE_ADDR} ^24\.86\.103\.176 [OR] #spammer

RewriteCond %{REMOTE_ADDR} ^81\.95\.146\.162 [OR] #spammer

RewriteCond %{REMOTE_ADDR} ^193\.252\.177\.186 [OR] #spammer

RewriteCond %{REMOTE_ADDR} "^63\.148\.99\.2(2[4-9]|[3-4][0-9]|5[0-5])$" [OR] # Cyveillance spybot

RewriteCond %{REMOTE_ADDR} ^12\.148\.196\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])$ [OR] # NameProtect spybot

RewriteCond %{REMOTE_ADDR} ^12\.148\.209\.(19[2-9]|2[0-4][0-9]|25[0-5])$ [OR] # NameProtect spybot

RewriteCond %{REMOTE_ADDR} ^64\.140\.49\.6([6-9])$ [OR] # Turnitin spybot

RewriteCond %{HTTP_REFERER} iaea\.org [OR] # spambot

RewriteCond %{HTTP_REFERER} neopets\.com [OR] # referrer spam

RewriteCond %{HTTP_REFERER} spampoison\.com [OR] # looks exactly like a spambot

RewriteCond %{HTTP_REFERER} riaa\.com [OR] # some bot

RewriteCond %{HTTP_REFERER} cxa\.de [OR] # porn site

RewriteCond %{HTTP_REFERER} filthserver\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} wastedpartygirls\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} amateurxpass\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} mature--young\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} bloglisting\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} nudecelebblogs\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} sexrabbit\.de [OR] # porn site

RewriteCond %{HTTP_REFERER} busty2\.com [OR] # porn site

RewriteCond %{HTTP_REFERER} adult-models\.biz [OR] # porn site

RewriteCond %{HTTP_REFERER} freenudecelebrity\.net [OR] # porn site

RewriteCond %{HTTP_REFERER} limolimo\.net [OR] # don't know

RewriteCond %{HTTP_REFERER} shatteredreality\.net [OR] # spammer site

RewriteCond %{HTTP_USER_AGENT} ^[A-Z]+$ [OR] # spambot

RewriteCond %{HTTP_USER_AGENT} anarchie [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} cherry.?picker [NC,OR] # spambot

RewriteCond %{HTTP_USER_AGENT} "compatible ; MSIE 6.0" [OR] # spambot (note extra space before semicolon)

RewriteCond %{HTTP_USER_AGENT} crescent [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} "^DA \d\.\d+" [OR] # OD

RewriteCond %{HTTP_USER_AGENT} "DTS Agent" [OR] # OD

RewriteCond %{HTTP_USER_AGENT} "^Download" [OR] # OD

RewriteCond %{HTTP_USER_AGENT} EasyDL/\d\.\d+ [OR] # OD

RewriteCond %{HTTP_USER_AGENT} e?mail.?(collector|magnet|reaper|siphon|sweeper|harvest|collect|wolf) [NC,OR] # spambot

RewriteCond %{HTTP_USER_AGENT} express [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} extractor [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} "Fetch API Request" [OR] # OD

RewriteCond %{HTTP_USER_AGENT} flashget [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} FlickBot [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} FrontPage [OR] # stupid user trying to edit my site

RewriteCond %{HTTP_USER_AGENT} getright [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} go.?zilla [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} "efp@gmx\.net" [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} grabber [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} imagefetch [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} httrack [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} "Indy Library" [OR] # spambot

RewriteCond %{HTTP_USER_AGENT} "^Internet Explore" [OR] # spambot

RewriteCond %{HTTP_USER_AGENT} ^IE\ \d\.\d\ Compatible.*Browser$ [OR] # spambot

RewriteCond %{HTTP_USER_AGENT} "LINKS ARoMATIZED" [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control" [OR] # spambot

RewriteCond %{HTTP_USER_AGENT} "mister pix" [NC,OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0$" [OR] # dumb bot

RewriteCond %{HTTP_USER_AGENT} "^Mozilla/\?\?$" [OR] # formmail attacker

RewriteCond %{HTTP_USER_AGENT} MSIECrawler [OR] # IE's "make available offline" mode

RewriteCond %{HTTP_USER_AGENT} ^NG [OR] # unknown bot

RewriteCond %{HTTP_USER_AGENT} offline [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} net.?(ants|mechanic|spider|vampire|zip) [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} nicerspro [NC,OR] # spambot

RewriteCond %{HTTP_USER_AGENT} ninja [NC,OR] # Download Ninja OD

RewriteCond %{HTTP_USER_AGENT} NPBot [OR] # NameProtect spybot

RewriteCond %{HTTP_USER_AGENT} PersonaPilot [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} snagger [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} Sqworm [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} SurveyBot [OR] # rude bot

RewriteCond %{HTTP_USER_AGENT} tele(port|soft) [NC,OR] # OD

RewriteCond %{HTTP_USER_AGENT} TurnitinBot [OR] # Turnitin spybot

RewriteCond %{HTTP_USER_AGENT} web.?(auto|bandit|collector|copier|devil|downloader|fetch|hook|mole|miner|mirror|reaper|sauger|sucker|site|snake|stripper|weasel|zip) [NC,OR] # ODs

RewriteCond %{HTTP_USER_AGENT} vayala [OR] # dumb bot, doesn't know how to follow links, generates lots of 404s

RewriteCond %{HTTP_USER_AGENT} zeus [NC,OR]

# Below are filtered requests (mostly virus and other security holes sniffers)

RewriteCond %{REQUEST_URI} formmail [NC,OR]

RewriteCond %{REQUEST_URI} _vti_bin [NC,OR]

RewriteCond %{REQUEST_URI} MSOffice [OR]

RewriteCond %{REQUEST_URI} mail.?(pl|cgi) [NC]

RewriteRule ^.* - [F,L]
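
If the hits come from several IPs, the structure matters more than the length of the list. A cut-down sketch of the same idea follows; the address range and referrer are made-up placeholders (documentation ranges and a reserved domain), and it assumes mod_rewrite is enabled. Every condition except the last carries [OR]. A condition without it ends the OR group and anything after it is ANDed, so a missing flag in the middle of a long list can quietly stop the rule from firing.

```apache
RewriteEngine On
# Several user agents in one condition via alternation
RewriteCond %{HTTP_USER_AGENT} ^(HTTrack|WebZIP|SiteSucker) [NC,OR]
# Placeholder range: 192.0.2.16 to 192.0.2.31
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.(1[6-9]|2[0-9]|3[01])$ [OR]
# Last condition in the chain: no [OR]
RewriteCond %{HTTP_REFERER} spam-referrer\.example [NC]
# Answer 403 Forbidden for any request that matched a condition above
RewriteRule ^ - [F]
```

Start with a short list like this, check that normal browsers still get through, and then add entries as they show up in your logs.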

-----------------------------------------------------------------------------

OSC user for years and no coder, so I've earned my stripes.

 

Feel free to private message me.
