rs2k Posted October 15, 2008 I checked the server logs and found a robot that has indexed about 8,000 products and categories in the last few days and is still going at it. We have a huge robots.txt file, but this one is not in it. The problem is that this robot has a session. If that is a problem, can osCommerce be modified to stop accepting that session ID? Maybe create a new random session ID if that session ID is seen again? This bot has been downloading a page every 30 seconds for about 3 days now. Should this bot be blocked altogether?
FIMBLE Posted October 15, 2008 What is its name? libwww-perl by any chance? Sometimes you're the dog and sometimes the lamp post
rs2k Posted October 15, 2008 (quoting FIMBLE: "What is its name? libwww-perl by any chance?") Just an IP: 91.205.124.10. The whois info lists GIGABASE-NET, but I can't find any info on them.
FIMBLE Posted October 15, 2008 "GIGABASE Ltd. is a private company specializing on research and development of internet services that need to deal with large databases quickly. Company's staff doesn't exceed 6 co-workers. The company was registered on August 2008 and is financed by it's foundators solely. Yanga - the search engine - is our main development. Yanga is multilingual and searches information all over the world. Advertisements are placed through it's own context advertising service. Search engine index value is over 12 billions of pages and can be expanded simply by increasing servers' amount. Currently we are working on improvements in the quality of service and search." Source: http://www.google.co.uk/search?rlz=1C1CHMI...;q=GigaBase+Ltd
Guest Posted October 15, 2008 Set 'prevent spider sessions' to true in the backend of your site in Admin > Sessions.
FIMBLE Posted October 15, 2008 Yes, that's good advice :-)
rs2k Posted October 16, 2008 (quoting FIMBLE: "Yes, that's good advice :-)") Thank you, but I already have that set to true. This spider just wasn't in our spiders.txt. I said robots.txt in the original post, but I meant spiders.txt.
Guest Posted October 16, 2008 (quoting rs2k: "Thank you, but I already have that set to true. This spider just wasn't in our spiders.txt. I said robots.txt in the original post, but I meant spiders.txt.") Add yanga to the spiders.txt file.
rs2k Posted October 16, 2008 (quoting Guest: "Add yanga to the spiders.txt file.") I did, thanks.
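For anyone reading this thread later, here is a rough Python sketch of the idea behind the fix: when 'prevent spider sessions' is on, a visitor whose User-Agent contains any entry from spiders.txt does not get a session. This is only an illustration, not osCommerce's actual code (osCommerce is written in PHP). It assumes one token per line and case-insensitive substring matching, and the function names, list contents, and User-Agent strings below are all made up for the example.

```python
def load_spider_tokens(lines):
    """Return lowercase, non-empty tokens, one per line of a spiders.txt-style list."""
    return [line.strip().lower() for line in lines if line.strip()]


def is_spider(user_agent, tokens):
    """True if any token occurs as a substring of the lowercased User-Agent."""
    ua = (user_agent or "").lower()
    return bool(ua) and any(token in ua for token in tokens)


def should_start_session(user_agent, tokens, prevent_spider_sessions=True):
    """Skip session creation for recognised spiders when the setting is on."""
    return not (prevent_spider_sessions and is_spider(user_agent, tokens))


# Hypothetical list contents and User-Agent strings, for illustration only.
tokens = load_spider_tokens(["googlebot\n", "slurp\n", "yanga\n"])
bot_ua = "Yanga Bot/1.0 (made-up example)"
browser_ua = "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/3.0"

print(should_start_session(bot_ua, tokens))      # False
print(should_start_session(browser_ua, tokens))  # True
```

Under these assumptions, a bot that is missing from the list is treated like any other visitor and receives a session ID, which matches what rs2k saw until yanga was added.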
This topic is now archived and is closed to further replies.