Prevent Spider Sessions Not Working


YSC

Posted

For some reason, the setting that prevents known spiders from creating sessions is not working. I have it set to false in the control panel. Here is the text from my spiders.txt file:

$Id: spiders.txt,v 1.2 2003/05/05 17:58:17 dgw_ Exp $
almaden.ibm.com
appie 1.1
architext
ask jeeves
asterias2.0
augurfind
baiduspider
bannana_bot
bdcindexer
crawler
crawler@fast
docomo
fast-webcrawler
fluffy the spider
frooglebot
geobot
googlebot
gulliver
henrythemiragorobot
ia_archiver
infoseek
kit_fireball
lachesis
lycos_spider
mantraagent
mercator
moget/1.0
muscatferret
nationaldirectory-webspider
naverrobot
ncsa beta
netresearchserver
ng/1.0
osis-project
polybot
pompos
scooter
seventwentyfour
sidewinder
sleek spider
slurp/si
slurp@inktomi.com
steeler/1.3
szukacz
t-h-u-n-d-e-r-s-t-o-n-e
teoma
turnitinbot
ultraseek
vagabondo
voilabot
w3c_validator
Yahooseeker
YahooSeeker/1.1
zao/0
zyborg/1.0

I am using osCommerce MS2 on this site. Come to think of it, I don't think it has ever worked correctly. Any help or direction would be greatly appreciated.

Best,
Rob

Posted

I forgot to ask whether anyone could tell me where to set session_block_spiders manually. I checked a number of files but can't seem to locate it.

Thanks again,
Rob

Posted

That one worked okay, but for some reason other ones that I use do not. For example:

http://www.gritechnologies.com/tools/spider.go
Xenu Link Sleuth tool

among others.

However, I went back and turned off the prevent known spiders setting, and the tool that you sent me to did show session IDs, so the module must be working. Perhaps I just need to update my spiders.txt file? I know that when MS2 was first released, Yahoo was still getting its results from Google and Inktomi. Does anyone have an updated spiders.txt they would like to share? Maybe we should post the updated definitions as a contribution. Thoughts?

Posted

I believe that if you add "Poodle predictor 1.0" to your spiders.txt, the session IDs will disappear.

P.S. My spiders.txt is about the same as yours. It should work for most of the popular spiders.

Posted

I couldn't get it to block the Poodle Predictor; however, I am more satisfied that it is working than when I started this post. Is there a way to see how a spider identifies itself? Anyway, thank you for all your help in this matter. I will have to wait until my pages are indexed by the engines to see if it is working.

Posted

Have you noticed that the URLs of pages in your shop end with something similar to: osCsid=5e086ff3e5e98c4a48fe3463ad2f4fd0

This is a session ID that osCommerce uses to track customers, for example which products they have added to the cart.

You don't need this for spiders. There is a whole lot of info on this forum about spiders. It is worthwhile to read if you want your site to be well indexed by the search engines.
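To check quickly whether a given URL still carries the session ID described above, a minimal Python sketch like this may help. The function name `has_session_id` and the example.com URLs are made up for illustration; only the `osCsid` parameter name comes from the thread.

```python
from urllib.parse import urlsplit, parse_qs


def has_session_id(url, param="osCsid"):
    """Return True if the URL's query string contains the session parameter."""
    query = urlsplit(url).query
    # keep_blank_values=True so an empty "osCsid=" is still detected
    return param in parse_qs(query, keep_blank_values=True)


# Hypothetical URLs, for illustration only
print(has_session_id("http://example.com/index.php?cPath=1&osCsid=5e086ff3e5e98c4a48fe3463ad2f4fd0"))  # True
print(has_session_id("http://example.com/index.php?cPath=1"))  # False
```

Running this over the links a spider simulator reports is one way to see whether sessions are being withheld from it.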

Posted

Just a quick question on the spiders: I have mine set to true, but the text file lists a bunch of them. Do I delete them? I had to set it to false for a while, and I just reset it to true today.

Posted

No, this is a list of known spiders. There are a whole bunch of spiders out there, but these are the most common ones. If a spider visits your site and its identification is in this list, the session ID is left out.

If you happen to come across a spider that is not in the list you can add this spider to the list.
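To illustrate the lookup described above, here is a minimal Python sketch (not the store's own PHP code). It assumes the check lowercases the visitor's User-Agent and then looks for each spiders.txt line as a plain, case-sensitive substring. That is an assumption about how the store behaves, and the function names are made up for illustration.

```python
def load_spiders(text):
    """Split spiders.txt content into trimmed, non-empty entries."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def is_known_spider(user_agent, spiders):
    """Assumed rule: lowercase the User-Agent, then do a plain substring test."""
    agent = user_agent.lower()
    return any(entry in agent for entry in spiders)


spiders = load_spiders("googlebot\nslurp/si\nYahooseeker\n")

print(is_known_spider("Googlebot/2.1 (+http://www.google.com/bot.html)", spiders))  # True
print(is_known_spider("YahooSeeker/1.1", spiders))  # False: the entry has a capital "Y"
```

If the real check works this way, an entry containing capital letters can never match a lowercased User-Agent, so new entries are probably safest added in lowercase.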

For example, I recently added the Dutch Wiseguys spider to my spiders.txt.

Look at your log files and search for googlebot, for example. You will be able to see how it identifies itself. This name should also be found in spiders.txt.
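The log search suggested above can be sketched in Python as follows. This assumes the web server writes the common "combined" access log format, where the User-Agent is the last quoted field on each line. The function name and the sample log line are hypothetical.

```python
import re
from collections import Counter

# In the "combined" log format, the User-Agent is the last quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def user_agents_matching(log_lines, needle):
    """Count the distinct User-Agent strings that contain `needle` (case-insensitive)."""
    needle = needle.lower()
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if match and needle in match.group(1).lower():
            counts[match.group(1)] += 1
    return counts


# Hypothetical log line, for illustration only
sample = [
    '192.0.2.1 - - [01/Jan/2004:00:00:00 +0000] "GET /index.php HTTP/1.0" 200 1234 "-" '
    '"Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
for agent, hits in user_agents_matching(sample, "googlebot").items():
    print(hits, agent)
```

The strings it prints show exactly how each spider identifies itself, which is what a spiders.txt entry would need to match.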

Posted

I just added Poodle predictor 1.0 to my list, but it still gets the session ID.

I'm curious as to why and/or how.

Anyone know?

Posted

Curious, indeed you are right. I have done this with a couple of other spider simulators and they all worked fine. Wonder why this one doesn't.

Posted

My spiders.txt file contains a lot of entries for search engine crawlers. Everything is working fine except Inktomi. Whenever I view the Who's Online page, I see that Inktomi is indexing my site, but with session IDs. How can I overcome this session ID problem with Inktomi (Yahoo Slurp)? I think I have the proper entry for this search engine in my spiders.txt file.
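One thing that may be worth checking here, as a guess rather than a diagnosis, is whether any spiders.txt entry actually occurs in the User-Agent the crawler sends. The Python sketch below assumes a lowercase substring comparison. The User-Agent shown is the form I believe Yahoo's crawler has used, so compare it against your own logs. The function name is made up for illustration.

```python
def matching_entries(user_agent, entries):
    """Return the entries that occur as substrings of the lowercased User-Agent."""
    agent = user_agent.lower()
    return [entry for entry in entries if entry and entry in agent]


# Assumed User-Agent string; verify against your own access log.
yahoo_agent = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

print(matching_entries(yahoo_agent, ["slurp/si", "slurp"]))  # ['slurp']
```

Under these assumptions the stock entry `slurp/si` would not match this string, while a shorter entry such as `slurp` would.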

Posted

Please help me solve this problem. I have set the safe URLs for search engines option to true. The site is being indexed properly by Google, MSN and other crawler-based search engines, and only Yahoo shows session IDs. I have run a search engine simulator, and it shows the URLs without session IDs, but the actual Yahoo results show session IDs. Please help me.

Archived

This topic is now archived and is closed to further replies.
