osCommerce


osCsid's for msnbot


tlelliott77

Posted

MSNBot has been crawling my site regularly of late. Even though I have added msnbot to my spiders.txt file, it continues to append osCsid's to the end of the page URLs. It seems to use the same osCsid on every visit, as far back as I can go in my user tracking.

Has this happened to anyone else?

 

I thought maybe it was happening because it was following links it made itself before I added it to the spiders.txt. If this is the case, will it correct itself in the end and remove the sids or just continue to spider the site with this sid?

 

Anyone know how I can fix this?

 

Thanks

Tim

  • 2 weeks later...
Posted

I have been having the same problem. I put msnbot in my spiders.txt and checked this morning; it's still pulling osCsid's. I just put in msnbot/0.11 and I'll see if that fixes it. Looking in my logs today, the tag is "msnbot/0.11" and not "msnbot", so I'm not sure how specific the spiders.txt entry has to be.

Posted

I've had the same problem with msnbot.

 

I've added it to spiders.txt and still get the same result. An entry in spiders.txt only has to have the first few letters of the spider's name, and it SHOULD catch every user agent that contains that text. However, MSN and Inktomi both seem to get osCsid's MOST of the time.
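For anyone unsure how specific an entry needs to be, here is a minimal Python sketch of the substring matching described above. It is not osCommerce's own code (that is PHP); the function name `is_spider`, the entry list, and the sample user-agent strings are all illustrative assumptions. It lowercases both sides, so under that assumption "msnbot" would already cover "msnbot/0.11".

```python
def is_spider(user_agent, spider_entries):
    """Return True if any non-empty entry occurs anywhere in the user agent.

    Assumption: matching is a plain case-insensitive substring test,
    as described in the post above. Not taken from osCommerce itself.
    """
    agent = user_agent.lower()
    for entry in spider_entries:
        needle = entry.strip().lower()
        if needle and needle in agent:
            return True
    return False


# Hypothetical spiders.txt contents, one fragment per line.
SPIDER_ENTRIES = ["msnbot", "slurp", "googlebot"]

# Sample user-agent strings, for illustration only.
MSN_AGENT = "msnbot/0.11 (+http://search.msn.com/msnbot.htm)"
BROWSER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

print(is_spider(MSN_AGENT, SPIDER_ENTRIES))      # spider: no session wanted
print(is_spider(BROWSER_AGENT, SPIDER_ENTRIES))  # ordinary browser
```

If the matching really works this way, a more specific entry such as "msnbot/0.11" would only narrow the match, and would stop matching as soon as the bot changed its version number.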

 

If you find a solution, please pass it along.

 

Best Regards - John

Posted

On my site msnbot is still occasionally using an osCsid that it had previously used. I guess it is revisiting the same links that it previously created.

I'm also seeing Yahoo Slurp (Inktomi) doing the same thing.

 

I'm hoping this will stop in the end but if anyone has any tried and tested ways of getting rid of these osCsid's I'd appreciate hearing.

 

Thanks

Tim

Posted

I see the same with msnbot and Yahoo, but the msn problem seems to have disappeared. Eventually I guess the Yahoo problem will too.

 

I do find it helpful to put in robots.txt disallow clauses for pages I never want spiders to visit, such as shopping cart, login, my account, etc.

Posted

Guess I wrote too soon. msnbot is well-behaved, but Yahoo Slurp keeps collecting SIDs. I can't figure it out. I have "slurp" in spiders.txt, and a test shows that it seems to correctly pick it up and not issue a session, yet Yahoo Slurp continues to rack up sessions - or so it seems, anyway...

 

I did have one customer place an order starting with a link from Yahoo that included an SID. I deleted that session so that it couldn't be reused, and a few others Yahoo had created, but I'd like to find a way to have such sessions deleted automatically. I looked at the "check user agent", but that redirects to the login page and is probably not what I want in the long run. But for now I've enabled recording the user agent in the sessions and will monitor to see if I am getting new Yahoo sessions.
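As a rough illustration of the automatic cleanup I have in mind, here is a Python sketch that drops stored sessions whose recorded user agent looks like a spider. Everything in it is hypothetical: the `purge_spider_sessions` name, the dict of session id to user agent, and the sample ids are stand-ins, not osCommerce's session storage or API.

```python
def purge_spider_sessions(sessions, spider_entries):
    """Split sessions into those to keep and the ids to delete.

    sessions: dict mapping a session id to its recorded user agent
              (a hypothetical stand-in for the real sessions table).
    spider_entries: lowercase name fragments, as in spiders.txt.
    """
    needles = [e.strip().lower() for e in spider_entries if e.strip()]
    kept = {}
    removed = []
    for sid, agent in sessions.items():
        if any(needle in agent.lower() for needle in needles):
            removed.append(sid)   # session was created by a spider
        else:
            kept[sid] = agent     # ordinary visitor, leave it alone
    return kept, removed


# Made-up session ids and user agents, for illustration only.
SAMPLE_SESSIONS = {
    "aaa111": "Mozilla/5.0 (compatible; Yahoo! Slurp)",
    "bbb222": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "ccc333": "msnbot/0.11 (+http://search.msn.com/msnbot.htm)",
}

kept, removed = purge_spider_sessions(SAMPLE_SESSIONS, ["slurp", "msnbot"])
print(sorted(removed))
print(sorted(kept))
```

This only works once the user agent is actually being recorded with each session, which is why I switched that on first.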

Posted

User-agent: *
Disallow: /shopping_cart.php
Disallow: /advanced_search.php
Disallow: /login.php
Disallow: /checkout_shipping.php
Disallow: /account.php
Disallow: /create_account.php
Disallow: /password_forgotten.php
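One way to sanity-check rules like these is Python's standard `urllib.robotparser` module. The sketch below feeds it the rules from this post and asks whether a given path may be fetched; the `allowed` helper, the "abc123" session id, and the test paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, as a single robots.txt record.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shopping_cart.php
Disallow: /advanced_search.php
Disallow: /login.php
Disallow: /checkout_shipping.php
Disallow: /account.php
Disallow: /create_account.php
Disallow: /password_forgotten.php
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="msnbot"):
    """Return True if this robots.txt lets the agent fetch the path."""
    return parser.can_fetch(agent, path)


# "abc123" is a made-up session id, for illustration only.
print(allowed("/shopping_cart.php?osCsid=abc123"))  # blocked by prefix match
print(allowed("/index.php?osCsid=abc123"))          # not covered by any rule
```

Note what this does and does not show: the Disallow lines keep a well-behaved spider off the listed pages even when an osCsid is attached, but they do nothing about an osCsid on pages that remain crawlable.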

  • 3 weeks later...
Posted

I too have msnbot and Yahoo! Slurp listed in my spiders.txt, in lower case, but Yahoo is still getting SIDs for my site. MSN was OK for a while, but it is again showing the SIDs in the URL. No other search engine is showing this behavior. Any idea how to prevent SIDs for just these two, msnbot and Yahoo! Slurp?

I have also put a proper robots.txt file on my site.

