

yahoo spider getting sessionid


Posted

You know,

I get Yahoo! showing up at my site with osC IDs that it picked up during the 3 months I ran the store without "spiders.txt" enabled. I thought it would drop them, but Yahoo! has kept visiting with those osC IDs for the last 9 months. I know for a fact that no session is created when Yahoo! visits now: a modified Who's Online shows that the "yahoo" and "slurp" entries prevent that from happening.

But I can't figure out how to tell Yahoo! to stop using the URLs with osC IDs that it archived almost a year ago. Does anyone have an idea?

Is it simply a matter of writing down all the Yahoo! visits to URLs with osC IDs and adding something in application_top that "breaks" the PHP at the beginning (if the requested URL has one of those osC IDs), so that Yahoo! sees that the page (with the unwanted osC ID) is no longer available? I mean, the page is available of course, but I do not want it indexed with an osC ID.
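A rough sketch of that early check, written in Python only to show the logic (in the store it would be PHP near the top of application_top). It assumes the session parameter is named `osCsid`, uses only the "yahoo" and "slurp" tokens mentioned above, and picks 410 Gone as the "no longer available" answer. The function names and the choice of 410 over 404 are illustrative assumptions, not stock osCommerce behavior.

```python
from urllib.parse import urlsplit, parse_qs

SESSION_PARAM = "osCsid"            # assumed session parameter name
SPIDER_TOKENS = ("yahoo", "slurp")  # tokens from the post; not a complete list


def is_spider(user_agent):
    """True if any spider token appears in the lowercased user agent."""
    ua = user_agent.lower()
    return any(token in ua for token in SPIDER_TOKENS)


def status_for_request(url, user_agent):
    """Return the HTTP status the hypothetical early check would send."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if SESSION_PARAM in query and is_spider(user_agent):
        return 410  # Gone: this URL variant should not stay in the index
    return 200      # everything else is served normally
```

Checking for any session ID on a spider request avoids having to write down each archived ID by hand. Whether Yahoo! actually drops a listing after such a response is not something this sketch can show.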

Anyone have an idea how to get rid of Yahoo! listings containing osC IDs? Thanks,

BD

Posted

This has been a thorn in my side too. Yahoo started using a new spider called "seeker", which was not in the standard spiders.txt, and it picked up a bunch of pages with session IDs. Unfortunately, once it has the session ID, it's hard to get it to drop it. You can call tep_session_destroy, which will kill the session for that visit, but it will get recreated on a future visit.

Yahoo also has no effective support channel. Ideally, you'd like to give a 301 "permanent" redirect to Yahoo, but from other evidence, it ignores that too.

Posted

Do you have a User Agent for "seeker"? Wouldn't it still have "yahoo" somewhere in its string (given that "yahoo" was included in the original "spiders.txt" file)?
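To make the question concrete, here is a minimal sketch of substring matching against a spiders.txt-style list, in Python for illustration only. It assumes the file holds one token per line and that a match means the token occurs anywhere in the lowercased user agent. The helper names and the sample user-agent string are made up for the example, not taken from a real log.

```python
def load_spider_tokens(text):
    """Parse spiders.txt-style content: one token per line, blanks ignored."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def matches_spider(user_agent, tokens):
    """True if any token is a substring of the lowercased user agent."""
    ua = user_agent.lower()
    return any(token in ua for token in tokens)


# Hypothetical file content and user agent, for illustration only.
tokens = load_spider_tokens("googlebot\nslurp\nyahoo\n")
print(matches_spider("ExampleYahooSeeker/1.0 (compatible)", tokens))
```

Under those assumptions, any agent string containing "yahoo" would be caught by a "yahoo" entry. Whether the file in use really had that entry is the open question here.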

I did read about "seeker" appearing some time in Feb. '04, but I haven't had it on my site (that I have seen, anyway). Also, it looks like something may be done about the 301 soon... maybe:

http://www.oscommerce.com/forums/index.php?showtopic=118219

Join in!

BD

Posted

My Updated spiders.txt contribution detects Yahoo Seeker and many other spiders not detected by the stock spiders.txt. I update it as I discover new spiders.

Yes, I think that the only way to send the 301 to these spiders is to do it with a rewrite rule. I'll watch that thread to see how it goes. For my store, it's only Yahoo that has kept the SIDs.
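For reference, the URL transformation such a redirect would perform can be sketched as follows. This is Python for illustration, not rewrite-rule syntax, and it assumes the session parameter is named `osCsid`. The function name `redirect_for` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def redirect_for(url, param="osCsid"):
    """Return (301, clean_url) if url carries the session parameter, else None."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key != param]
    if len(kept) == len(pairs):
        return None  # no session ID present, so nothing to redirect
    clean_url = urlunsplit(parts._replace(query=urlencode(kept)))
    return 301, clean_url


print(redirect_for("http://example.com/product_info.php?products_id=5&osCsid=abc123"))
```

The other query parameters are kept in their original order, so the redirect target is the same page minus the session ID. As noted earlier in the thread, Yahoo may still ignore the 301.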

Archived

This topic is now archived and is closed to further replies.
