211655
Posted January 11, 2005

What should I do? Google doesn't get through. I have slurp and almost all of the well-known Yahoo spiders in spiders.txt. Please help, or post your spiders list if you don't get any SIDs.

211655
Contributions: SEO Optimization, Export Orders into CSV file
BoulderDash
Posted January 12, 2005

You know, I get Yahoo! showing up on my site with osC IDs that it picked up during the 3 months I had the store running without "spiders.txt" in use. I thought it would drop them, but Yahoo! has continued to visit with those osC IDs for the last 9 months.

I know for a fact that when Yahoo! visits now, no session is created for it (verified with a modified Who's Online, where I can see that "yahoo" and "slurp" prevent that from happening). But I can't work out how to tell Yahoo! to stop using the URLs with osC IDs that it archived almost a year ago.

Does anyone have an idea? Is it simply a matter of writing down all the different Yahoo! visits to URLs with osC IDs and adding something to application_top that "breaks" the PHP at the beginning (if the requested URL has one of those osC IDs), so that Yahoo! sees that the page with the unwanted osC ID is no longer available? The page itself is still available, of course, but I do not want it indexed with an osC ID.

Does anyone have an idea how to get rid of Yahoo! listings that contain osC IDs?

Thanks,
BD
stevel
Posted January 13, 2005

This has been a thorn in my side too. Yahoo started using a new spider called "seeker", which was not in the standard spiders.txt, and it picked up a bunch of pages with session IDs. Unfortunately, once it has the session ID, it's hard to get it to drop it. You can call tep_session_destroy, which will kill the session for that visit, but the session will be recreated on a future visit. Yahoo also has no effective support channel. Ideally, you'd like to give Yahoo a 301 "permanent" redirect, but from other evidence, it ignores that too.

Steve
Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description
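For anyone who wants to see what the 301 idea amounts to, here is a minimal sketch in Python (not osCommerce's PHP, and not anyone's actual implementation). It assumes the session parameter is named `osCsid`; the function name `redirect_for` and the example URL are made up for illustration. It only computes the redirect target, and as noted above, there is no guarantee Yahoo will honor the redirect.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the session ID travels in a query parameter called "osCsid".
SESSION_PARAM = "osCsid"


def redirect_for(url, param=SESSION_PARAM):
    """Return (301, clean_url) if url carries a session ID, else None.

    Hypothetical helper: it only decides what to send back; actually
    emitting the redirect is left to the web server or application.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep every query parameter except the session ID.
    kept = [(key, value) for key, value in pairs if key != param]
    if len(kept) == len(pairs):
        return None  # no session ID present, nothing to redirect
    clean = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return 301, clean


if __name__ == "__main__":
    print(redirect_for("http://example.com/product_info.php?products_id=5&osCsid=abc123"))
    print(redirect_for("http://example.com/product_info.php?products_id=5"))
```

Redirecting only when the parameter is actually present avoids a redirect loop on URLs that are already clean.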
BoulderDash
Posted January 13, 2005

Do you have a User Agent for "seeker"? Wouldn't it still have "yahoo" somewhere in its string (given that "yahoo" was already included in the original "spiders.txt" file)? I did read about "seeker" appearing some time in Feb. '04, but I haven't had it on my site (that I have seen, anyway).

Also, it looks like there may be something that can be done about the 301 soon... maybe: http://www.oscommerce.com/forums/index.php?showtopic=118219

Join in!
BD
stevel
Posted January 13, 2005

My Updated spiders.txt contribution detects Yahoo Seeker and many other spiders not detected by the stock spiders.txt. I update it as I discover new spiders.

Yes, I think that the only way to send the 301 to these spiders is to do it with a rewrite rule. I'll watch that thread to see how it goes. For my store, it's only Yahoo that has kept the SIDs.

Steve
Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description
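To make the spiders.txt discussion concrete, here is a rough Python sketch of substring-based spider detection. It assumes each non-empty line of spiders.txt is matched as a case-insensitive substring of the User-Agent header, which is how the check is understood to work in this thread. The helper names, the three list entries, and the User-Agent strings are illustrative only, not copied from any real spiders.txt or server log.

```python
def load_spiders(text):
    """Parse spiders.txt-style content: one lowercase substring per line."""
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


def is_spider(user_agent, spiders):
    """True if any spider substring occurs in the User-Agent (case-insensitive)."""
    agent = user_agent.lower()
    return any(spider in agent for spider in spiders)


if __name__ == "__main__":
    # Illustrative entries only; a real spiders.txt is much longer.
    spiders = load_spiders("slurp\nyahoo\nseeker\n")
    for agent in (
        "Mozilla/5.0 (compatible; Yahoo! Slurp)",              # made-up spider string
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",  # ordinary browser
    ):
        # A store would skip starting a session when this is True.
        print(agent, "->", is_spider(agent, spiders))
```

Under this kind of matching, any User-Agent containing "yahoo" would already be caught by a "yahoo" entry, so whether a given spider slips through depends on the exact string it sends and on what the installed spiders.txt contains.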