osCommerce

Another SID/Search Engine Discussion


wizardsandwars

Ok, now that we all know how to kill SIDs for bots in our lists, I have a new question.

It appears that when a bot first comes to your website, it gathers as many of your URLs as possible, and usually returns later to parse them. If the spider that visits is not in your spider catcher list, the URLs that it gathers will include SIDs. We all know that this creates an infinite spider loop, where the spider thinks it keeps finding new URLs because the SIDs are different.
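
To make the mechanism concrete, here is a rough sketch of the spider-list check, written in Python for brevity rather than taken from the actual osC code. The list entries, the function names, and the `osCsid` parameter name are all assumptions for illustration. A bot whose user agent is not in the list falls through and gets a SID on every link.

```python
# Hypothetical spider list; a real one would be much longer.
KNOWN_SPIDERS = ("googlebot", "msnbot")

# Assumed name of the session parameter in the URL.
SESSION_PARAM = "osCsid"


def is_known_spider(user_agent: str) -> bool:
    """Return True if the user agent contains any token from the spider list."""
    ua = user_agent.lower()
    return any(token in ua for token in KNOWN_SPIDERS)


def build_url(path: str, session_id: str, user_agent: str) -> str:
    """Append the SID to a link unless the visitor is a known spider."""
    if is_known_spider(user_agent):
        return path
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}{SESSION_PARAM}={session_id}"
```

With this sketch, a user agent containing `Googlebot` gets clean links, while an unlisted bot such as a made-up `NewBot/1.0` gets `?osCsid=...` appended to every URL it collects.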

What happens if there is a new bot, such as the new Inktomi bot, that is not in your list, and it gathers thousands of URLs (with SIDs attached) before you recognize it as a bot and add its IP and/or user agent to the array?

Apparently, this new Inktomi bot will return to the URL with the SID again and again and again, even though it doesn't generate any *new* URLs with SIDs.

I guess I'm thinking that perhaps we could check how old a session is and 'expire' it after a certain amount of time. If the session is expired, then perhaps the SID should be stripped from the URL completely.
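
Something along these lines is what I have in mind, sketched in Python only to show the logic. The one-hour limit, the `osCsid` parameter name, the function name, and the dictionary of session start times are assumptions for the sake of the example, not how osC stores sessions.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SESSION_PARAM = "osCsid"     # assumed session parameter name
MAX_SESSION_AGE = 60 * 60    # assumed limit: one hour, in seconds


def strip_expired_sid(url, session_started_at, now=None):
    """Remove the SID from url if its session is expired or unknown.

    session_started_at maps a SID to the Unix timestamp at which the
    session began (a stand-in for whatever the session store provides).
    """
    now = time.time() if now is None else now
    parts = urlsplit(url)
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == SESSION_PARAM:
            started = session_started_at.get(value)
            # An unknown SID is treated the same as an expired one.
            if started is None or now - started > MAX_SESSION_AGE:
                continue
        kept.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

If the returned URL differs from the requested one, the request came in with a stale SID, and the page could answer with the cleaned URL instead of handing out yet another session.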

Anyone have any thoughts on this? Burt?

-------------------------------------------------------------------------------------------------------------------------

NOTE: As of Oct 2006, I'm not as active in this forum as I used to be, but I still work with osC quite a bit.

If you have a question about any of my posts here, your best bet is to contact me through either Email or PM in my profile, and I'll be happy to help.
