osCommerce

Trouble with stripping the session ID for almaden IBM spider


networkdad


I've got this piece of code in my html_output.php file to strip the session ID for any spiders. It appears to work well, with the exception of the Almaden IBM spider.

 

// Add more Spiders as you find them.  MAKE SURE THEY ARE LOWER CASE!
$spiders = array("googlebot", "teomaagent", "zyborg", "gulliver", "architext", "fast-WebCrawler",
"slurp", "ask jeeves", "ia_archiver", "scooter", "mercator", "crawler@fast",
"crawler", "infoseek sidewinder", "lycos_spider", "fluffy the spider", "ultraseek",
"mantraagent", "moget", "t-h-u-n-d-e-r-s-t-o-n-e", "muscatferret", "voilabot",
"sleek spider", "kit_fireball", "webcrawler", "http://www.almaden.ibm.com/cs/crawler");

// get useragent and force to lowercase just once
$useragent = strtolower(getenv("HTTP_USER_AGENT"));

foreach($spiders as $Val) {
   if (!(strpos($Val, $useragent) === false)) {
     // found a spider, kill the sid/sess
     // Edit out one of these as necessary depending upon your version of html_output.php
     $sess = NULL;
     // $sid = NULL;
     break;
   }
}

 

Can anyone help me? I'm guessing I'm somehow misnaming the Almaden IBM spider. Here is the entry straight out of my log file for this spider:

66.147.154.3 - - [08/Dec/2002:13:52:35 -0600] "GET /robots.txt HTTP/1.0" 200 195 "-" "http://www.almaden.ibm.com/cs/crawler   [c01]"

 

How should I list this spider? Or am I going about this all wrong? I say I *think* this works because the Google spider no longer has a session ID when it comes to my site.
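For comparison, here is a minimal sketch of the same check with strpos() called in its documented order, strpos(haystack, needle), so that each spider name is searched for inside the user agent rather than the other way round. Whether that argument order is the actual cause of the Almaden miss is an assumption, not something confirmed on this site. The helper name is_spider() is hypothetical and not part of osCommerce, the spider list is shortened for illustration, and "fast-webcrawler" is lowercased here because a mixed-case entry can never match a lowercased user agent.

```php
<?php
// Hypothetical helper (not an osCommerce function): returns true when the
// user agent string contains any of the listed spider signatures.
function is_spider($useragent, array $spiders)
{
    // Lowercase once so the comparison is case-insensitive.
    $useragent = strtolower($useragent);

    foreach ($spiders as $signature) {
        // strpos() takes the haystack first, then the needle.
        // A match at position 0 is still a match, hence the strict !== false.
        if (strpos($useragent, strtolower($signature)) !== false) {
            return true;
        }
    }
    return false;
}

// Shortened list for illustration; all entries lowercase.
$spiders = array("googlebot", "fast-webcrawler", "almaden.ibm.com/cs/crawler");

// User agent copied from the log line in this post.
$useragent = "http://www.almaden.ibm.com/cs/crawler   [c01]";

$sess = "abc123"; // placeholder session value, for illustration only
if (is_spider($useragent, $spiders)) {
    // Found a spider: clear $sess (or $sid, depending on the
    // version of html_output.php in use).
    $sess = null;
}
```

With the arguments in this order, the trailing "   [c01]" in the logged user agent does not matter, because only the listed signature has to appear somewhere inside the user agent string.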
