krnl, posted June 9, 2005:

I am having problems with spiders continuing to be assigned session IDs, even after setting "Prevent Spider Sessions" to Yes and updating .../includes/spiders.txt. I then learned about the Spider Session Remover contribution, which uses mod_rewrite to remove the session ID when the user-agent matches a known spider string (googlebot, msnbot, slurp, etc.). But when I added the lines from the contribution to my .htaccess, I started getting Internal Server Errors (HTTP 500). I copied the rules exactly from the contribution, yet it won't work. Has anyone else had this problem, or can anyone provide guidance?

Here are the rewrite rules from the contribution:

    RewriteEngine on
    RewriteBase /
    #
    # Skip the next two RewriteRules if NOT a spider
    RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot) [NC]
    RewriteRule .* - [s=2]
    #
    # case: leading and trailing parameters
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
    RewriteRule (.*) $1?%1&%2 [R=301,L]
    #
    # case: leading-only, trailing-only or no additional parameters
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
    RewriteRule (.*) $1?%1 [R=301,L]

Thanks,
Rick
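[Editor's note] To sanity-check what the two query-string conditions are meant to do, here is a small Python sketch (my own illustration, not part of the contribution) that applies equivalent regexes to example query strings and mimics the rules' substitutions:

```python
import re

# Python equivalents of the contribution's two RewriteCond patterns
# (re.I mirrors the [NC] flag). The %1/%2 backreferences in the
# RewriteRule substitutions correspond to the captured groups here.
MIDDLE = re.compile(r'^(.+)&osCsid=[0-9a-z]+&(.+)$', re.I)
EDGE = re.compile(r'^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$', re.I)

def strip_oscsid(query):
    """Return the query string with osCsid removed, mimicking the
    substitutions $1?%1&%2 (middle case) and $1?%1 (edge case)."""
    m = MIDDLE.match(query)
    if m:
        # osCsid sits between other parameters: rejoin the two sides
        return m.group(1) + '&' + m.group(2)
    m = EDGE.match(query)
    if m:
        # %1 expands to the empty string when group 1 did not participate
        return m.group(1) or ''
    return query  # no osCsid present: leave the query untouched

print(strip_oscsid('cPath=22&osCsid=abc123&products_id=42'))  # cPath=22&products_id=42
print(strip_oscsid('cPath=22&osCsid=abc123'))                 # cPath=22
print(strip_oscsid('osCsid=abc123'))                          # prints an empty line
```

One thing this sketch makes visible: when osCsid is the leading parameter (e.g. `osCsid=abc123&cPath=22`), the second alternative matches and captures the remainder in group 2, but the rule substitutes only %1, so the remaining parameters would be dropped. Whether that is intended by the contribution, I can't say.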
hakan haknuz, posted June 16, 2005:

(quoting krnl's original post above)

Maybe you didn't edit the file properly. Did you edit your .htaccess file with Notepad? Did you leave empty spaces? Did you try using the original file as supplied? In any case, I tried the original as well as modifying my own, but I am unable to see any session IDs being stripped off. My site works normally, but spiders still have the IDs added to their URLs.

Kind regards,
Hakan Haknuz
selectronics4u, posted June 16, 2005:

Set some options:

    Options +FollowSymLinks
    Options -Indexes

I had to modify the file to get it to work: try putting the Options +FollowSymLinks line above the other line, as shown above. The original file had the lines set up the other way around. I also had to change the minus sign to a plus sign. It has something to do with the server configuration, so this applies if you're using an Apache server. See if that helps, or do a Google search; there's lots of help out there.

Don
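[Editor's note] Putting Don's fix together with the contribution's rules, the top of the .htaccess would look roughly like this (a sketch, not the contribution's exact file). The ordering matters because mod_rewrite refuses to run in a per-directory (.htaccess) context unless Options FollowSymLinks (or SymLinksIfOwnerMatch) is enabled, which produces exactly the kind of 500 error described above; the host's AllowOverride setting must also permit the Options directive:

    # Enable symlink following before the rewrite directives;
    # mod_rewrite requires it in .htaccess context
    Options +FollowSymLinks
    Options -Indexes

    RewriteEngine on
    RewriteBase /
    # ... spider-detection and osCsid-stripping rules follow here ...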
krnl, posted June 16, 2005 (thread author):

After working with tech support at my hosting company, I found that the 500 Internal Server Error when accessing the site over SSL occurred because they did not have the mod_rewrite module loaded in their secure server configuration. They fixed that, and SSL is now working fine. I haven't yet had a chance to check whether sessions are still being assigned to crawlers, though.
selectronics4u, posted June 17, 2005:

Just so you know, it can take weeks or even months for the session IDs to drop out of search engine links, so keep checking them once in a while. Also, do you have a robots.txt file? If not, you may want to install one. You can get one from the contributions site and then modify it for what you need.

Don
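[Editor's note] For reference, a minimal robots.txt for an osCommerce store might look something like this. The disallowed paths are common examples only, not Don's actual file; adjust them to your own directory layout:

    User-agent: *
    Disallow: /admin/
    Disallow: /includes/
    Disallow: /cgi-bin/

The file must sit in the document root (e.g. /robots.txt) for crawlers to find it.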