hbcloud Posted February 21, 2005
Just trying to figure out why I get so many hits to cookie_usage.php. It happens on two sites, and I don't use cookies on either one; they're not enabled. Any ideas? Most seem to be coming from search engines. Thanks.
♥Vger Posted February 21, 2005
You probably had lots of visits from search engines before you set 'Prevent Spider Sessions' to true. Now they are coming back with those session IDs and are being sent to the cookie_usage.php page (this seems to be a default action of osCommerce). Don't worry: they'll soon get the message and start spidering the site without session IDs, and then all these cookie_usage hits will fade away.
Vger
hbcloud (Author) Posted February 21, 2005
What I found is that the robots are adding things to the shopping cart and then following the path to checkout. That's what's generating the 302. Any idea how to stop the bots from adding to the cart? Session IDs are off. BTW, the site has been up for several months and it's been an issue from day one. It happens on two sites. Thanks
stevel Posted February 21, 2005
If Prevent Spider Sessions is on and you have kept your spiders.txt updated (see my contrib), search engines should not be getting new sessions. They may have indexed pages with session IDs before and keep accessing them; there is a "Spider Session Remover" contrib that helps with that. You should also have a robots.txt that keeps robots out of pages such as shopping_cart, login, etc. Last, consider converting Buy Now buttons from simple links to forms, so that spiders won't try to use them.
Steve
Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description
boxtel Posted February 21, 2005
Still, if you have links that do not use forms to add things to the cart, search engines will reach the cookie usage page. They will not do any harm that way, but still. The best way is to make sure all of your add-to-cart links go through form actions, or else to remove those links when a spider is active. The latter is what I use (it seemed simpler): just don't show the add-to-basket link (button) if a spider is active.
Treasurer MFC
hbcloud (Author) Posted February 21, 2005
I'm not a PHP guru by any stretch of the imagination, so how would one go about changing the Buy Now buttons to forms? It seems strange that I have not one but two sites with the same problem, and I have the spiders.txt file on both. It seems that Googlebot stops indexing when it gets the 302; msnbot, however, just keeps going and going. I have thousands of pages indexed on MSN that are all the cookie usage page. Thanks for all the help and suggestions.
stevel Posted February 21, 2005
It's probably because you are still using the stock spiders.txt, which doesn't have msnbot listed. Use the updated one from my "Updated spiders.txt" contrib, and subscribe to the announcement topic for updates. 302 is just a temporary redirect, and search engines don't mind it.
Steve
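To illustrate why a missing msnbot entry matters, here is a minimal sketch. It assumes the spider check is a case-insensitive substring match of each spiders.txt line against the visitor's user agent; the stock code may differ in detail. The function name `is_listed_spider` and both sample lists are hypothetical, made up for this example.

```php
<?php
// Hypothetical helper (not an osCommerce function): returns true when any
// non-empty line of a spiders.txt-style list occurs inside the user agent.
function is_listed_spider(string $userAgent, array $spiderLines): bool
{
    $agent = strtolower($userAgent);
    foreach ($spiderLines as $line) {
        $needle = strtolower(trim($line));
        if ($needle === '') {
            continue; // skip blank lines
        }
        if (strpos($agent, $needle) !== false) {
            return true; // user agent contains this spider name
        }
    }
    return false;
}

// Made-up sample lists, for illustration only.
$listWithoutMsnbot = array("googlebot\n", "slurp\n");
$listWithMsnbot    = array("googlebot\n", "slurp\n", "msnbot\n");

// Example user agent string of the kind msnbot sends.
$agent = 'msnbot/1.0 (+http://search.msn.com/msnbot.htm)';

var_dump(is_listed_spider($agent, $listWithoutMsnbot)); // not recognised as a spider
var_dump(is_listed_spider($agent, $listWithMsnbot));    // recognised as a spider
```

Under that assumption, a bot missing from the list is treated like any other visitor and is given a session, which is how it ends up following cart links.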
hbcloud (Author) Posted February 22, 2005
Just got an email from the msnbot support people. They have the cookie usage page indexed 2028 times in the MSN index. I did put your spiders.txt file in, and also the spider session rewrite mod. Is it possible that the Prevent Spider Sessions option is not toggling when I change it? And which file would I look in to see whether it really is set to prevent sessions?
stevel Posted February 22, 2005
Easy way to test: install Firefox and the User Agent Switcher extension. Set the user agent to "msnbot", visit your site, and see how it behaves. The suggestion to not display "Buy Now" links when no session is present is a good one.
Steve
hbcloud (Author) Posted February 22, 2005
"The suggestion to not display "Buy Now" links when no session is present is a good one." How would one do that?
stevel Posted February 22, 2005
Well, one way would be the following edit in includes/modules/product_listing.php. Change:

    case 'PRODUCT_LIST_BUY_NOW':
      $lc_align = 'center';
      $lc_text = '<a href="' . tep_href_link(basename($PHP_SELF), tep_get_all_get_params(array('action')) . 'action=buy_now&products_id=' . $listing['products_id']) . '">' . tep_image_button('button_buy_now.gif', IMAGE_BUTTON_BUY_NOW) . '</a> ';
      break;

to

    case 'PRODUCT_LIST_BUY_NOW':
      $lc_align = 'center';
      if ($session_started) {
        $lc_text = '<a href="' . tep_href_link(basename($PHP_SELF), tep_get_all_get_params(array('action')) . 'action=buy_now&products_id=' . $listing['products_id']) . '">' . tep_image_button('button_buy_now.gif', IMAGE_BUTTON_BUY_NOW) . '</a> ';
      } else {$lc_text = ' ' }
      break;

You can use a similar technique to keep spiders away from other links.
Steve
hbcloud (Author) Posted February 22, 2005
I tried your code change and got this error:
Parse error: parse error, unexpected '}' in /home/newyork/public_html/includes/modules/product_listing.php on line 135
stevel Posted February 22, 2005
The semicolon was missing. The next-to-last line should be:

    } else {$lc_text = ' '; }

Steve
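The logic of the corrected branch can be sketched on its own, outside osCommerce. This is only an illustration: `buy_now_cell` is a hypothetical function, and `$buyNowLink` stands in for whatever HTML the original line builds with `tep_href_link` and `tep_image_button`.

```php
<?php
// Hypothetical stand-in for the patched case branch; $buyNowLink is the
// HTML the original code builds for the Buy Now button.
function buy_now_cell(bool $sessionStarted, string $buyNowLink): string
{
    if ($sessionStarted) {
        return $buyNowLink; // normal visitor: show the button
    }
    return ' ';             // no session (e.g. a listed spider): nothing to follow
}

echo buy_now_cell(true, '<a href="#">Buy Now</a>'), "\n";
echo buy_now_cell(false, '<a href="#">Buy Now</a>'), "\n";
```

Every statement ends with a semicolon here, which is exactly what the first version of the else branch lacked.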
hbcloud (Author) Posted February 22, 2005
That seemed to work, THANKS! How would I get rid of the Add to Cart button? Where would I find that code?
stevel Posted February 22, 2005
You don't need to get rid of Add to Cart: it's a form button, and spiders won't follow it.
Steve
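For the earlier suggestion of turning Buy Now links into forms, a minimal sketch of the idea follows. The helper `buy_now_form` is hypothetical, and the example URL only mirrors the shape of the Buy Now link shown in this thread. Whether a given store accepts the buy_now action via a POST form with the parameters left in the query string is an assumption to test before relying on it.

```php
<?php
// Hypothetical helper: wraps an add-to-cart URL in a POST form so the
// action is triggered by a submit button rather than a crawlable link.
function buy_now_form(string $actionUrl, string $buttonLabel): string
{
    return '<form method="post" action="' . htmlspecialchars($actionUrl, ENT_QUOTES) . '">'
         . '<input type="submit" value="' . htmlspecialchars($buttonLabel, ENT_QUOTES) . '">'
         . '</form>';
}

// Example URL in the shape used by the Buy Now link in this thread.
echo buy_now_form('index.php?action=buy_now&products_id=988', 'Buy Now'), "\n";
```

The point of the design is that there is no `href` for a crawler to follow; the URL is escaped with `htmlspecialchars` so the `&` is valid inside the attribute.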
hbcloud (Author) Posted March 2, 2005
Just wanted to say thanks! Looks like Googlebot has been camped on my site for a few days now. I'm still getting many redirects to the cookie usage page because the bots are trying to access /product_reviews_write.php/products_id/988. Any ideas how to stop that?
stevel Posted March 2, 2005
Add /product_reviews_write.php to your robots.txt.
Steve
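A robots.txt along these lines would cover the pages mentioned in this thread. It is a sketch that assumes the store is installed at the site root and that the shopping_cart and login pages carry the usual .php extension; adjust the paths if the catalog lives in a subdirectory.

```
User-agent: *
Disallow: /product_reviews_write.php
Disallow: /shopping_cart.php
Disallow: /login.php
```

Disallow rules match by path prefix, so the first rule also covers URLs such as /product_reviews_write.php/products_id/988.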
springroll Posted March 2, 2005
I want to use the new spiders.txt in Steve's contribution, but when I unzipped the folder there was a spiders.txt and a spiders-large.txt. Should I use spiders-large.txt and rename it to spiders.txt?
stevel Posted March 2, 2005
Please read the readme. I recommend using spiders.txt from the contrib directly.
Steve
This topic is now archived and is closed to further replies.