willross Posted October 14, 2005

Like adding:

&& ($session_started)

to the condition, so it reads:

if (isset($HTTP_GET_VARS['products_id']) && ($session_started)) {

in /includes/boxes/product_notifications.php, etc...
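In context, the guarded box would look something like this (a minimal sketch; the surrounding box-drawing code is version-specific and omitted here):

  // includes/boxes/product_notifications.php (osC 2.2) -- sketch only.
  // Spiders never get a session when Prevent Spider Sessions is on, so
  // adding $session_started to the test hides the box (and its
  // session-dependent links) from them.
  if (isset($HTTP_GET_VARS['products_id']) && ($session_started)) {
    // ... original code that draws the product notifications box ...
  }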
n_e_w_s Posted October 21, 2005

Quoting stevel (Oct 14 2005, 04:02 PM): "...This option will keep some customers from purchasing at your store and will break your store if your domain name for HTTPS is not the same as for HTTP..."

I guess using a subdomain for HTTPS is allowed, like:

https://admin.mydomain.com
http://www.mydomain.com

And this subdomain setting will work with the option "Force Cookie Use = TRUE" without breaking my store?

I found this (if it's still valid): http://www.oscommerce.info/kb/osCommerce/D...plementations/4

"As the cookie is set on the top level domain of the web server, the secured https server must also exist on the same domain. For example, the force cookie usage implementation will work for the following servers: http://www.domain-one.com and https://www.domain-one.com, or https://ssl.domain-one.com, but not for the following servers: http://www.domain-one.com and https://ssl.hosting_provider.com/domain-one/"
stevel (Author) Posted October 21, 2005

I think so. Someone elsewhere suggested setting the COOKIE_DOMAIN values to '.mydomain.com' with a leading dot. Apparently that covers subdomains if present.
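For illustration, the relevant lines in includes/configure.php might then look like this (a sketch only; substitute your own domain, and note the leading dot is what makes the cookie valid across subdomains):

  // includes/configure.php -- sketch; values must match your own domain.
  define('HTTP_COOKIE_DOMAIN', '.mydomain.com');
  define('HTTPS_COOKIE_DOMAIN', '.mydomain.com');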
kunal247 Posted October 28, 2005

Simply use:

if ($spider_flag) {
  // do not show js
} else {
  // show js
}

I am also using CoolMenu. Can someone please give me a sample of their page showing where this code should go? I am completely new to the whole thing and really amazed by it all. Would really appreciate your help. Do you think updating the spiders.txt would reduce the bandwidth of my site? Currently it's going up to 9GB and costing me a fortune. Thanks in advance. Kunal
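To make that pseudocode concrete: osCommerce sets $spider_flag in includes/application_top.php when the visitor's user agent matches an entry in spiders.txt. A minimal sketch of wrapping a CoolMenu include (the file name coolmenu.php is hypothetical; put the test wherever your menu's JavaScript is actually emitted, e.g. includes/header.php):

  <?php
  // $spider_flag is true when application_top.php matched the visitor
  // against spiders.txt (Prevent Spider Sessions must be set to true).
  if (!$spider_flag) {
    // Only human visitors get the JavaScript menu; spiders skip it,
    // which also avoids serving the script to bots.
    require(DIR_WS_INCLUDES . 'coolmenu.php'); // hypothetical file name
  }
  ?>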
stevel (Author) Posted October 28, 2005

spiders.txt does not, on its own, reduce the bandwidth used by spiders. It simply prevents spiders from getting a session registered, so spiders don't do things such as add-to-cart and, more importantly, it keeps session IDs out of the links they record. You can, however, use the knowledge that a spider is visiting to withhold links meant only for humans, such as the product listing column sort links. Turning those off WILL cut down on bandwidth considerably.
kunal247 Posted October 28, 2005

Quoting stevel: "You can, however, use the knowledge that a spider is visiting to withhold links meant only for humans... Turning those off WILL cut down on bandwidth considerably."

Steve, thank you for your prompt response. Can you advise how to go about making the changes you have suggested? Appreciate your help. Regards, Kunal
willross Posted October 28, 2005

Some people have mentioned another "googlebot" and I have found it. It is not actually a bot; it is a direct allocation from Google that is used for evaluating sites using or applying for their services (abuse also). Here is the run-down:

OrgName: Google Inc.
OrgID: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US

NetRange: 66.249.64.0 - 66.249.95.255
CIDR: 66.249.64.0/19
NetName: GOOGLE
NetHandle: NET-66-249-64-0-1
Parent: NET-66-0-0-0-0
NetType: Direct Allocation
NameServer: NS1.GOOGLE.COM
NameServer: NS2.GOOGLE.COM
RegDate: 2004-03-05
Updated: 2004-11-10

OrgTechHandle: ZG39-ARIN
OrgTechName: Google Inc.
OrgTechPhone: +1-650-318-0200
OrgTechEmail: arin-contact@google.com

Hope this clears up some confusion...
Irin Posted October 28, 2005

Hello, I use the latest spiders.txt and in my Configuration/Sessions I have:

Session Directory: /tmp
Force Cookie Use: True
Check SSL Session ID: True
Check User Agent: True
Check IP Address: True
Prevent Spider Sessions: True
Recreate Session: True

However, all spiders that visit my store receive a session ID. Why is this happening? I'll appreciate any ideas. Thanks, Irina.
stevel (Author) Posted October 28, 2005

Willross, the only thing that would be relevant here is if this other Googlebot spiders a site and has a user agent string not detected by spiders.txt. What user agent does it use?

Irina, it's difficult to tell without actually testing your store and looking at the files. I will comment that you should set all the "Check" values to False, or else many customers will be unable to use your store. The way I would diagnose this is to add some code to a page to do a print_r of the user agent string (I'd have to look up the variable name) and perhaps add some diagnostic code to the code that uses spiders.txt. I have not heard of a general problem with this feature, though.
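The variable in question is the HTTP user agent; a throwaway diagnostic along these lines (a sketch; remove it when done) dropped into a page would show what the spider check is working with:

  <?php
  // Temporary diagnostic -- print the raw user agent string the server sees.
  // The spider check lower-cases this before comparing it against the
  // entries in spiders.txt.
  echo 'User agent: ' . htmlspecialchars(getenv('HTTP_USER_AGENT'));
  ?>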
Irin Posted October 28, 2005

Quoting stevel: "I will comment that you should set all the "Check" values to False, or else many customers will be unable to use your store."

Thanks for your reply, stevel. I set all my "Check" values to False as you recommended. What else can I do to solve this problem? Thanks a lot, Irina.
stevel (Author) Posted October 28, 2005

Sorry, I had not meant to suggest that the "Check" values were related to the problem. It was just something I thought you should know. Send me a Private Message with a link to your store and I can try it out. To actually debug it, though, I'd need permission to modify files on your site server. If you want me to do that, send me the FTP server name, login name and password in a private message.
Guest Posted October 29, 2005

Hello: ::!Newb Alert!:: I am trying to use gsitemap by Vigos: http://www.vigos.com/products/gsitemap/ When I use the spider to make the map, it gets a session ID. I have no idea how to find the user agent string. I have all session values set to False (except Prevent Spider Sessions). I don't know if it is just gsitemap getting the session ID or all the bots. How do I check this?
stevel (Author) Posted October 29, 2005

Just check your access log and find the accesses from this spider. Figure out what an appropriate string would be, for example, "gsitemap", and add it to spiders.txt. Remember that the string added to spiders.txt must be lower case - it will match any case in the user agent.
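For reference, the matching in the stock includes/application_top.php works roughly like this (paraphrased from osC 2.2 MS2; entries must be lower case because the user agent is lower-cased once and each spiders.txt line is then tested as a plain substring):

  // Paraphrased sketch of the osCommerce 2.2 MS2 spider check.
  $user_agent = strtolower(getenv('HTTP_USER_AGENT'));
  $spider_flag = false;
  if (tep_not_null($user_agent)) {
    $spiders = file(DIR_WS_INCLUDES . 'spiders.txt');
    for ($i = 0, $n = sizeof($spiders); $i < $n; $i++) {
      if (tep_not_null($spiders[$i])) {
        // An entry matches if it appears anywhere in the agent string.
        if (is_integer(strpos($user_agent, trim($spiders[$i])))) {
          $spider_flag = true;
          break;
        }
      }
    }
  }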
taosan Posted November 1, 2005

Hi there, I got this spider on my website:

User Agent: mozilla/5.0 (compatible; yahoo! slurp; http://help.yahoo.com/hel
IP Address: 68.142.250.112

I want to get rid of it but have failed every time. How can I put this in my robots.txt file? By the way, any idea where this spider comes from? Regards, Taosan
stevel (Author) Posted November 1, 2005

You mean spiders.txt, and "slurp" is already there. It is Yahoo's spider. Why do you want to "get rid of it"? Do you not want your site indexed by Yahoo? Note that spiders.txt does not prevent spiders from indexing your site - it helps them do it better.
taosan Posted November 2, 2005

Steve, I wasn't sure it was the Yahoo spider because it was indexing my site all the time, I mean 24x7, so I was/am suspicious about it. Look at the URL: if I click on it I get a 404 error. I know that "slurp" is in the spiders.txt file, therefore I thought this spider was a new one. Regards, Taosan
stevel (Author) Posted November 2, 2005

The user agent has been truncated in what you posted. It should be "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)". The IP you give is Yahoo's. If your site is new, Yahoo is trying to find all of it. See http://help.yahoo.com/help/us/ysearch/slurp/slurp-03.html for how to slow it down.

You may also want to consider reducing the number of redundant links a spider can find on your site. In particular, the standard product listing has links at the top of each column for sorting, ascending or descending. That is potentially 3^n combinations of URLs (where n is the number of columns) that the spiders could see, all with really the same information. An easy way to deal with this is to edit the function tep_create_sort_heading in includes/functions/general.php. Change the line:

if ($sortby) {

to:

if ($sortby && $session_started) {

This will suppress the sort links for spiders while still leaving them for human visitors.
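In context, the edit looks something like this (a paraphrased sketch of just the top of the function; note that $session_started must also be added to the function's global declaration, otherwise it is always empty inside the function and the links vanish for everyone):

  // includes/functions/general.php -- paraphrased sketch; only the if-line
  // and the global list change relative to stock osC 2.2 MS2.
  function tep_create_sort_heading($sortby, $colnum, $heading) {
    global $PHP_SELF, $session_started; // $session_started added here

    $sort_prefix = '';
    $sort_suffix = '';

    if ($sortby && $session_started) { // was: if ($sortby) {
      // ... stock code wrapping the heading in an asc/desc sort link ...
    }

    return $sort_prefix . $heading . $sort_suffix;
  }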
Guest Posted December 1, 2005

I have spiders.txt in includes, and it's chmoded 740; is that correct?

Other issues I hopefully can get help with: Google has only indexed my home page, and I have over 1200 products. IM me for a link to the URL if you need it. I installed the SEO Sitemap contribution http://www.oscommerce.com/community/contributions,2076 and SEO for osCommerce (2.2 Milestone 2) http://www.jjwdesign.com/seo_oscommerce.html

I don't know much about htaccess files but I followed the instructions for the SEO Sitemap contribution. My store is on the root. My .htaccess file (chmod 644) is as follows; any recommendations?

# $Id: .htaccess,v 1.3 2003/06/12 10:53:20 hpdl Exp $
#
# This is used with Apache WebServers
#
# For this to work, you must include the parameter 'Options' to
# the AllowOverride configuration
#
# Example:
#
# <Directory "/usr/local/apache/htdocs">
#   AllowOverride Options
# </Directory>
#
# 'All' will also work. (This configuration is in the
# apache/conf/httpd.conf file)

# The following makes adjustments to the SSL protocol for Internet
# Explorer browsers

<IfModule mod_setenvif.c>
  <IfDefine SSL>
    SetEnvIf User-Agent ".*MSIE.*" \
      nokeepalive ssl-unclean-shutdown \
      downgrade-1.0 force-response-1.0
  </IfDefine>
</IfModule>

# If Search Engine Friendly URLs do not work, try enabling the
# following Apache configuration parameter
#
# AcceptPathInfo On

# Fix certain PHP values
# (commented out by default to prevent errors occurring on certain
# servers)
#
#<IfModule mod_php4.c>
#  php_value session.use_trans_sid 0
#  php_value register_globals 1
#</IfModule>

RewriteEngine on
Options +FollowSymlinks
DirectoryIndex home.html home.php index.php index.html
AddType application/x-httpd-php php php4 php3 html htm
RewriteRule ^sitemap_categories.html$ sitemap_categories.php [L]
RewriteRule ^sitemap_products.html$ sitemap_products.php [L]
RewriteRule ^category_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)\.html$ index.php?cPath=$1_$2_$3_$4_$5_$6 [L]
RewriteRule ^category_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)\.html$ index.php?cPath=$1_$2_$3_$4_$5 [L]
RewriteRule ^category_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)\.html$ index.php?cPath=$1_$2_$3_$4 [L]
RewriteRule ^category_([1-9][0-9]*)_([1-9][0-9]*)_([1-9][0-9]*)\.html$ index.php?cPath=$1_$2_$3 [L]
RewriteRule ^category_([1-9][0-9]*)_([1-9][0-9]*)\.html$ index.php?cPath=$1_$2 [L]
RewriteRule ^category_([1-9][0-9]*)\.html$ index.php?cPath=$1 [L]
RewriteRule ^product_([1-9][0-9]*)\.html$ product_info.php?&products_id=$1 [L]

Thanks in advance, Brady
Guest Posted December 1, 2005

Quoting stevel: "Change the line: if ($sortby) { to: if ($sortby && $session_started) { ... This will suppress the sort links for spiders while still leaving them for human visitors."

This suppressed the sort header links on the page for me even as a regular visitor. Could it be due to "Force Cookie Use" being enabled? I do not have the osc session ID showing in the URL with "Force Cookie Use" enabled, so that spiders can better index the site. Is there something else to add to this so it can be used when "Force Cookie Use" is enabled? Thanks, John
stevel (Author) Posted December 5, 2005

John, if it suppressed links for you as a normal visitor, you are not getting a session, which is bad. Force Cookie Use does not make spiders "better index the site", but it does drive away some customers. Can you send me a link to your site?

Brady, spiders.txt can have the same protection as other files in includes - 644 or 755 is fine. Using spiders.txt would not prevent Google from indexing your site. It can take some time for Google to index new sites - weeks or even months. Is it visiting your product pages? (Look at the access log.)
Guest Posted December 5, 2005

Quoting stevel: "If it suppressed links for you as a normal visitor, you are not getting a session, which is bad. Force Cookie Use does not make spiders 'better index the site', but it does drive away some customers."

Thanks Steve for the info. For now I have turned off the Force Cookie Use option because it was causing other issues as well, which I haven't found an answer to on the forums here yet. You are probably right about it driving customers away; the biggest issue I was having with cookies forced was that when you went to log in, you got the "Cookie Usage" warning page because, as you noted above, a session hadn't started. The strange thing is, if I clicked on "My Account" I could get the log-in page like normal, but if you try to go straight to log-in, you get the cookie page. Then after the cookie is set, no more problem. Like I said, it seems like too much trouble. I was under the impression that spiders did not like the "oscid" in the URL, and having it there would not be a good thing. Or will they just not get it when they crawl the site if I have spider-friendly URLs enabled? How can I test what they will see vs. a regular customer? If you want to check my site anyway, it is www.greenmountainspecialties.com. Right now I haven't added the previous fix back in, but will in the next day or so to see what happens. Your thoughts on the oscid and cookies are also appreciated. Thanks, John
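One way to check what a spider sees is to request a page with a spoofed spider user agent and look for a session ID in the output. A minimal PHP sketch (run it outside the shop; it assumes allow_url_fopen is enabled and the default osCsid session name, and www.example.com stands in for the store URL):

  <?php
  // Fetch the store's front page while pretending to be Googlebot.
  $ctx = stream_context_create(array('http' => array(
      'header' => "User-Agent: Googlebot/2.1 (+http://www.google.com/bot.html)\r\n"
  )));
  $page = file_get_contents('http://www.example.com/index.php', false, $ctx);

  // If the spider check works, no osCsid parameter should appear in the links.
  echo (strpos($page, 'osCsid') === false)
      ? "No session ID in links - spider detection OK\n"
      : "Session ID leaked to spider!\n";
  ?>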
Guest Posted December 5, 2005 Share Posted December 5, 2005 i didn't notice "BecomeBot" on the contribution i use, but i have gotten recent hits from it: http://www.become.com/site_owners.html Quote Link to comment Share on other sites More sharing options...
stevel (Author) Posted December 5, 2005

John, your problem seems to be an incorrect configure.php, so that the cookie cannot be set. Please make sure that HTTP_COOKIE_DOMAIN is 'www.greenmountainspecialties.com' and nothing more. I'd guess that you have a similar problem in the HTTPS defines - each COOKIE_DOMAIN define must match only the domain (which can be a hostname) of the corresponding _SERVER define.

ewww, Becomebot is detected by the line "ebot".
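As an illustration, the paired defines in includes/configure.php would look something like this (a sketch assuming the HTTPS server is on the same host; if your SSL host differs, the HTTPS pair must use that host instead):

  // includes/configure.php -- illustration only; substitute your own hosts.
  // Each COOKIE_DOMAIN must match the host in its corresponding _SERVER define.
  define('HTTP_SERVER', 'http://www.greenmountainspecialties.com');
  define('HTTP_COOKIE_DOMAIN', 'www.greenmountainspecialties.com');
  define('HTTPS_SERVER', 'https://www.greenmountainspecialties.com');
  define('HTTPS_COOKIE_DOMAIN', 'www.greenmountainspecialties.com');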
Guest Posted December 8, 2005

Quoting stevel: "Your problem seems to be an incorrect configure.php, so that the cookie cannot be set. Please make sure that HTTP_COOKIE_DOMAIN is 'www.greenmountainspecialties.com' and nothing more."

OK, I have messed around with all different combinations in my configure.php file, and they don't seem to make a real difference. I am setting a cookie, but not until the first attempt to access the "My Account" page. In other words, no test cookie is ever sent until I click on "My Account". Once that is done, the browser has the test cookie and a session will start, so that when "My Account" redirects to "Login", the test cookie exists and "Login" can load (if you click on "Login" first, you get the cookie_usage page). No problem with cookies after that, unless I delete the cookie file.

I guess I don't really understand how this is supposed to work. When I have FORCE_COOKIES turned off, as soon as someone clicks on a page other than the home page, the session ID is generated in the URL, and that session ID carries through to all pages. With FORCE COOKIES turned on, shouldn't we serve the test cookie upon load of the index page? Then it would be available for any subsequent page load. Or is this the way it IS supposed to work, and my shop isn't doing it correctly? Any thoughts or clarification on how this should all work would be greatly appreciated. Thanks, John
boxtel Posted December 8, 2005

How it works by default (and how I use it now): http://www.oscommerce.com/forums/index.php?showtopic=182189