nitedeposit1 Posted March 28, 2006
Hi people, could someone please take a look at my robots.txt file and let me know what I did wrong? It would be greatly appreciated. I am somewhat new to osCommerce and haven't quite got the code thing down yet. I searched and read the forum threads on robots.txt files, and maybe I missed something, because Yahoo is crawling all over my admin and I need to stop this from happening. I'm not finished with the site yet and haven't even put in any keywords or meta tags, but they are still crawling twice a day and creating bad listings. So if anyone could take a look at my file and steer me in the right direction, I sure would appreciate it. Thanks in advance.
# robots.txt file
User-agent:*
Disallow/login
Disallow/account
Disallow/privacy
Disallow/tell_a_friend
Disallow/images
Disallow/shipping
Disallow/checkout
Disallow/cookie_usage
Disallow/contact_us
Disallow/email
Disallow/includes
Disallow/backup
Disallow/invoice
Disallow/orders
Disallow/stats
jasonabc Posted March 28, 2006
You need to insert a colon (and a space) after Disallow. So change this: Disallow/login to: Disallow: /login/
Jason
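For anyone who wants to check a file for this mistake before uploading it, here is a minimal Python sketch. It only flags non-comment lines that lack the colon between the field name and the value; it is not a full robots.txt validator, and the function name `find_malformed_lines` and the shortened sample files are made up for illustration.

```python
def find_malformed_lines(robots_txt: str) -> list[str]:
    """Return non-blank, non-comment lines that have no 'field: value' colon."""
    bad = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if line and ":" not in line:
            bad.append(line)
    return bad


# Shortened version of the file from the first post (hypothetical excerpt).
original = """# robots.txt file
User-agent:*
Disallow/login
Disallow/account
"""

# Corrected syntax, using the /admin/ rule suggested later in the thread.
fixed = """# robots.txt file
User-agent: *
Disallow: /admin/
"""

print(find_malformed_lines(original))  # ['Disallow/login', 'Disallow/account']
print(find_malformed_lines(fixed))     # []
```

Lines without the colon are typically skipped by parsers, which would explain why the original file had no effect on the crawler.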
custodian Posted March 28, 2006
The colon fix is right, though the robots.txt in your root directory should also contain Disallow: /admin/
Also, how is ANY search engine getting into your admin directory? It should be password protected.
Henry Smith
jasonabc Posted March 28, 2006
custodian wrote: "Also, how is ANY search engine getting into your admin directory? It should be password protected."
Not only that, but none of the directories the OP lists are osCommerce folders anyway: /login/? /orders/? /account/? etc.
Jason
nitedeposit1 (Author) Posted March 29, 2006
custodian wrote: "This is true, though in the root directory there should be Disallow: /admin/ Also, how is ANY search engine getting into your admin directory? It should be password protected."
I don't know how they are getting in there. The admin is password protected. Is there a mistake I may have made that would let them in even though admin is password protected? Any place you can suggest I look? Thanks for your reply. If you can think of anything, please let me know. I'm stumped. I will add the colon and see if that makes a difference.
nitedeposit1 (Author) Posted March 29, 2006
custodian wrote: "This is true, though in the root directory there should be Disallow: /admin/ Also, how is ANY search engine getting into your admin directory? It should be password protected."
Thanks for the tips, much appreciated. I noticed you put a second / at the end; is this necessary? I read in the robots.txt section of the forum that if you leave the last slash off, the rule disallows anything that starts with that particular word. For example, if you put Disallow: /checkout and leave the / off the end, it would disallow everything beginning with checkout, such as checkout_process, checkout_shipping, etc. Is this correct, or was I going in the wrong direction? Thanks again for your reply. I'm headed off to straighten out my file now.
custodian Posted March 29, 2006
/admin/ will disallow EVERYTHING in the admin directory, so www.yourdomain.com/admin/[any file or directory]. As far as I know you read it right: rules are matched as path prefixes, so /admin without the trailing slash would also cover www.yourdomain.com/admin_resource, www.yourdomain.com/admin.html, www.yourdomain.com/admin.php, etc., with no * wildcard needed. BUT it is NOT needed.
You do not need to block the spiders from checkout_shipping.php; it's just one more page they will index. They are not creating an account, they are not logging in, and if you disable spider sessions in your admin, the ONLY thing you need to worry about for your store is the /admin/ directory. Those are the only files that contain items you do not want the search engines to get. I'd delete all of this:
Disallow/login
Disallow/account
Disallow/privacy
Disallow/tell_a_friend
Disallow/images
Disallow/shipping
Disallow/checkout
Disallow/cookie_usage
Disallow/contact_us
Disallow/email
Disallow/includes
Disallow/backup
Disallow/invoice
Disallow/orders
Disallow/stats
and just have:
Disallow: /admin/
I mean, why would you want to keep them off your contact_us page, which contains information about you? You're doing yourself more harm than good.
Henry Smith
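The trailing-slash question can be tried out locally. Below is a small sketch using Python's standard `urllib.robotparser`, which does simple prefix matching on the path; real crawlers may differ in details, so treat it as an approximation. The helper name `build_parser` and the `www.example.com` host are placeholders, not anything from the store in this thread.

```python
from urllib.robotparser import RobotFileParser


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


# Rule with a trailing slash: only paths inside the directory match.
with_slash = build_parser("User-agent: *\nDisallow: /admin/\n")
# Rule without a trailing slash: every path starting with the prefix matches.
without_slash = build_parser("User-agent: *\nDisallow: /checkout\n")

base = "http://www.example.com"  # placeholder host

print(with_slash.can_fetch("*", base + "/admin/index.php"))            # False (blocked)
print(with_slash.can_fetch("*", base + "/admin.php"))                  # True  (not in /admin/)
print(without_slash.can_fetch("*", base + "/checkout_shipping.php"))   # False (prefix match)
print(without_slash.can_fetch("*", base + "/contact_us.php"))          # True  (unaffected)
```

So `Disallow: /admin/` is the narrower rule, and `Disallow: /checkout` would also catch checkout_shipping.php and similar pages.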
nitedeposit1 (Author) Posted March 29, 2006
Thanks for all your help and for taking the time to explain it a little. It does make sense that you would want all the info you can get out in the engines. I'll give this a shot and see what happens. I'm just starting out with this new store, and I guess I got a little paranoid when I saw the admin all over the listings. Thanks again.
custodian Posted March 29, 2006
I did a thorough search (to the best of my ability) for your store in both Yahoo and Google, and I don't see admin anywhere. Where are you seeing this? Are you searching in Yahoo for your store when you see it? If so, what are the exact search terms you are using to produce these results?
Henry Smith
nitedeposit1 (Author) Posted March 29, 2006
Yes, I was using Yahoo for my search. This has been going on over the last couple of days, and last night and this morning is when I saw the admin listings. I tried again this evening after I changed the robots.txt file and can't find anything regarding admin. Maybe that's because Slurp was here again this evening; I'm not real sure how that part works yet either. I first searched Yahoo for R&T's Discount Depot, which is our store name. Then I searched for http://www.rntsdiscountdepot.com , and that's when I had 16 listings, of which I think 9 or 10 were admin listings. Before replying to you this time I searched again and had 8 listings, but none listed admin paths. So I'm confused. Do the prior listings get dropped after a new crawl? Also, is there something I need to add to the robots.txt file to draw the robots to my product info, or will they find it on their own eventually?
custodian Posted March 29, 2006
Indexed links get added and dropped all the time. It's possible that the admin links you saw were picked up when you first set up your site. Yahoo may have come to your web site before you had the chance to secure the admin, and therefore indexed it. Since it is secure now (which I verified) and it's no longer showing up in Yahoo, I'd say it's safe to say you're OK now. I searched yesterday for your company name and address as well, and I didn't see the admin either.
Henry Smith
Archived
This topic is now archived and is closed to further replies.