osCommerce


Robots.txt


xnewbi3x


Hi, I want to place a robots.txt file in my htdocs root.

Do I need to allow all search engines to access /catalog/includes, or can I block them from indexing the /includes/ and /admin/ directories?

If I use the Ultimate SEO contribution, should I block those directories or not?

I also noticed a spiders.txt and a tld.txt in my catalog/includes/ folder.

What do they do? Thanks.


Do not add your admin directory to robots.txt; hackers usually check that file first.

Rename your admin directory to something secret (myadminfile4545412, for example) and, if you're paranoid, add the "noindex,nofollow" robots meta tags.

By default, osC blocks anyone from directly accessing /includes.

If you do not create a robots.txt, spiders and bots will roam free on your site until they are denied by a script (for example, password protection via PHP) or by .htaccess.

If you don't want your customers' files showing up on Google, your best bet is to disallow all of the account and checkout pages via robots.txt.
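
To check that a set of rules does what you expect before uploading it, something like the following rough Python sketch can help. It uses the standard library's `urllib.robotparser`. The paths in `ROBOTS_TXT`, the bot name `ExampleBot`, the host `shop.example` and the helper `is_allowed` are all assumptions made up for illustration, not a list of what your shop actually needs. Note that the admin directory is deliberately not listed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: these paths are illustrative guesses at a typical
# shop layout, not a list taken from this thread. Adjust to your own site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/account.php
Disallow: /catalog/checkout_shipping.php
Disallow: /catalog/login.php
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/catalog/index.php", "/catalog/account.php"):
        url = "https://shop.example" + path
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Passing an empty string as the robots.txt text shows the "roam free" case: with no rules at all, every URL is reported as allowed. Keep in mind this only tells you what a well-behaved crawler would do; robots.txt does not stop anything that chooses to ignore it.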


Here is a copy of my robots.txt.

I intend to add bots manually and disallow everything for * (the wildcard that covers all other bots).

I chopped off the rest so the list is short and easier for you guys to read. I added Disallow lines to every User-agent, and Disallow: / for User-agent: *. Please let me know if this is the right way to do it.

 

 

User-agent: Mozilla/3.0 (compatible;miner;mailto:[email protected])
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/

User-agent: WebFerret
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/

User-agent: Due to a deficiency in Java it's not currently possible to set the User-agent.
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/

User-agent: no 
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/

User-agent: 'Ahoy! The Homepage Finder' 
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/

User-agent: Arachnophilia 
Disallow: 
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/


User-agent: *
Disallow: /
Disallow: /images/
Disallow: /admin/
Disallow: /shop/
Disallow: /includes/
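
Two details about a file like the one above, as far as I understand the robots exclusion standard: an empty `Disallow:` value means "nothing is blocked", so it adds nothing when specific Disallow lines follow it, and `Disallow: /` under `User-agent: *` already covers every path, so the lines after it are redundant. Several User-agent lines can also share one set of rules, which keeps a long bot list short. Here is a minimal Python sketch that generates a file along those lines. The helper name `build_robots_txt` is made up, the two bot names are simply picked from the list above, and /admin/ is left out on purpose, following the advice earlier in the thread.

```python
def build_robots_txt(allowed_agents, blocked_paths):
    """Build robots.txt text: named bots share one group, all others are shut out."""
    lines = []
    if allowed_agents:
        # One group: several User-agent lines followed by a shared rule set.
        lines += [f"User-agent: {agent}" for agent in allowed_agents]
        # With no blocked paths, a single empty Disallow means "allow everything".
        lines += [f"Disallow: {path}" for path in blocked_paths] or ["Disallow:"]
        lines.append("")  # a blank line separates groups
    # Catch-all group for every bot not named above.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(["WebFerret", "Arachnophilia"],
                           ["/images/", "/includes/"]), end="")
```

Run as a script, this prints one group for the two named bots with the two Disallow lines, a blank line, then the `User-agent: *` group with `Disallow: /`. Whether shutting out every unlisted bot is a good idea is a separate question, since any search engine you forget to list will not index the shop at all.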


What does it mean, do not add the admin directory to robots.txt?

Can I use this syntax instead?

 

disallowed: /admin/ ????

Let's say I'm a bad guy and want to hack your site.

A quick way to find your private directories is to read your robots.txt, which is public: ANYONE can read it.

I would advise against keeping your admin directory named "admin" and against listing it in robots.txt.

I would also advise against putting any private filenames in robots.txt.
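
To make the point concrete, here is a rough Python sketch of how little effort it takes to pull every listed path out of a robots.txt. The function name `listed_paths` and the sample text are made up for illustration; the sample just reuses directory names mentioned in this thread.

```python
def listed_paths(robots_txt):
    """Collect every non-empty Disallow value, in order, without duplicates."""
    seen = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            path = value.strip()
            if path and path not in seen:
                seen.append(path)
    return seen


# Hypothetical sample, not anyone's real file.
SAMPLE = """\
User-agent: *
Disallow: /images/
Disallow: /admin/
Disallow: /includes/
"""

if __name__ == "__main__":
    print(listed_paths(SAMPLE))
```

Anything you list, a visitor can enumerate just as easily, which is why a renamed admin directory should simply never appear in the file.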
