Contributions
robots.txt Sample File
A robots.txt file tells search engines (Google, Yahoo, MSN, etc.) which pages on your site you don't want them to crawl and index. This keeps them away from pages that would confuse them or that you gain nothing from having indexed.
Since nobody has put one in here before, I've put together a simple version that you can copy into your web site's root directory (usually called "public_html", "httpdocs" or "www").
Once you've copied the file into place and made any desired changes, you're done.
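As a rough illustration of what such a file might contain and how a well-behaved crawler reads it, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules and paths (/admin/, /images/) and the crawler name ExampleBot are assumptions chosen for illustration, not the contents of the contributed file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; adjust the paths to your own store.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /images/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if a compliant crawler may fetch `path` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "ExampleBot" is a made-up crawler name.
    for path in ("/index.php", "/admin/login.php", "/images/logo.gif"):
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", path))
```

Keep in mind that robots.txt is only advisory: compliant crawlers honor it, but nothing forces a client to do so.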
This robots.txt file also removes one way for hackers to find your images folder, which a lot of stores seem to get hacked through. It will also help keep your admin area out of search engine listings.
This is not earth-shattering, but hopefully some will find it useful. Please do not bug me with questions; it is not that hard to figure out.
Cheers
Because anyone can read it, a robots.txt file can tip a hacker off to files that they might find interesting and would otherwise not know about. It is much safer to edit the PHP files themselves to include a robots meta tag instead.
Use your HTML editor, in source mode, to search all files in your osCommerce install for "<head>" (the quotes only delimit the search term and should not be included in the actual search). Immediately under this tag, add one of the following two tags.
If the file is located in the admin directory, it should not be indexed at all, so add the following meta tag, which tells robots neither to index the page nor to follow its links:
<meta name="robots" content="none">
For files in your main catalog that you don't want indexed, you will still want the robot to keep following links to the rest of the site, so use:
<meta name="robots" content="noindex, follow">
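If there are many files to edit, the search-and-insert step above could be scripted. The following is only a sketch in Python: the function names add_robots_meta and robots_content_for are hypothetical, and it assumes each file contains a literal <head> tag and that admin files live under a directory named "admin". Back up your files before trying anything like this.

```python
import re
from pathlib import PurePosixPath


def robots_content_for(path: str) -> str:
    """Pick the meta content: "none" for admin files, "noindex, follow" otherwise.

    Assumes admin files sit under a directory literally named "admin".
    """
    return "none" if "admin" in PurePosixPath(path).parts else "noindex, follow"


def add_robots_meta(source: str, content: str) -> str:
    """Insert a robots meta tag on the line right after the first <head> tag."""
    if 'name="robots"' in source:
        return source  # already has a robots meta tag; leave it untouched
    tag = f'<meta name="robots" content="{content}">'
    # Match <head> (with optional attributes) case-insensitively, first hit only.
    return re.sub(
        r"(<head\b[^>]*>)",
        lambda m: m.group(1) + "\n" + tag,
        source,
        count=1,
        flags=re.IGNORECASE,
    )


if __name__ == "__main__":
    page = "<html>\n<head>\n<title>Login</title>\n</head>\n</html>"
    print(add_robots_meta(page, robots_content_for("catalog/admin/login.php")))
```

The function works on the file contents as a string, so reading and writing the actual files is left to you; that keeps the risky part (overwriting store files) under your control.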
The full robots.txt file as below, with an added rule that disallows Google's image crawler (Googlebot-Image) from scanning your site, which saves a lot of bandwidth on sites with large numbers of images.
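To show how a per-crawler rule of this kind behaves, here is a small Python sketch that parses a hypothetical rule set and checks it with urllib.robotparser. The rules are my assumption of what such a file could look like (Googlebot-Image is the user-agent token Google documents for its image crawler); they are not copied from the contributed file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block Google's image crawler everywhere and keep
# an assumed /admin/ restriction for every other crawler.
RULES = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def can_crawl(user_agent: str, path: str) -> bool:
    """Return True if the named crawler may fetch `path` under RULES."""
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The image crawler is shut out; the regular crawler still sees the page.
    print(can_crawl("Googlebot-Image", "/images/product.jpg"))
    print(can_crawl("Googlebot", "/product_info.php"))
```

A crawler uses the group that names it specifically and falls back to the "*" group only when no such group exists, which is why the image crawler is blocked from the whole site here while other crawlers are not.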
Note: Contributions are used at your own risk.