

Is it healthy to have this robots.txt on my site?


shaytaan


Disallow: /admin

Disallow: /account.php

Disallow: /advanced_search.php

Disallow: /checkout_shipping.php

Disallow: /create_account.php

Disallow: /login.php

Disallow: /password_forgotten.php

Disallow: /popup_image.php

Disallow: /shopping_cart.php

 

Disallow: /catalog/admin

Disallow: /catalog/account.php

Disallow: /catalog/advanced_search.php

Disallow: /catalog/checkout_shipping.php

Disallow: /catalog/create_account.php

Disallow: /catalog/login.php

Disallow: /catalog/password_forgotten.php

Disallow: /catalog/popup_image.php

Disallow: /catalog/shopping_cart.php

 

# IF YOU DO NOT WISH TO HAVE THE GOOGLE IMAGE BOT SCAN YOUR DOMAIN FOR IMAGES

 

User-agent: Googlebot-Image

Disallow: /

 

 

# Part 2 (shaytaan)

 

 

User-agent: Mozilla/3.0 (compatible;miner;mailto:[email protected])

Disallow:

 

User-agent: WebFerret

Disallow:

 

User-agent: Due to a deficiency in Java it's not currently possible to set the User-agent.

Disallow:

 

User-agent: no

Disallow:

 

User-agent: 'Ahoy! The Homepage Finder'

Disallow:

 

User-agent: Arachnophilia

Disallow:

 

User-agent: ArchitextSpider

Disallow:

 

User-agent: ASpider/0.09

Disallow:

 

User-agent: AURESYS/1.0

Disallow:

 

User-agent: BackRub/*.*

Disallow:

 

User-agent: Big Brother

Disallow:

 

User-agent: BlackWidow

Disallow:

 

User-agent: BSpider/1.0 libwww-perl/0.40

Disallow:

 

User-agent: CACTVS Chemistry Spider

Disallow:

 

User-agent: Digimarc CGIReader/1.0

Disallow:

 

User-agent: Checkbot/x.xx LWP/5.x

Disallow:

 

User-agent: CMC/0.01

Disallow:

 

User-agent: combine/0.0

Disallow:

 

User-agent: conceptbot/0.3

Disallow:

 

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0

Disallow:

 

User-agent: root/0.1

Disallow:

 

User-agent: CS-HKUST-IndexServer/1.0

Disallow:

 

User-agent: CyberSpyder/2.1

Disallow:

 

User-agent: Deweb/1.01

Disallow:

 

User-agent: DragonBot/1.0 libwww/5.0

Disallow:

 

User-agent: EIT-Link-Verifier-Robot/0.2

Disallow:

 

User-agent: Emacs-w3/v[0-9\.]+

Disallow:

 

User-agent: EmailSiphon

Disallow:

 

User-agent: EMC Spider

Disallow:

 

User-agent: explorersearch

Disallow:

 

User-agent: Explorer

Disallow:

 

User-agent: ExtractorPro

Disallow:

 

User-agent: FelixIDE/1.0

Disallow:

 

User-agent: Hazel's Ferret Web hopper,

Disallow:

 

User-agent: ESIRover v1.0

Disallow:

 

User-agent: fido/0.9 Harvest/1.4.pl2

Disallow:

 

User-agent: Hämähäkki/0.2

Disallow:

 

User-agent: KIT-Fireball/2.0 libwww/5.0a

Disallow:

 

User-agent: Fish-Search-Robot

Disallow:

 

User-agent: Mozilla/2.0 (compatible fouineur v2.0; fouineur.9bit.qc.ca)

Disallow:

 

User-agent: Robot du CRIM 1.0a

Disallow:

 

User-agent: Freecrawl

Disallow:

 

User-agent: FunnelWeb-1.0

Disallow:

 

User-agent: gcreep/1.0

Disallow:

 

User-agent: ???

Disallow:

 

User-agent: GetURL.rexx v1.05

Disallow:

 

User-agent: Golem/1.1

Disallow:

 

User-agent: Gromit/1.0

Disallow:

 

User-agent: Gulliver/1.1

Disallow:

 

User-agent: yes

Disallow:

 

User-agent: AITCSRobot/1.1

Disallow:

 

User-agent: wired-digital-newsbot/1.5

Disallow:

 

User-agent: htdig/3.0b3

Disallow:

 

User-agent: HTMLgobble v2.2

Disallow:

 

User-agent: no

Disallow:

 

User-agent: IBM_Planetwide,

Disallow:

 

User-agent: gestaltIconoclast/1.0 libwww-FM/2.17

Disallow:

 

User-agent: INGRID/0.1

Disallow:

 

User-agent: IncyWincy/1.0b1

Disallow:

 

User-agent: Informant

Disallow:

 

User-agent: InfoSeek Robot 1.0

Disallow:

 

User-agent: Infoseek Sidewinder

Disallow:

 

User-agent: InfoSpiders/0.1

Disallow:

 

User-agent: inspectorwww/1.0 http://www.greenpac.com/inspectorwww.html

Disallow:

 

User-agent: 'IAGENT/1.0'

Disallow:

 

User-agent: IsraeliSearch/1.0

Disallow:

 

User-agent: JCrawler/0.2

Disallow:

 

User-agent: Jeeves v0.05alpha (PERL, LWP, [email protected])

Disallow:

 

User-agent: Jobot/0.1alpha libwww-perl/4.0

Disallow:

 

User-agent: JoeBot,

Disallow:

 

User-agent: JubiiRobot

Disallow:

 

User-agent: jumpstation

Disallow:

 

User-agent: Katipo/1.0

Disallow:

 

User-agent: KDD-Explorer/0.1

Disallow:

 

User-agent: KO_Yappo_Robot/1.0.4(http://yappo.com/info/robot.html)

Disallow:

 

User-agent: LabelGrab/1.1

Disallow:

 

User-agent: LinkWalker

Disallow:

 

User-agent: logo.gif crawler

Disallow:

 

User-agent: Lycos/x.x

Disallow:

 

User-agent: Lycos_Spider_(T-Rex)

Disallow:

 

User-agent: Magpie/1.0

Disallow:

 

User-agent: MediaFox/x.y

Disallow:

 

User-agent: MerzScope

Disallow:

 

User-agent: NEC-MeshExplorer

Disallow:

 

User-agent: MOMspider/1.00 libwww-perl/0.40

Disallow:

 

User-agent: Monster/vX.X.X -$TYPE ($OSTYPE)

Disallow:

 

User-agent: Motor/0.2

Disallow:

 

User-agent: MuscatFerret

Disallow:

 

User-agent: MwdSearch/0.1

Disallow:

 

User-agent: NetCarta CyberPilot Pro

Disallow:

 

User-agent: NetMechanic

Disallow:

 

User-agent: NetScoop/1.0 libwww/5.0a

Disallow:

 

User-agent: NHSEWalker/3.0

Disallow:

 

User-agent: Nomad-V2.x

Disallow:

 

User-agent: NorthStar

Disallow:

 

User-agent: Occam/1.0

Disallow:

 

User-agent: HKU WWW Robot,

Disallow:

 

User-agent: Orbsearch/1.0

Disallow:

 

User-agent: PackRat/1.0

Disallow:

 

User-agent: Patric/0.01a

Disallow:

 

User-agent: Peregrinator-Mathematics/0.7

Disallow:

 

User-agent: Duppies

Disallow:

 

User-agent: Pioneer

Disallow:

 

User-agent: PGP-KA/1.2

Disallow:

 

User-agent: Resume Robot

Disallow:

 

User-agent: Road Runner: ImageScape Robot ([email protected])

Disallow:

 

User-agent: Robbie/0.1

Disallow:

 

User-agent: ComputingSite Robi/1.0 ([email protected])

Disallow:

 

User-agent: Roverbot

Disallow:

 

User-agent: SafetyNet Robot 0.1,

Disallow:

 

User-agent: Scooter/1.0

Disallow:

 

User-agent: not available

Disallow:

 

User-agent: Senrigan/xxxxxx

Disallow:

 

User-agent: SG-Scout

Disallow:

 

User-agent: Shai'Hulud

Disallow:

 

User-agent: SimBot/1.0

Disallow:

 

User-agent: Open Text Site Crawler V1.0

Disallow:

 

User-agent: SiteTech-Rover

Disallow:

 

User-agent: Slurp/2.0

Disallow:

 

User-agent: ESISmartSpider/2.0

Disallow:

 

User-agent: Snooper/b97_01

Disallow:

 

User-agent: Solbot/1.0 LWP/5.07

Disallow:

 

User-agent: Spanner/1.0 (Linux 2.0.27 i586)

Disallow:

 

User-agent: no

Disallow:

 

User-agent: Mozilla/3.0 (Black Widow v1.1.0; Linux 2.0.27; Dec 31 1997 12:25:00

Disallow:

 

User-agent: Tarantula/1.0

Disallow:

 

User-agent: tarspider

Disallow:

 

User-agent: dlw3robot/x.y (in TclX by http://hplyot.obspm.fr/~dl/)

Disallow:

 

User-agent: Templeton/

Disallow:

 

User-agent: TitIn/0.2

Disallow:

 

User-agent: TITAN/0.1

Disallow:

 

User-agent: UCSD-Crawler

Disallow:

 

User-agent: urlck/1.2.3

Disallow:

 

User-agent: Valkyrie/1.0 libwww-perl/0.40

Disallow:

 

User-agent: Victoria/1.0

Disallow:

 

User-agent: vision-search/3.0'

Disallow:

 

User-agent: VWbot_K/4.2

Disallow:

 

User-agent: w3index

Disallow:

 

User-agent: W3M2/x.xxx

Disallow:

 

User-agent: WWWWanderer v3.0

Disallow:

 

User-agent: WebCopy/

Disallow:

 

User-agent: WebCrawler/3.0 Robot libwww/5.0a

Disallow:

 

User-agent: WebFetcher/0.8,

Disallow:

 

User-agent: weblayers/0.0

Disallow:

 

User-agent: WebLinker/0.0 libwww-perl/0.1

Disallow:

 

User-agent: no

Disallow:

 

User-agent: WebMoose/0.0.0000

Disallow:

 

User-agent: Digimarc WebReader/1.2

Disallow:

 

User-agent: [email protected]

Disallow:

 

User-agent: webvac/1.0

Disallow:

 

User-agent: webwalk

Disallow:

 

User-agent: WebWalker/1.10

Disallow:

 

User-agent: WebWatch

Disallow:

 

User-agent: Wget/1.4.0

Disallow:

 

User-agent: w3mir

Disallow:

 

User-agent: no

Disallow:

 

User-agent: WWWC/0.25 (Win95)

Disallow:

 

User-agent: none

Disallow:

 

User-agent: XGET/0.7

Disallow:

 

User-agent: Nederland.zoek

Disallow:

 

User-agent: BizBot04 kirk.overleaf.com

Disallow:

 

User-agent: HappyBot (gserver.kw.net)

Disallow:

 

User-agent: CaliforniaBrownSpider

Disallow:

 

User-agent: EI*Net/0.1 libwww/0.1

Disallow:

 

User-agent: Ibot/1.0 libwww-perl/0.40

Disallow:

 

User-agent: Merritt/1.0

Disallow:

 

User-agent: StatFetcher/1.0

Disallow:

 

User-agent: TeacherSoft/1.0 libwww/2.17

Disallow:

 

User-agent: WWW Collector

Disallow:

 

User-agent: processor/0.0ALPHA libwww-perl/0.20

Disallow:

 

User-agent: wobot/1.0 from 206.214.202.45

Disallow:

 

User-agent: Libertech-Rover www.libertech.com?

Disallow:

 

User-agent: WhoWhere Robot

Disallow:

 

User-agent: ITI Spider

Disallow:

 

User-agent: w3index

Disallow:

 

User-agent: MyCNNSpider

Disallow:

 

User-agent: SummyCrawler

Disallow:

 

User-agent: OGspider

Disallow:

 

User-agent: linklooker

Disallow:

 

User-agent: CyberSpyder ([email protected])

Disallow:

 

User-agent: SlowBot

Disallow:

 

User-agent: heraSpider

Disallow:

 

User-agent: Surfbot

Disallow:

 

User-agent: Bizbot003

Disallow:

 

User-agent: WebWalker

Disallow:

 

User-agent: SandBot

Disallow:

 

User-agent: EnigmaBot

Disallow:

 

User-agent: spyder3.microsys.com

Disallow:

 

User-agent: www.freeloader.com.

Disallow:

 

User-agent: Googlebot

Disallow:

 

User-agent: METAGOPHER

Disallow:

 

User-agent: *

Disallow: /


Since the robots.txt file is viewable by anyone, I would rename your admin directory to something different and move the

Disallow: /admin

line out of the public file. In the (renamed) admin directory itself you can then drop a robots.txt containing:

User-agent: *
Disallow: /
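
For illustration only, here is a rough sketch of how the public robots.txt could then start, assuming the rest of the posted rules stay as they are. A User-agent line is added here because Disallow rules only take effect inside a User-agent group; the renamed admin directory simply never appears in the public file at all.

# public /robots.txt - admin path no longer advertised here
User-agent: *
Disallow: /account.php
Disallow: /advanced_search.php
Disallow: /checkout_shipping.php
Disallow: /create_account.php
Disallow: /login.php
Disallow: /password_forgotten.php
Disallow: /popup_image.php
Disallow: /shopping_cart.php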


Oops, missed that, because my admin isn't even on the same server :)

Is the rest ok?

I think it is totally futile to list every single bot in that file.

If you want to keep bad bots out, I would use a different method, as bad bots by definition do not give a hoot about your robots.txt file.
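
As a sketch of the kind of server-side alternative meant here, and assuming the shop runs on Apache with mod_setenvif enabled (the bot names below are just examples taken from the list above), requests can be refused by matching the User-Agent header in an .htaccess file, which works whether or not the bot ever reads robots.txt:

# .htaccess sketch: refuse requests from known bad user agents
# (requires mod_setenvif; extend or trim the names as needed)
SetEnvIfNoCase User-Agent "EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "BlackWidow" bad_bot

Order Allow,Deny
Allow from all
Deny from env=bad_bot

Even then the list has to be maintained by hand, so it is usually only worth adding the handful of agents that actually show up misbehaving in the server logs.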

Treasurer MFC


Archived

This topic is now archived and is closed to further replies.
