Guest Posted November 26, 2002
I have no idea why the spiders (mainly fast) are not obeying my robots.txt.

User-agent: *
Disallow: /catalog/images/
Disallow: /catalog/news/
Disallow: /catalog/newsletter/
Disallow: /catalog/firstsite/
Disallow: /catalog/includes
Disallow: /admin
Disallow: /catalog/address_book_process.php
Disallow: /catalog/account.php
Disallow: /catalog/account_edit.php
Disallow: /catalog/account_edit_process.php
Disallow: /catalog/account_history.php
Disallow: /catalog/account_history_info.php
Disallow: /catalog/address_book.php
Disallow: /catalog/checkout_process.php
Disallow: /catalog/advanced_search.php
Disallow: /catalog/advanced_search_result.php
Disallow: /catalog/checkout_address.php
Disallow: /catalog/checkout_confirmation.php
Disallow: /catalog/checkout_payment.php
Disallow: /catalog/checkout_success.php
Disallow: /catalog/conditions.php
Disallow: /catalog/contact_us.php
Disallow: /catalog/create_account.php
Disallow: /catalog/create_account_process.php
Disallow: /catalog/create_account_success.php
Disallow: /catalog/download.php
Disallow: /catalog/info_shopping_cart.php
Disallow: /catalog/login.php
Disallow: /catalog/logoff.php
Disallow: /catalog/password_forgotten.php
Disallow: /catalog/popup_image.php
Disallow: /catalog/popup_search_help.php
Disallow: /catalog/privacy.php
Disallow: /catalog/products_new.php
Disallow: /catalog/product_notifications.php
Disallow: /catalog/product_reviews.php
Disallow: /catalog/product_reviews_info.php
Disallow: /catalog/product_reviews_write.php
Disallow: /catalog/produktberatung.php
Disallow: /catalog/redirect.php
Disallow: /catalog/reviews.php
Disallow: /catalog/shipping.php
Disallow: /catalog/shopping_cart.php
Disallow: /catalog/specials.php
Disallow: /catalog/tell_a_friend.php
Disallow: /catalog/disclaimer.php
Disallow: /catalog/admin/
Disallow: /catalog/download/
Disallow: /catalog/images/
Disallow: /catalog/includes/
Disallow: /catalog/pub/

The spiders are still crawling my images and, for example, the page covered by this rule:
Disallow: /catalog/produktberatung.php
Any ideas?
Torsten
Jan0815 Posted November 26, 2002
Quote: "I have no idea why the spiders (mainly fast) are not listening to my robots.txt. Any ideas???"
Why do you think they should? There is no law forcing them to obey your robots.txt. Some spiders do, some don't.
Suggestion: Make a note of the spiders that ignore the robots.txt, find out where they come from, and look there for policies that define what the spider will and will not fetch. Some spiders look at META tags with robot rules, some look at robots.txt, and some look at HTML comments.
HTH
You can't have everything. That's why trains have difficulty crossing oceans, and hippos did not adapt to fly. -- from the OpenBSD mailinglist.
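Before blaming the spider, it may help to confirm that the rules say what you expect. Below is a minimal sketch in Python, assuming the standard library's urllib.robotparser is an acceptable stand-in for how a well-behaved spider reads the file. The rule list is a shortened copy of the one in the first post, and the agent name "ExampleSpider" and the path /catalog/index.php are made up for illustration. It only shows what a compliant spider would conclude; it cannot tell you whether a given spider actually complies.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the first post (not the full file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/images/
Disallow: /catalog/includes
Disallow: /admin
Disallow: /catalog/produktberatung.php
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if a spider that honours robots_txt may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "ExampleSpider" is a hypothetical agent name; "User-agent: *" matches any name.
    for path in (
        "/catalog/produktberatung.php",
        "/catalog/images/logo.gif",
        "/catalog/index.php",  # hypothetical page with no matching rule
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, "ExampleSpider", path) else "disallowed"
        print(path, verdict)
```

If a path you meant to block comes back as allowed, the problem is in the file. If it comes back as disallowed and the spider fetches it anyway, the spider is simply not honouring the rules.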
Guest Posted November 26, 2002
Jan, thanks for the info. It looks like the fast spider loves my site. After 2000 visits on Sunday and 1500 visits yesterday, guess what: the fast spider is my guest today.....
Torsten
Jan0815 Posted November 26, 2002
Who is the "fast spider"? What IP address is it coming from? What search engine is it working for?
Guest Posted November 26, 2002
146.101.142.226 [spider.lon4.fastsearch.net]
http://www.fastsearch.net/
See their partner site: http://www.fastsearch.net/products/partner...te/partners.asp
CC Posted November 26, 2002
OK, this is something Jan may be able to help with... I have been looking for a site that lists bot names and IP addresses. I have found a couple, but they are not very good. Does anyone know of a decent site that offers this information? I just want to know the user-agent name, the IP address, and the name of the bot, if possible.
Cheers guys. I must say I have never heard of fastsearch, though.
CC.
Jan0815 Posted November 26, 2002
IP addresses can change quite fast, but the spider names seem to be a bit more consistent. http://tamingthebeast.net/articles2/search...ine-spiders.htm looks interesting.
HTH
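Besides published lists, your own access log already records the IP address and agent name of every visitor. Here is a rough sketch in Python that tallies requests per IP address and user-agent string. It assumes the server writes Apache's "combined" log format. The sample lines are made up for illustration: only the first IP address comes from this thread, while the agent strings, paths, and second address are hypothetical.

```python
import re
from collections import Counter

# Apache "combined" log format (an assumption about the server configuration):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def tally_visitors(lines):
    """Count requests per (IP address, user-agent) pair; skip unparseable lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            hits[(match.group("ip"), match.group("agent"))] += 1
    return hits


# Made-up sample lines; "ExampleSpider/1.0" is a hypothetical agent string.
SAMPLE_LOG = [
    '146.101.142.226 - - [26/Nov/2002:10:15:01 +0100] "GET /robots.txt HTTP/1.0" 200 2048 "-" "ExampleSpider/1.0"',
    '146.101.142.226 - - [26/Nov/2002:10:15:03 +0100] "GET /catalog/produktberatung.php HTTP/1.0" 200 5120 "-" "ExampleSpider/1.0"',
    '192.0.2.10 - - [26/Nov/2002:10:16:40 +0100] "GET /catalog/index.php HTTP/1.1" 200 7300 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]

if __name__ == "__main__":
    # Busiest visitors first.
    for (ip, agent), count in tally_visitors(SAMPLE_LOG).most_common():
        print(count, ip, agent)
```

Grouping by agent name rather than by IP address alone fits the point above: the addresses change, the names mostly do not.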
CC Posted November 26, 2002
Nice one, mate. Good site, just what I needed.
CC.