Epic Mike Posted March 27, 2006
I have my store set up to email me every time someone gets a 404 error, with the IP address and the URL they were trying to see. Usually it's just phishers looking for PHPBB or whatever, but lately I've been getting some that look like this (only mystore isn't my actual site). They always come in 4's:
http://www.mystore.com/catalog/https://www.../catalog/accoun
http://www.mystore.com/catalog/https://www...ut_shipping.php
http://www.mystore.com/catalog/https://www...?my_account_f=1
http://www.mystore.com/catalog/images/mystore
The IP address is different every time and they have all been from the US. Most of my other 404s come from overseas. I can checkout/login/etc. myself fine and I'm getting other orders, but I was wondering if something is causing these errors for others. I was hoping someone has had this same problem before. If you have any idea, any help would be appreciated. Thanks
Epic Mike Posted March 29, 2006
I was just using "mystore" as an example, but I see that I linked it. Didn't mean to do that, but if anyone has any idea, it would be appreciated. Thanks, Mike
Epic Mike Posted April 1, 2006
Ok, got another one today and I've noticed that all of the user agents are "User Agent: Java/1.4.1_04" or some other version of Java. Can anyone explain this for me? The pages not found this time were:
- http://www.mysite.com/catalog/https://www.mysite.com/catalog/checkout_shipping.php
- http://www.mysite.com/catalog/https://www.mysite.com/catalog/account.php?my_account_f=1
Both of these came up 404 at the same time (1:26:06) and then they both came up 404 again at 1:26:07 from the same IP address. I tried to change my user agent string to something similar but haven't had any luck yet. If anybody could shed some light on this, it would be much appreciated. I feel like I'm losing sales because of this, but I can't recreate the error myself, which is frustrating. Thanks, Mike
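The two 404 URLs above share one shape: a normal catalog path with a second, complete https:// URL embedded in it. A minimal sketch for picking such requests out of a 404 report might look like the following. The function name and the example.com host are made up for illustration, and nothing here is part of osCommerce.

```python
import re

# A path that contains a second scheme after the leading slash, for example
# /catalog/https://www.example.com/catalog/account.php
EMBEDDED_URL = re.compile(r"^/.*https?://", re.IGNORECASE)


def has_embedded_url(request_path: str) -> bool:
    """Return True when the path part of a request contains an absolute URL.

    The query string is dropped first, so a legitimate parameter such as
    ?goto=http://... is not flagged. (Hypothetical helper for illustration.)
    """
    path_only = request_path.split("?", 1)[0]
    return bool(EMBEDDED_URL.search(path_only))
```

Requests that match are very unlikely to come from a person clicking a link in a browser, so they can be counted separately from ordinary 404s.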
Epic Mike Posted April 1, 2006
Ok, 4 posts and all by me :lol: but I just realized that the Java agent is a bot, so I might block it. Just wanted to post in case anyone else has this problem. It's not hurting anything. I was just worried that customers were having problems, but from a quick Google search it seems that a "Java/<version>" user agent is just some homebrew bot.
mtechama Posted April 1, 2006
Hey, you need to be patient, someone will get to ya. But there's nothing I can help ya with.
Wade Morris, Amarillo, Texas
Before you make any changes on your site, you need to do a BACKUP! BACKUP!
stevel Posted April 1, 2006
I've seen this sort of thing before and it was usually caused by an error in configure.php. If I knew your store's actual URL, I could look to see if I could spot the problem. One thing to do is view the HTML source of a page and make sure that the <base> tag has the correct base URL and is not blank.
Steve
Contributions: Country-State Selector, Login Page a la Amazon, Protection of Configuration, Updated spiders.txt, Embed Links with SID in Description
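The <base> tag check suggested above can also be scripted rather than done by eye. This is only a sketch using Python's standard html.parser. The helper names and the example.com URL are assumptions for illustration, and the page source is passed in as a string rather than fetched.

```python
from html.parser import HTMLParser
from typing import Optional


class BaseHrefParser(HTMLParser):
    """Remembers the href of the first <base> tag seen in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "base" and self.base_href is None:
            self.base_href = dict(attrs).get("href")


def find_base_href(html: str) -> Optional[str]:
    """Return the <base> href, or None when the page has no <base> tag."""
    parser = BaseHrefParser()
    parser.feed(html)
    return parser.base_href


def base_href_ok(html: str, expected: str) -> bool:
    """True when <base> is present, not blank, and matches the expected URL."""
    href = find_base_href(html)
    return bool(href) and href.rstrip("/") == expected.rstrip("/")
```

A blank or missing href makes base_href_ok return False, which is the case described above as worth looking for.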
Epic Mike Posted April 1, 2006
mtechama, I didn't mean to come off as impatient. The thread was a couple of days old, and I got a little more info that I thought would help.
Stevel, the base tag is --> https://www.mysite.com/catalog/ I'll PM you my URL, and if you get a chance to take a look, that would be great.
Thanks to both of you, Mike
stevel Posted April 1, 2006
I don't see anything wrong with the pages I looked at. I have noticed that some bots are buggy and try to look for incorrect URLs - you indicated in your PM that these were bots, so I would just ignore it. Be sure that you have Prevent Spider Sessions turned on in admin/sessions and that you keep your spiders.txt up to date.
Steve
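One way a buggy bot could produce exactly the URLs reported in this thread is by gluing every href onto the page's base URL without checking whether the href is already absolute. The thread never establishes that this is the actual cause, so treat the sketch below as a guess. The function names and the example.com URLs are made up for illustration.

```python
from urllib.parse import urljoin


def naive_resolve(base: str, href: str) -> str:
    """What a careless crawler might do: plain string concatenation."""
    return base + href


def correct_resolve(base: str, href: str) -> str:
    """Standard resolution: an absolute href replaces the base entirely."""
    return urljoin(base, href)


base = "http://www.example.com/catalog/"
secure_link = "https://www.example.com/catalog/account.php"

# The naive version yields the same shape as the 404 URLs in the thread.
broken = naive_resolve(base, secure_link)
fixed = correct_resolve(base, secure_link)
```

For relative links such as "account.php" both functions agree, which would fit the observation that only the absolute https:// links go wrong.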
Epic Mike Posted April 2, 2006
Thanks, Steve. I do have Prevent Spider Sessions turned on, and I appreciate your spiders.txt contribution. I just updated it last night, as it had been a couple of months. Thanks for everything, Mike
Guest Posted April 2, 2006
Do you happen to be using nimmit SEF URLs? I had a similar problem a few months ago.
Epic Mike Posted April 2, 2006
No, I still have the original dynamic URLs. After some further reading, tracking, and checking of my logs, here's what's happening. This might be useful to others, and maybe it needs its own thread.
I've been hit by these at least once each day now, and every time it's a different version of the Java user agent and a different IP, but I was able to track 2 of the IPs to the same place. Each time, the bot doesn't get robots.txt, CSS, or image files, but it hits every single page on my site. Each page is requested within 1 second of the previous page. It seems the bot has trouble with any secure page, as it puts my catalog URL and a / in front of the https:// address.
Who knows what it's looking for exactly, but I'm assuming vulnerabilities or emails? Either way, I'd like to block it, but I need to read some more to make sure there aren't any reasons why I shouldn't block a Java user agent. According to my logs, this is the only time a Java agent is used on my site, but I don't want to block something or someone that I shouldn't.
Thanks for the suggestion, eww, and hopefully this will help somebody else. Mike
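The pattern described above (no robots.txt, no CSS or image requests, one page per second or faster) can be turned into a rough log check. The sketch below assumes the access log has already been parsed into simple records. The Hit class, the thresholds, and the function names are all hypothetical choices rather than anything from the thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

# File types a real browser normally fetches alongside a page.
STATIC_SUFFIXES = (".css", ".gif", ".jpg", ".jpeg", ".png", ".js")


@dataclass
class Hit:
    ip: str
    timestamp: float  # seconds since the epoch
    path: str


def looks_like_scraper(hits: List[Hit], max_gap: float = 1.0,
                       min_hits: int = 5) -> bool:
    """True when one client matches all three traits from the post above."""
    if len(hits) < min_hits:
        return False  # too little traffic to judge
    paths = [h.path.split("?", 1)[0].lower() for h in hits]
    # Any robots.txt or static-file request counts in the client's favour.
    if any(p == "/robots.txt" or p.endswith(STATIC_SUFFIXES) for p in paths):
        return False
    times = sorted(h.timestamp for h in hits)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return all(gap <= max_gap for gap in gaps)


def suspicious_ips(hits: Iterable[Hit]) -> List[str]:
    """Group hits by IP and return the addresses that look automated."""
    by_ip: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        by_ip[hit.ip].append(hit)
    return sorted(ip for ip, group in by_ip.items()
                  if looks_like_scraper(group))
```

Since the IP changes on every visit, a list like this is more useful for confirming the pattern than for blocking by address.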
stevel Posted April 2, 2006
I have not seen a Java user agent from a "human" user, so I think that would be safe to block.
Steve
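For anyone who wants to act on that advice, the test itself is small. This is a sketch of the matching logic only, assuming the agent string always starts with "Java/" followed by a version number, as in the "Java/1.4.1_04" example earlier in the thread. The function name is made up, and how the check is wired into the web server or the store is left open.

```python
import re

# Matches agents such as "Java/1.4.1_04": the literal product token "Java"
# followed by a slash and a version that starts with a digit.
JAVA_AGENT = re.compile(r"^Java/\d", re.IGNORECASE)


def is_java_agent(user_agent: str) -> bool:
    """Return True for the bare Java HTTP client user agent."""
    return bool(JAVA_AGENT.match(user_agent.strip()))
```

Anchoring the pattern at the start of the string keeps browser agents that merely mention Java or JavaScript somewhere in the string from being caught.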