Google overrides base href when caching my page


Biancoblu


can you help please?

 

Google overrides the base href when caching my pages, which results in a broken page. I have a multilanguage site, and the issue happens on languages other than English, i.e. languages that use directories created by a URL rewrite addon.

 

My French page is http://www.site.com/index.php/fr and its base href is http://www.site.com, which is correct. The page loads fine in all browsers, but when Google caches it, it overrides the base href and treats the page's own URL as the base path. It then looks for CSS, images, etc. in the /fr directory, obviously can't find them there, and so shows a broken page in the cache.

 

Why does Google do this and how can I fix it?
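
The two resolution behaviours described above can be reproduced with Python's standard urllib.parse.urljoin. This is only a sketch: the file name stylesheet.css is an assumed example of a relative link on the page, and the base href is written with a trailing slash for clarity.

```python
from urllib.parse import urljoin

# URLs from the post; "stylesheet.css" is an assumed relative link.
BASE_HREF = "http://www.site.com/"
PAGE_URL = "http://www.site.com/index.php/fr"
RELATIVE_LINK = "stylesheet.css"


def resolve(base, link):
    """Resolve a link against a base URL using standard URL resolution."""
    return urljoin(base, link)


# With the declared base href honoured, the file is found at the site root.
with_base = resolve(BASE_HREF, RELATIVE_LINK)

# If the base href is ignored, the page's own URL becomes the base,
# so "index.php" ends up being treated as a directory.
without_base = resolve(PAGE_URL, RELATIVE_LINK)

print(with_base)     # http://www.site.com/stylesheet.css
print(without_base)  # http://www.site.com/index.php/stylesheet.css
```

The second result points at a path that does not exist on the server, which would match the unstyled, image-less page seen in the cache.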

~ Don't mistake my kindness for weakness ~



 

Hi

 

Are you using <link rel="canonical"> tags on your pages? And are you saying that Google ignores these and is still listing the URLs you don't want listed?


Hi

 

Yes, I use <link rel="canonical">.

 

What I'm saying is that Google, in its cache and only in its cache, ignores my base href. It treats the page's own URL as the base path (not the base href), so the cached version of my page is broken.


I had a look at the site listed in your profile, using this search in Google:

site:www.yoursite.com

Everything looks OK to me; there are no broken links on index.php/fr.

 

There is a meta noarchive tag (<meta name="robots" content="noarchive">) that you could use to stop Google showing a cached copy, but I don't think it is recommended for SEO.


It's obviously because of the unorthodox method you've chosen to handle the languages.

 

The "syntactically correct" method goes like this:

 

http://www.site.com/index.php?language=fr

 

When you do it like this:

 

http://www.site.com/index.php/fr

 

It makes g00gle think the index.php part of the link is a folder name instead of a page name (because of the slashes).

 

When you do something differently than almost everyone else, sometimes there's a price to pay.

 

I think you just found it.

:blush:

 

You could try the g00gle "pi$$ and moan" department but I doubt you'll get any positive results.

 

On the bright side the Yahoo cache doesn't have it screwed up, and even the g00gle cache link worked OK for me using Firefox instead of IE.

 

I know this isn't what you wanted to hear, but it's probably the truth.

If I suggest you edit any file(s) make a backup first - I'm not perfect and neither are you.

 

"Given enough impetus a parallelogramatically shaped projectile can egress a circular orifice."

- Me -

 

"Headers already sent" - The definitive help

 

"Cannot redeclare ..." - How to find/fix it

 

SSL Implementation Help

 

Like this post? "Like" it again over there >


I also checked Google's cache in four languages and a couple of different pages. All pages render correctly, with images, and the base href is correct. I object to URL rewriters on principle, but yours seems to be working correctly.

 

Regards

Jim

See my profile for a list of my addons and ways to get support.


Thanks guys for replying, I appreciate it.

 

You're saying that most cached pages work OK for you, though Jim mentions it working better with Firefox. I use Firefox 4 and Internet Explorer 8, and all Google cached pages are broken for me, but most of the Yahoo ones render correctly. So could it be a browser issue?

 

Does it count against a site if its Google cached pages are broken? I mean, do you get penalized for that?

 

 

It's obviously because of the unorthodox method you've chosen to handle the languages.

 

Well, it's the URL rewrite addon that does this. It's advertised as being modern, efficient, etc. Given that I am not a coder, it's not easy to know what's good and what isn't.

 

You could try the g00gle "pi$$ and moan" department but I doubt you get any positive results.

 

I posted on the Google webmaster forums and was told the site is all constructed wrong...

They said to add a slash in front of link paths so that they are relative to the ROOT of the domain, like <link rel="stylesheet" href="/stylesheet.css">.

 

On the other hand, the URL rewrite coder tells me to "remove the language from base href", but to my knowledge the base href is already correct.

 

What are your thoughts on this?
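
The root-relative suggestion from the Google forum can be checked with Python's urllib.parse.urljoin. This is a sketch under assumptions: stylesheet.css is a hypothetical link, the /de page is a made-up second language, and the cache is assumed to resolve links against the page's original URL as described earlier in the thread.

```python
from urllib.parse import urljoin

# Possible bases: the declared base href and two language page URLs.
# The "de" page is a hypothetical second language for illustration.
BASES = [
    "http://www.site.com/",
    "http://www.site.com/index.php/fr",
    "http://www.site.com/index.php/de",
]


def resolved_targets(link):
    """Return the distinct URLs a link resolves to across all bases."""
    return {urljoin(base, link) for base in BASES}


# A leading slash anchors the path at the domain root, whatever the base.
root_relative = resolved_targets("/stylesheet.css")

# Without the slash, the result depends on which base is in effect.
plain_relative = resolved_targets("stylesheet.css")

print(sorted(root_relative))
print(sorted(plain_relative))
```

Only the root-relative form lands on a single URL for every base, so it does not depend on the base href being honoured.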


 

What are your thoughts on this?

 

You are not going to get a penalty from the search engines, if that is your concern. Personally, I cannot see any broken links listed in the index, or on your site as I navigated it.

 

Be careful of people giving you advice about poor structure etc.; sometimes you can do more harm than good by following such advice. The way to improve your SEO is to improve the content on your pages and then look for links back to your site.


Sure, you are correct. I was just concerned because at my end nearly all Google cached pages are broken (no images, no CSS, just plain text), which wasn't happening on my old site.

I attach a screenshot of what I see in the Google cache.

Please can you click this link and tell me if you see my page correctly? cache link

post-102418-0-24526800-1304257193_thumb.jpg


 

You are correct, the images don't display (I don't know why that is), but the links are not broken; they go correctly to the pages on your site.


I still don't see any broken images in Firefox, but all images are broken in Internet Explorer. IE also complains about "Errors in Page." The W3C validator finds 24 errors on the page.

 

Looking at the source of the page, there is a malformed section of tags at the top of the page. This is Google's header bar that appears above the cached page content. All of the errors that I looked at are in this header. So this is Google's problem -- they are generating a malformed HTML page. IE objects to the bad HTML, while other browsers ignore it.

 

You could try telling Google to clean up their act, but I doubt you'll get anywhere with that.

 

Regards

Jim


 

 

Somehow I doubt that too :rolleyes:

 

Personally I use FF 4 and IE 8, and all Google cached pages are broken for me in both FF and IE.

 

Searching the net for this problem, it appears OpenCart users have the same issue when they turn SEO on (so I read on their forums).

 

Thanks for looking though, I appreciate it.


I did verify that the malformed base href tag g00gle adds is the problem.

 

I copied their HTML source, removed that tag, and opened the page locally, and it's fine.
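
A check like this can also be scripted. The sketch below uses Python's standard html.parser to list every <base href> in a saved copy of a page, in document order. Per the HTML specification a browser uses the first base element with an href, so a tag injected ahead of the site's own takes precedence. The sample markup is a made-up stand-in, not Google's actual cache output.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collects the href of every <base> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; attrs is a list of (name, value).
        if tag == "base":
            href = dict(attrs).get("href")
            if href is not None:
                self.hrefs.append(href)


def find_base_hrefs(html):
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.hrefs


# Hypothetical cached page: an injected base tag precedes the site's own.
sample = """<html><head>
<base href="http://www.site.com/index.php/fr">
<base href="http://www.site.com/">
<link rel="stylesheet" href="stylesheet.css">
</head><body></body></html>"""

hrefs = find_base_hrefs(sample)
effective = hrefs[0] if hrefs else None  # first base href wins

print(hrefs)
print(effective)
```

If more than one href is listed, the first is the one to compare against the base the site intended.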

 

I went to the main page of your site and it has the language buttons at the upper/right and they work just like every other site.

 

My question is where is g00gle getting these links:

 

http://www.site.com/index.php/xx

(where xx = the language)

 

:unsure:


 


 

Those links appear when you are on a foreign-language page. Say you are on a French product page and then click through to the French home page: the site (or the URL rewrite) will then give you

http://www.site.com/index.php/xx


I forgot to mention that those links are also in the canonical tag:

 

<link rel="canonical" href="http://www.site.com/index.php/xx">


Archived

This topic is now archived and is closed to further replies.
