
osCommerce


Google not deeplinking our site



Posted

Hi,

We have recently started looking into ways of optimising our site www.refreshcartridges.co.uk and it has become apparent that Google isn't deeplinking our site.

 

This is despite:

- Creating a sitemaps.xml file

- Providing several PHP files which are immediately accessible from the front page and which deeplink right to the lowest level (see, for example, the Canon Inkjet Cartridges link in the footer)

- Implementing breadcrumb navigation

- Providing a best sellers list which links to our most popular products from within two clicks of the homepage

- Ensuring natural navigation to the deepest products is at most four clicks from the homepage through standard hyperlinks.
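As an aside, a Sitemap-protocol file like the one mentioned above only needs a few lines of PHP to generate. This is a hedged sketch with a placeholder URL, not the shop's actual generator:

```php
<?php
// Hedged sketch: build a minimal Sitemap-protocol XML document from a
// list of page URLs. The URL below is a placeholder, not real catalogue data.
function build_sitemap(array $urls) {
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    foreach ($urls as $url) {
        // <loc> values must be entity-escaped per the Sitemaps protocol
        $xml .= '  <url><loc>' . htmlspecialchars($url) . '</loc></url>' . "\n";
    }
    $xml .= '</urlset>' . "\n";
    return $xml;
}

echo build_sitemap(array(
    'http://www.refreshcartridges.co.uk/index.php',
));
```

In a real shop this would be fed from a query over the products table and written to a file that robots.txt or Webmaster Tools points at.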

 

I have a theory as to why we may be suffering from this problem, but I'd like someone to confirm it for me.

 

Using Google Webmaster Tools I can see that Google is trying to crawl our site with the osCsid string appended to the URL; this is despite us implementing cookies so that all users with cookie-enabled browsers never see it. Is there any way to ensure that Google doesn't crawl our site in this manner? I don't want to disable the osCsid string entirely, as this would punish those who refuse cookies.

 

I could, of course, be barking up the entirely wrong tree, but I'd appreciate any advice.

Posted
Using Google Webmaster Tools I see that Google is trying to navigate our site with the osCsid string after the URL;

Did you set admin -> Configuration -> Sessions -> Prevent Spider Sessions to true, and are you using the updated spiders.txt that Steve Lionle (stevel) is maintaining?
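For context, the check behind that setting works roughly like this: the visitor's User-Agent header is compared against each entry in spiders.txt, and a match suppresses the osCsid. This is a hedged sketch of the idea, not the stock application_top.php code verbatim:

```php
<?php
// Hedged sketch of osCommerce's "Prevent Spider Sessions" check: compare
// the browser's User-Agent string against each entry in spiders.txt.
// A match means "known bot", so no osCsid is ever appended to its links.
function is_spider($user_agent, array $spider_entries) {
    $user_agent = strtolower($user_agent);
    foreach ($spider_entries as $entry) {
        $entry = strtolower(trim($entry));
        if ($entry !== '' && strpos($user_agent, $entry) !== false) {
            return true;  // known bot: suppress the session ID
        }
    }
    return false;
}

// Sample spiders.txt entries (placeholders, not the full maintained list)
$spiders = array('googlebot', 'slurp', 'msnbot');

var_dump(is_spider('Mozilla/5.0 (compatible; Googlebot/2.1)', $spiders)); // bool(true)
var_dump(is_spider('Mozilla/5.0 (Windows NT 10.0)', $spiders));           // bool(false)
```

This is why an out-of-date spiders.txt matters: a bot whose User-Agent isn't listed falls through to normal session handling and gets an osCsid in every link.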

Posted

What a stupid mistake to make... Thanks for that, Jan; we had a cookie issue several months ago and Prevent Spider Sessions was changed to false during troubleshooting, then never changed back. I have now set it to true and updated the spiders.txt file. Thanks for your help; very much appreciated.

Posted

I updated the spiders.txt and changed Prevent Spider Sessions to true the same day it was suggested to me. Checking Who's Online on a daily basis confirms that whenever a search engine is snooping around it doesn't appear to be getting an osCsid.

 

I've just checked Google Webmaster Tools again, however, and we have a heap of new errors relating to session IDs that were 'last calculated' on the 4th of May, three days after making the alterations.

 

Does anyone know whether this 'last calculated' date is the actual date the site was crawled, or whether it could have been crawled several days prior? I don't know if I should still be worrying about session IDs, or whether the problem has been solved and Google is simply publishing old errors.

 

Can anyone shed any light? Thanks for your help!

Posted
I updated the spiders.txt and changed prevent spider sessions to true the same day it was suggested to me. [...] I don't know if I should still be worrying about session IDs or whether the problem has been solved but Google is publishing old errors.

 

If you want to be sure, you could try the following in includes/application_top.php:

 

Find ..

 

	if ($spider_flag == false) {

 

Add directly above ..

 

	if ($spider_flag == true && strpos($_SERVER['REQUEST_URI'], 'osCsid')) {
	    $redirect_file = basename($_SERVER['SCRIPT_NAME']);

	    // Rebuild the query string without the osCsid parameter
	    $querystring = str_replace('?', '&', strstr($_SERVER['REQUEST_URI'], '?'));
	    $querystring = explode('&', $querystring);
	    $count = count($querystring);
	    $separator = '';
	    $new_qs = '';
	    for ($i = 0; $i < $count; $i++) {
	        if (!empty($querystring[$i]) && strpos($querystring[$i], 'osCsid') === false) {
	            $new_qs .= $separator . $querystring[$i];
	            $separator = '&';
	        }
	    }

	    // 301 redirect the bot to the NONSSL page without the session ID
	    $redirect = tep_href_link($redirect_file, $new_qs, 'NONSSL', false);
	    header('HTTP/1.0 301 Moved Permanently'); // 301 header
	    header('Location: ' . $redirect);         // 301 redirect
	    exit;
	}

 

You could perhaps test this with the W3C validator, which is seen as a bot.
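For what it's worth, the query-string rebuild in that snippet can be pulled out into a standalone function and checked in isolation. A hedged sketch (strip_oscsid is my name for it, not an osCommerce function):

```php
<?php
// Hedged sketch: the osCsid-stripping loop from the snippet above,
// extracted into a pure function so its behaviour can be verified
// without a live server or a real bot request.
function strip_oscsid($request_uri) {
    $qs = strstr($request_uri, '?');   // everything from '?' onwards
    if ($qs === false) return '';      // no query string at all
    $parts = explode('&', str_replace('?', '&', $qs));
    $kept = array();
    foreach ($parts as $part) {
        // drop the empty leading element and any osCsid parameter
        if ($part !== '' && strpos($part, 'osCsid') === false) {
            $kept[] = $part;
        }
    }
    return implode('&', $kept);
}

echo strip_oscsid('/product_info.php?products_id=42&osCsid=abc123') . "\n"; // products_id=42
```

Feeding the result to tep_href_link, as the snippet does, then produces the clean URL the bot is 301-redirected to.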

Posted

Thanks for that, Robert; I'm always amazed that someone takes time out to help. It is very much appreciated.

 

I will add that code to our site tomorrow. One curious thing is that I completely removed sessions almost a week ago, yet the most recent report from Google (9th of June) still shows 404 errors, for example:

 

http://www.refreshcartridges.co.uk/panason...hcljc7sof2.html

 

This confuses me for two reasons. Firstly, sessions shouldn't be possible at all: if you turn cookies off in the browser you just get diverted to a cookie usage page. Secondly, surely this address wouldn't be correct anyway; even if Google were creating sessions, the address would be

 

http://www.refreshcartridges.co.uk/panason...fd4ebhcljc7sof2

 

Something strange is definitely afoot.

Archived

This topic is now archived and is closed to further replies.
