Is OSC ignored by search engines


Jack_mcs


The session keeps track of what the user has in their cart. The Session ID uniquely identifies the session.

 

The spiders.txt file is a list of user agents of known spiders. It is used by the 'prevent spider sessions' feature so that spiders are not assigned a session, and it is required if you want to use that feature.
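
As a rough sketch of how such a list can be applied (written in Python rather than osCommerce's own PHP, with hypothetical function names): the visitor's user agent is lowercased and checked for any spiders.txt entry as a substring. The lowercasing step and the skipping of the `$Id` header line are assumptions of this sketch, not a copy of the shipped code.

```python
def load_spiders(text):
    """Return non-empty entries from spiders.txt content, skipping the $Id header line."""
    entries = []
    for line in text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("$Id"):
            entries.append(entry)
    return entries


def is_spider(user_agent, spiders):
    """True if any list entry appears as a substring of the lowercased user agent."""
    agent = user_agent.lower()
    return any(entry in agent for entry in spiders)


# Sample content for illustration only; a real list is much longer.
sample = (
    "$Id: spiders.txt,v 1.2 2003/05/05 17:58:17 dgw_ Exp $\n"
    "googlebot\n"
    "yahooseeker\n"
    "YahooSeeker/1.1\n"
)
spiders = load_spiders(sample)

# A matching agent would be served without a session; a browser would get one.
print(is_spider("Googlebot/2.1 (+http://www.google.com/bot.html)", spiders))
print(is_spider("Mozilla/5.0 (Windows NT 10.0) Firefox/115.0", spiders))
```

Under these assumptions a mixed-case entry such as YahooSeeker/1.1 can never match, because the user agent has already been lowercased, while a lowercase yahooseeker entry would.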

 

The robots.txt is a file that spiders look for when they visit your site. You can use it to tell them which directories/files you do not want them to visit. This file is not required.
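
To illustrate how a spider that honours robots.txt would read such rules, here is a small sketch using Python's standard `urllib.robotparser`. The rules, paths and the example.com host are made up for the example; they are not taken from any particular store.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all spiders out of the admin area and one checkout page.
rules = """\
User-agent: *
Disallow: /catalog/admin/
Disallow: /catalog/checkout_shipping.php
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(path, agent="Googlebot"):
    """Ask the parsed rules whether the agent may fetch the given path."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


print(allowed("/catalog/index.php?cPath=22"))   # category page, not listed
print(allowed("/catalog/admin/login.php"))      # under a disallowed directory
```

Note that robots.txt is only a request: it tells cooperative spiders where not to go, and has no effect on session IDs.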

-------------------------------------------------------------------------------------------------------------------------

NOTE: As of Oct 2006, I'm not as active in this forum as I used to be, but I still work with osC quite a bit.

If you have a question about any of my posts here, your best bet is to contact me through either Email or PM in my profile, and I'll be happy to help.


This is what I have as spiders.txt. I don't remember creating it, so I assume it is part of a contribution or OSC? LOL. I found it in includes/spiders.txt

 

$Id: spiders.txt,v 1.2 2003/05/05 17:58:17 dgw_ Exp $

almaden.ibm.com

appie 1.1

architext

ask jeeves

asterias2.0

augurfind

baiduspider

bannana_bot

bdcindexer

crawler

crawler@fast

docomo

fast-webcrawler

fluffy the spider

frooglebot

geobot

googlebot

gulliver

henrythemiragorobot

ia_archiver

infoseek

kit_fireball

lachesis

lycos_spider

mantraagent

mercator

moget/1.0

muscatferret

nationaldirectory-webspider

naverrobot

ncsa beta

netresearchserver

ng/1.0

osis-project

polybot

pompos

scooter

seventwentyfour

sidewinder

sleek spider

slurp/si

slurp@inktomi.com

steeler/1.3

szukacz

t-h-u-n-d-e-r-s-t-o-n-e

teoma

turnitinbot

ultraseek

vagabondo

voilabot

w3c_validator

zao/0

zyborg/1.0


I have a quick question on the SEO issue. I picked up the recommendation to run my site through http://www.webconfs.com/search-engine-spider-simulator.php, and many of my links end with

 

/catalog/index.php?cPath=22&osCsid=

 

I have a feeling that the ending with sid= will prevent the search engine from following the link. How do I get rid of this? It is only on the category links. The products are clean.
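
For checking a batch of links for the session parameter, a minimal sketch along these lines may help. It uses Python's standard `urllib.parse`; the function names are hypothetical, and it only inspects URLs. It does not change what the store itself generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_session_id(url, param="osCsid"):
    """True if the query string carries the session parameter, even with an empty value."""
    query = urlsplit(url).query
    return any(name == param for name, _ in parse_qsl(query, keep_blank_values=True))


def strip_session_id(url, param="osCsid"):
    """Return the URL with the session parameter removed and the other parameters kept."""
    parts = urlsplit(url)
    kept = [(n, v) for n, v in parse_qsl(parts.query, keep_blank_values=True) if n != param]
    return urlunsplit(parts._replace(query=urlencode(kept)))


link = "/catalog/index.php?cPath=22&osCsid="
print(has_session_id(link))
print(strip_session_id(link))
```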

Laser labels, barcode labels, custom labels


Clarification: I do have "Prevent Spider Sessions" on in my admin, so I would think that the sids would be off. Is that a right assumption? So why the sid= extension?


Jack,

 

Check and see how Google is seeing your page by using this link:

http://www.webconfs.com/search-engine-spider-simulator.php. Check to see if the spiders/bots are getting any errors on your pages.

 

I had a similar problem, caused by having the 'prevent spider sessions' option turned on. Since there was no session, a variable was getting a blank for the language rather than the language name, which caused an error on my pages for Google and other spiders. I went through and made corrections to my code, and now I'm getting traffic from the bots. I had this problem because I'm using a development version of the software, or one of the mid-releases from late last fall.

 

Put a link to your website in your profile, that way when you ask questions people can look at your site.

Hey, can you tell us, well me, exactly what you did? What code changes did you make? I am having the exact same problem. I get the language/session error when a spider comes to a particular site I set up with a snapshot from late last fall. When I turn on spider sessions the error disappears, but now the osCsid is showing in the URL. What did you do? Thanks!!


TO ALL,

 

About 4 months ago I was tearing my hair out, convinced my site(s) would never be spidered and indexed. I will say that YES, OSC can be both spidered and indexed by Google. I had many of the same exchanges with Burt and Wizard & Wars. If you are using a snapshot that contains the new session code you may have problems, but they can be resolved. MS2.2 out of the box can and will be spidered and indexed. Unique title tags are most important for indexing. Try your own custom code to generate title, description and keyword tags from the text in your pages.

 

This is not to say that some snapshots of OSC will not cause problems. They do. Track them down with the available tools and fix them.

 

There really needs to be a definitive resource for SEO issues with OSC. It can be difficult to wade through the conflicting information on these forums. Especially for beginners.


This is not to say that some snapshots of OSC will not cause problems. They do. Track them down with the available tools and fix them.

 

What available tools are there to fix these pages? Can you point me to some recommendations?


I have the prevent spider session set to true and it is working fine for googlebot but not for YahooSeeker. I have added the Yahoo Seeker bot to spiders.txt and I still see many lines like this in my log file

 

[66.196.93.4 /product_info.php?cPath=3_23&products_id=73&osCsid=c41356a5f4d1673670a96036cc3a0de5 "YahooSeeker/1.1 ]

 

whereas googlebot looks like this

[64.68.82.201 /articles.php?tPath=1 "Googlebot/2.1]

 

my spiders.txt has these lines (partial list)

[slurp/si

slurp@inktomi.com

steeler/1.3

.

.

.

ultraseek

vagabondo

voilabot

voila

w3c_validator

yahooseeker

YahooSeeker

YahooSeeker/1.1

zao/0

zyborg/1.0

]

 

Any idea what is wrong here?

 

thanks a lot

 

Ari
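
To see at a glance which spiders are still requesting URLs with a session ID, the log can be tallied per user agent. The sketch below assumes the simplified line layout quoted in the post (IP, request path, then a quoted user agent); a real access log will need a different pattern, and the names here are hypothetical.

```python
import re
from collections import defaultdict

# Matches the simplified lines quoted in the post: IP, request path, quoted user agent.
LINE = re.compile(r'(?P<ip>\S+)\s+(?P<path>\S+)\s+"(?P<agent>[^"\]]+)')


def session_hits(lines):
    """Count, per user agent, requests with and without an osCsid parameter."""
    counts = defaultdict(lambda: {"with_sid": 0, "without_sid": 0})
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        key = "with_sid" if "osCsid=" in match.group("path") else "without_sid"
        counts[match.group("agent").strip()][key] += 1
    return dict(counts)


log = [
    '66.196.93.4 /product_info.php?cPath=3_23&products_id=73'
    '&osCsid=c41356a5f4d1673670a96036cc3a0de5 "YahooSeeker/1.1',
    '64.68.82.201 /articles.php?tPath=1 "Googlebot/2.1',
]
for agent, hits in session_hits(log).items():
    print(agent, hits)
```

A count like this shows only what the spider requested. It cannot tell whether the URL was handed out on this visit or collected on an earlier one.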


It appears that nothing is wrong.

 

How do you know that those sessions weren't gathered earlier?


I have a question, I hope someone here can help me with.

 

I have my "store" in my root directory.

I have a subdomain for a message board forum.

I need to know the best way to link the forum from my Info box.

 

Currently, I created a php page and am using a redirect on that page. However, from everything I've read, redirects are a no-no with Google.

 

So, this is what I need to know: How can I add "Forum" to my Info box and have it go to the forum on my subdomain?

 

Thanks

Tammy


By starting a new thread, or by posting in a thread with relevant discussion.

 

What in the world does that have to do with Search Engines?


It appears that nothing is wrong.

 

How do you know that those sessions weren't gathered earlier?

I thought that the spider scans the pages in real time and does not come to the site armed with URLs from previous scans. Anyway, this has been going on for more than a month. Yahoo Seeker comes to visit about once a week. I will say that the result set in Yahoo search does not have the session ID. I am just concerned that the spider is not doing as good a job as googlebot is doing, and I do have much better results with Google. Thanks -- Ari


Most likely, it is returning to URLs it previously gathered. It may continue to do so for several months.


I think Wizard is right. The spider comes armed with URLs. By the way, my spiders.txt has it both in lower and title case and still the SID is in there. The results in Yahoo don't have the SID. Unless you have other ideas, I will just let time do its thing.

 

Thanks

 

Ari


Did you try the link to the submitexpress tool? It'll tell you exactly whether the sids are killed for the user agent specified.

I had the same problem with msn bot.


I just did and the SID is not showing when I type YahooSeeker in the user agent box.

I did this test in the past with the Mozilla Firefox browser (http://www.mozilla.org/products/firefox/) and a user agent switcher, which does the same thing. But then the real log file has the SID for YahooSeeker and not for Googlebot. That's what is so strange.

Ari
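
The same kind of check can be scripted: request a page while presenting a spider's user agent, then look for osCsid in the returned links. This is a minimal sketch using Python's standard `urllib.request`; the function names are hypothetical, and the URL passed in should be a page from your own store.

```python
import re
import urllib.request

SID_PATTERN = re.compile(r"osCsid=[0-9a-zA-Z]*")


def find_session_ids(page_source):
    """Return every osCsid=... fragment found in the page source."""
    return SID_PATTERN.findall(page_source)


def fetch_as(url, user_agent, timeout=10):
    """Fetch a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def compare_agents(url, agents):
    """Map each user agent to the session-ID fragments found in the page it was served."""
    return {agent: find_session_ids(fetch_as(url, agent)) for agent in agents}
```

For example, `compare_agents(url, ["YahooSeeker/1.1", "Googlebot/2.1"])` would return an empty list for each agent that was served clean links. Like the browser test, this only shows what the store sends to a fresh visitor with that user agent, so it can disagree with log entries for URLs a spider collected earlier.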


  • 1 month later...

I have the same problem when I try the submitexpress tester, but the other way round... Google (and some others) get a SID but Yahoo (and a few others) don't get one. Doesn't seem to be any logic why - all of them are listed in the spiders.txt file. Perhaps the submitexpress tester isn't working the way it should, or at least the way osC expects?

 

'Prevent spider sessions' is on.


