Customer info with German characters corrupted in database


sturbek


Posted

I have an OSC v2.2 RC2 store running on MySQL.

 

It runs well, except that German characters get saved as oddities such as ü

 

I believe this is because it needs to save the data as UTF-8. Is there a guide on how to update the site and database to UTF-8?

 

Even after I set one of the table's columns to utf8 with

ALTER TABLE customers MODIFY customers_lastname VARCHAR(32) CHARACTER SET utf8;

øüîé is still saved as øüîé

 

Any tips? This is driving my customers mad.

 

When I run SHOW CREATE TABLE customers, I get:

 

 

CREATE TABLE `customers` (
`customers_id` int(11) NOT NULL auto_increment,
`customers_gender` char(1) NOT NULL,
`customers_firstname` varchar(32) NOT NULL,
`customers_lastname` varchar(32) character set utf8 default NULL,

 

 

Thanks! Steve

Posted

You should never be specifying certain database columns to be a different character encoding! The entire database should be the same encoding.

 

In your 4-character example, øüîé is being saved as UTF-8 (C3B8 C3BC C3AE C3A9). The page is written out with those bytes but is being displayed as Latin-1 (ISO-8859-1), where each of the 8 bytes is shown as its own character, producing øüîé. The first task is to figure out why your database is (at least partially) UTF-8 while your page is being displayed in Latin-1. Once those two are consistent, check whether any part of your database is still in Latin-1. I hesitate to suggest converting the entire database to UTF-8 (or Latin-1) in one step, for fear that text that is already in UTF-8 will be mangled.
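
To make the byte-level explanation concrete, here is a small Python sketch (Python is used only for illustration; it is not part of osCommerce). It encodes the four example characters as UTF-8, prints the bytes in hex, and then decodes the same bytes as Latin-1 the way a mis-declared page would. The variable names are made up for this example.

```python
# The four characters from the example in this thread.
text = "øüîé"

# The bytes that get stored when the text is encoded as UTF-8:
# each of these accented characters becomes two bytes.
utf8_bytes = text.encode("utf-8")

# Group the bytes in pairs and show them in hex.
hex_pairs = " ".join(
    utf8_bytes[i:i + 2].hex().upper() for i in range(0, len(utf8_bytes), 2)
)
print(hex_pairs)

# Reading the very same bytes as Latin-1 turns every byte into its own
# character, which is the garbage the customer sees.
misread = utf8_bytes.decode("latin-1")
print(misread)
print(len(text), "characters became", len(misread))
```

The stored bytes are not damaged here; only the interpretation differs, which is why the data and the page declaration have to agree.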

 

Be sure to back up your database first (and inspect the backup to make sure it's complete). If you have a mix of Latin-1 and UTF-8, you can try converting table by table and field by field to one consistent encoding. If all your data will work in Latin-1, it will take up a bit less space than UTF-8 (which takes 2 bytes per accented character, versus 1 in Latin-1), but if you expect to have any non-Latin-1 data, I'd go with UTF-8. Worst case, you'll have to make sure your backup has no encoding information in it, and is itself consistently UTF-8 (or Latin-1). Then you can empty out your tables, change the entire database to UTF-8 (or Latin-1), and import the backed-up data. Don't forget to change the page from Latin-1 to UTF-8, if that's what your database is. How about your language files -- are they Latin-1 or UTF-8? You need everything to be consistent.
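
As a rough illustration of the "field by field" idea, here is a Python sketch of a heuristic that brings a raw value to UTF-8 whether it started out as Latin-1 or as UTF-8. The helper name `to_utf8` is hypothetical, and it assumes you have the raw bytes of a field in hand (for example from an export); it is not an osCommerce or MySQL function. It is a guess-based approach: plain ASCII is identical in both encodings, and a Latin-1 value can occasionally happen to be valid UTF-8, so spot-check the results against your backup.

```python
def to_utf8(raw: bytes) -> bytes:
    """Return raw as UTF-8 bytes, guessing whether it was UTF-8 or Latin-1."""
    try:
        # If the bytes already decode cleanly as UTF-8, leave them alone.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Otherwise assume Latin-1 (every byte is valid there) and re-encode.
        return raw.decode("latin-1").encode("utf-8")


# The same surname stored in the two encodings discussed in this thread.
latin1_value = "Müller".encode("latin-1")  # 6 bytes, ü is one byte
utf8_value = "Müller".encode("utf-8")      # 7 bytes, ü is two bytes

print(len(latin1_value), len(utf8_value))
print(to_utf8(latin1_value) == to_utf8(utf8_value))
```

The length comparison also shows the space difference mentioned above: one extra byte per accented character in UTF-8.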

Posted

Phil, thanks for the reply. I have a pretty standard OSC setup. Why would my DB be different from any other?

 

Also, I am seeing these characters in phpMyAdmin as well, so I think the problem is in the database, not the OSC display page.

 

In that example I listed, I changed one column to utf8 via the alter statement, but both it AND the unchanged column display the same mistaken characters.

 

When you mention changing the OSC page to UTF8:

For example, the create_account.php page is saved as UTF-8, and the charset set in the HTML via <?php echo CHARSET; ?> is UTF-8.

 

Where else would it be set?

 

Thanks for your help!

Steve

Posted

> I have a pretty standard OSC setup. Why would my DB be different from any other?

Well, you said that you've been manually altering the tables so that certain fields (columns) are UTF-8. Is the overall database encoding Latin-1 (the default) or UTF-8? German text could be done in either encoding, but you've got to be consistent in your character encoding, and it sounds like you aren't for some reason.

 

> Also, I am seeing these characters in phpMyAdmin as well, so I think the problem is in the database, not the OSC display page.

Is phpMyAdmin displaying in Latin-1 or UTF-8? In your browser, use View > Page source and check whether the header declares an encoding. If it doesn't, it's probably defaulting to Latin-1 (ISO-8859-1). If the overall database is still Latin-1, I think the page will come out in Latin-1, but I won't swear to it.
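
If you would rather not read the page source by eye, a check along these lines can be scripted. This is only a sketch in Python: the function name `declared_charset` and the sample HTML strings are made up for illustration, and it looks only at `<meta>` tags in the markup. A charset can also be sent in the HTTP Content-Type header, which this does not inspect.

```python
import re
from typing import Optional

# Matches charset=... inside a <meta> tag, covering both the older
# http-equiv form and the short <meta charset="..."> form.
_CHARSET_RE = re.compile(
    r"""<meta[^>]*charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def declared_charset(html: str) -> Optional[str]:
    """Return the charset named in a <meta> tag, lowercased, or None."""
    match = _CHARSET_RE.search(html)
    return match.group(1).lower() if match else None


# Made-up page fragments for illustration.
utf8_page = '<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>'
latin1_page = '<head><meta charset="iso-8859-1"></head>'
silent_page = "<head><title>No encoding declared</title></head>"

print(declared_charset(utf8_page))
print(declared_charset(latin1_page))
print(declared_charset(silent_page))
```

A result of None means the markup itself declares nothing, so the browser falls back on whatever the server header or its own default says.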

 

> In that example I listed, I changed one column to utf8 via the alter statement, but both it AND the unchanged column display the same mistaken characters.

I don't like the idea of changing random columns to be a different encoding. You need your database text, your language files, and your page display all singing the same song. Still, if your database is UTF-8 anyway (have you checked in phpMyAdmin?), and your page is displayed in Latin-1 for some reason, you would see accented characters (multibyte in UTF-8) displayed incorrectly.
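
One way to test that theory on a garbled value is to reverse the misreading: turn the displayed characters back into Latin-1 bytes and try to read those bytes as UTF-8. The Python sketch below does that; the function name `undo_latin1_misread` is hypothetical, and this is a diagnostic aid under the assumption that the only problem is UTF-8 bytes being shown as Latin-1.

```python
from typing import Optional


def undo_latin1_misread(shown: str) -> Optional[str]:
    """Reinterpret text that may be UTF-8 bytes displayed as Latin-1.

    Returns the recovered text, or None if the characters cannot be
    explained that way. Plain ASCII comes back unchanged.
    """
    try:
        return shown.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


# The garbled form from this thread, written with escapes so the example
# does not depend on how this post itself is displayed.
garbled = "\u00c3\u00b8\u00c3\u00bc\u00c3\u00ae\u00c3\u00a9"

print(undo_latin1_misread(garbled))
print(undo_latin1_misread("øüîé"))
```

If the garbled value comes back as the intended characters, the bytes were UTF-8 all along and the display encoding is the thing to fix. A None for text that already looks right means it was not being misread in this way.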

 

> When you mention changing the OSC page to UTF8:
>
> For example, create_account.php page is saved as utf-8 and the charset set in the HTML via <?php echo CHARSET; ?> is UTF-8

For any page, use View > Page source in your browser and confirm that the encoding is specified as UTF-8. There's no need to "save" a non-language (just code) PHP file in UTF-8. Language files (particularly if they contain accented characters) need to be edited and saved with the proper encoding.

Archived

This topic is now archived and is closed to further replies.
