sturbek
Posted November 11, 2009

I have an OSC v2.2rc2 store on MySQL. It runs well, except that German characters get saved as oddities such as Ã¼. I believe this is because it needs to save the data as UTF-8. Is there a guide on how to update the site and database to UTF-8?

Even after I set one of the table's columns to utf8 via

    ALTER TABLE customers MODIFY customers_lastname VARCHAR(32) CHARACTER SET utf8;

øüîé is saved as Ã¸Ã¼Ã®Ã©.

Any tips? This is driving my customers mad.

When I run SHOW CREATE TABLE customers, I get

    CREATE TABLE `customers` (
      `customers_id` int(11) NOT NULL auto_increment,
      `customers_gender` char(1) NOT NULL,
      `customers_firstname` varchar(32) NOT NULL,
      `customers_lastname` varchar(32) character set utf8 default NULL,

Thanks!
Steve
MrPhil
Posted November 12, 2009

You should never set individual database columns to a different character encoding! The entire database should use the same encoding.

In your four-character example, øüîé is being saved as UTF-8 (C3B8 C3BC C3AE C3A9). The page is written out with those byte codes, but is being displayed as Latin-1 (ISO-8859-1), where those 8 bytes produce Ã¸Ã¼Ã®Ã©.

The first task is to figure out why your database is (at least partially) UTF-8 while your page is being displayed in Latin-1. Once you have the two of them consistent, do you have any part of your database in Latin-1? I hesitate to suggest that you convert the entire database to UTF-8 (or Latin-1), for fear that text that is already in UTF-8 will be mangled. Be sure to back up your database (and inspect the backup, to make sure it's complete).

If you have a mix of Latin-1 and UTF-8, you can try converting table by table and field by field to the same consistent encoding. If all your data will work in Latin-1, it will take up a bit less space than UTF-8 (which takes 2 bytes per accented character), but if you expect to have any non-Latin-1 data, I'd go with UTF-8.

Worst case, you'll have to make sure your backup has no encoding information in it, and is itself consistently UTF-8 (or Latin-1). Then you can empty out your tables, change the entire database to UTF-8 (or Latin-1), and import the backed-up data. Don't forget to change the page from Latin-1 to UTF-8, if that's what your database is.

How about your language files: are they Latin-1 or UTF-8? You need everything to be consistent.
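The byte arithmetic described above can be checked with a minimal Python sketch. It is not part of osCommerce or MySQL; `repair_mojibake` is a hypothetical helper name, and the sketch assumes the damage is exactly one round of "UTF-8 bytes read as Latin-1", which is the case the post describes.

```python
# Reproduce the mojibake from the thread: UTF-8 bytes read back as Latin-1.
original = "øüîé"

# Each accented character becomes two bytes in UTF-8.
utf8_bytes = original.encode("utf-8")
hex_view = " ".join(f"{b:02X}" for b in utf8_bytes)

# Reading those same eight bytes as Latin-1 yields eight separate characters.
mojibake = utf8_bytes.decode("latin-1")


def repair_mojibake(text):
    """Undo one round of 'UTF-8 read as Latin-1' (hypothetical helper).

    Text that is not damaged in that specific way is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


print(hex_view)
print(mojibake)
print(repair_mojibake(mojibake))
```

Running the same helper over a value that is already correct leaves it alone, which is why a check like this can be useful before converting a column whose contents may be a mix of encodings. It is only a diagnostic sketch; work on a copy of the backup, not on live data.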
sturbek
Posted November 12, 2009

Phil, thanks for the reply.

I have a pretty standard OSC setup. Why would my DB be different from any other?

Also, I am seeing these characters in phpMyAdmin as well, so I think it is in the database, not the OSC display page.

In the example I listed, I changed one column to utf8 via the ALTER statement, but both it AND the unchanged column display the same mistaken characters.

When you mention changing the OSC page to UTF-8: for example, the create_account.php page is saved as UTF-8, and the charset set in the HTML via <?php echo CHARSET; ?> is UTF-8. Where else would it be set?

Thanks for your help!
Steve
MrPhil
Posted November 12, 2009

"I have a pretty standard OSC setup. Why would my DB be different from any other?"

Well, you said that you've been manually altering the tables so that certain fields (columns) are UTF-8. Is the overall database encoding Latin-1 (the default) or UTF-8? German text could be done in either encoding, but you've got to be consistent in your character encoding, and it sounds like you aren't for some reason.

"Also, I am seeing these characters in phpMyAdmin as well, so I think it is in the database, not the OSC display page."

Is phpMyAdmin displaying in Latin-1 or UTF-8? View > Page source and see in the header if it declares an encoding. If it doesn't, it's defaulting to Latin-1 (ISO-8859-1). If the overall database is still Latin-1, I think the page will come out in Latin-1, but I won't swear to it.

"In the example I listed, I changed one column to utf8 via the ALTER statement, but both it AND the unchanged column display the same mistaken characters."

I don't like the idea of changing random columns to be a different encoding. You need your database text, your language files, and your page display all singing the same song. Still, if your database is UTF-8 anyway (have you checked in phpMyAdmin?), and your page is displayed in Latin-1 for some reason, you would see accented characters (multibyte in UTF-8) displayed incorrectly.

"When you mention changing the OSC page to UTF-8: for example, the create_account.php page is saved as UTF-8, and the charset set in the HTML via <?php echo CHARSET; ?> is UTF-8."

For any page, in your browser View > Page source and confirm that the encoding is specified as UTF-8. There's no need to "save" a non-language (just code) PHP file in UTF-8. Language files (particularly if they contain accented characters) need to be edited and saved with the proper encoding.
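The "View > Page source" check can also be scripted. Below is a minimal Python sketch under stated assumptions: `declared_charset` is a hypothetical helper, the sample `page` string is made up for illustration, and the regular expression only looks at `<meta>` tags in HTML that has already been fetched. It does not inspect the HTTP `Content-Type` header, which takes precedence over a `<meta>` declaration when the server sends a charset there.

```python
import re

# Matches both <meta http-equiv="Content-Type" content="text/html; charset=...">
# and the shorter <meta charset="..."> form.
_META_CHARSET = re.compile(r"<meta[^>]*charset=[\"']?([\w-]+)", re.IGNORECASE)


def declared_charset(html):
    """Return the lowercased charset from the first <meta> tag declaring one, or None."""
    match = _META_CHARSET.search(html)
    return match.group(1).lower() if match else None


# Illustrative page source, not taken from a real osCommerce install.
page = (
    "<html><head>"
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    "</head><body></body></html>"
)
print(declared_charset(page))
```

A result of `None` means the page source declares nothing, in which case the browser falls back to its own default, as described above.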
This topic is now archived and is closed to further replies.