Non-English characters?


rudolfl


Hi, all

 

I am after some general advice here.

I am using the eWay payment gateway. Recently, I had a customer from Sweden who had a non-English character in her address. Putting this address in the XML request to eWay caused it to spit the dummy.

eWay advised me that I have to use UTF-8 encoding, and I am not sure how to do this.

Any pointers will be greatly appreciated.

 

Thanks,

Rudolf


You need to edit includes/languages/english.php (or the file for whichever language you use) and change the CHARSET definition to utf-8. You will also need to convert your database so that its character set and collation work with UTF-8. This is not a minor change if you are not used to working with the database, so you may want to ask your host to do it. Also, be sure to create a backup first, since the conversion can cause problems and you may need to switch back. The best approach would be to create a test shop with its own database so that it can be changed and tested safely.


Note that osC 2.3.x is already in UTF-8, so you must still be back on osC 2.2-something. There are many problems with 2.2 that you need to address immediately (if they have not already been fixed), including a number of security vulnerabilities and compatibility with PHP 5.3 and up. Unless your 2.2 site is heavily modified and the pain of converting to 2.3.3 would be unbearable, I would strongly suggest looking into upgrading via a fresh 2.3.3 install with importation and upgrade of the database. Don't use the 2.2-to-2.3 conversion upgrade -- you'll end up with a "Frankenstore" that is more trouble than it's worth.

 

BTW, there's no such thing as "English characters". English uses the Latin alphabet as encoded in ASCII (a character set), with none of the accented characters found in other Latin-alphabet languages. Latin-1 (ISO-8859-1) is a common superset of ASCII, adding the accented characters found in Western European languages. There are other Latin-x character sets concentrating on other regions for their accented characters, and related sets for non-Latin alphabets such as Cyrillic, Greek, and Hebrew. UTF-8 is a different superset of ASCII that uses a variable-length multibyte encoding to cover all of Unicode, which is more than a hundred thousand characters. Finally, there are a number of other encodings around, many of which are a superset of ASCII, while others have nothing to do with ASCII.
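
To make the difference between these character sets concrete, here is a minimal Python sketch (Python is used only for illustration; osCommerce itself is PHP). The place name is an arbitrary example, not data from this thread.

```python
# Encode one Swedish place name under the character sets discussed above.
text = "Malmö"

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # the accented letter becomes two bytes

print(latin1_bytes)  # b'Malm\xf6'
print(utf8_bytes)    # b'Malm\xc3\xb6'

# ASCII has no accented letters at all, so a strict encode fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Pure-ASCII text is byte-for-byte identical in all three encodings.
plain = "Malmo"
same_everywhere = (
    plain.encode("ascii") == plain.encode("latin-1") == plain.encode("utf-8")
)
```

The same letter is therefore one byte in Latin-1 and two bytes in UTF-8, which is why a receiver has to know which encoding it was sent.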

 

Add: note that if the original Swedish text was not properly encoded as Latin-1, converting your database to UTF-8 will probably fail for those characters. Be sure to check all customers and data with known accented characters (non-English text). As osC 2.2 was Latin-1 by default, and MySQL defaults to Latin-1 (and Swedish collation, to boot!), it ought to convert to UTF-8 cleanly unless you had changed something. MySQL has functions to convert the database from Latin-1 to UTF-8, but you need to be certain that the original (Latin-1) data was correct before attempting a conversion. You definitely want to practice all this on a copy of the site and the database so that you don't mess up your live store.
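
As a rough illustration of why the source data has to be genuine Latin-1, the sketch below re-encodes two byte strings in Python. It only imitates the idea of a Latin-1 to UTF-8 conversion and says nothing about how MySQL performs it; the sample value and the helper name `latin1_to_utf8` are made up for the example.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes that are ASSUMED to be Latin-1 as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")

# Genuine Latin-1 bytes, as an old Latin-1 shop would have stored them.
good = "Västerås".encode("latin-1")

# UTF-8 bytes that ended up in a column still treated as Latin-1.
bad = "Västerås".encode("utf-8")

print(latin1_to_utf8(good).decode("utf-8"))  # Västerås
print(latin1_to_utf8(bad).decode("utf-8"))   # VÃ¤sterÃ¥s (double-encoded)
```

Note that nothing raises an error in the second case: every byte is a valid Latin-1 character, so the bad data is silently converted into garbage. That is why checking known accented records by eye after a trial conversion is worthwhile.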

 

And, if you cut and pasted text (such as product descriptions) in from Word or Outlook, I promise that you will have "Smart Quotes" in there which will not convert properly to UTF-8 (you'll have to fix them manually). Some browsers will work with Smart Quotes because they actually use the Windows-1252 character encoding rather than the requested Latin-1 (Windows-1252 is Latin-1 with the upper control codes, bytes 0x80 to 0x9F, replaced by printable characters including the Smart Quotes), but others will interpret them as control codes and chop off the text at the Smart Quotes point. Most browsers seem to work with Smart Quotes when in UTF-8 mode, so long as Word or Outlook offers a UTF-8 version to the clipboard.
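
The Smart Quotes problem can be reproduced in a few lines. This is a sketch in Python with an invented sample string; the replacement table `SMART_TO_ASCII` and the helper `plain_quotes` are hypothetical names, and the table covers only a handful of common characters.

```python
# Bytes that Windows-1252 uses for curly double quotes around a word.
raw = b"\x93Hello\x94"

as_cp1252 = raw.decode("cp1252")   # real curly quotes, U+201C and U+201D
as_latin1 = raw.decode("latin-1")  # C1 control codes U+0093 and U+0094

# Hypothetical clean-up table: map typographic characters to plain ASCII.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # single curly quotes
    "\u201c": '"', "\u201d": '"',   # double curly quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u2026": "...",                # ellipsis
})

def plain_quotes(text: str) -> str:
    """Replace common Word-style typographic characters with ASCII."""
    return text.translate(SMART_TO_ASCII)

print(plain_quotes(as_cp1252))  # "Hello"
```

Decoding such text as Windows-1252 first and only then converting to UTF-8 (or to plain quotes) is one way to avoid ending up with stray control codes.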


Thanks,

 

I am running 2.3.3, but it was an upgrade from 2.2. While the shop itself was built on 2.3.3, the database was migrated.

 

Sounds like a major task (and potentially a dangerous one too). I think, for now, I will add a nice red warning asking users to use only English characters. Given that this incident is a very rare one, I think it will do for now.

We are at the beginning of our busiest season, so I do not want to make any drastic changes at this time of the year.

 

Thanks again,

Rudolf
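
For the stop-gap described in the post above, the check itself is simple. The following is only a sketch in Python of the idea (the shop would need the equivalent in PHP), and both helper names are invented for illustration.

```python
import unicodedata

def is_ascii_only(value: str) -> bool:
    """True if the value could be sent to the gateway without any accents."""
    return value.isascii()

def to_ascii_fallback(value: str) -> str:
    """Strip accents by decomposing characters and dropping what is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", value)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(is_ascii_only("Göteborg"))      # False
print(to_ascii_fallback("Göteborg"))  # Goteborg
```

The fallback is lossy: letters with no decomposition, such as "Ø", are dropped entirely rather than replaced, so it may be safer to show the warning than to rewrite a customer's address silently.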


It sounds like, then, the database was not properly converted to UTF-8 during the migration to 2.3.3 (assuming you didn't put 2.3.3 back to Latin-1, in which case you're on your own). If you have been running for a while in 2.3.3, there is the risk that you've got other bits of bad data (non-ASCII accented characters, etc.) beyond this one customer. Keep in mind that you can always create a copy of your shop (code and database) off to the side (and password protected so no one can get in) and play with the database to see how to fix it. It's also a good opportunity to practice backing up and restoring both files and database, if you haven't done so. And you can always keep a backup snapshot of your site, if you fear that something may go wrong with the fix on the production system.

 

Was this Swedish customer's data entered when the shop was osC 2.2 (Latin-1), or was it entered in 2.3.3 (UTF-8)? You've got to figure out what encoding your database is in (and it must be consistently Latin-1 or UTF-8 -- no mixture), what your language support files are encoded in (for English, they would be ASCII and thus the same in Latin-1 or UTF-8, except for any setlocale or charset settings, and whether you use a £ sign), and what the pages are displayed in (usually set by the main language file). If at any point these things were not consistent (all Latin-1/ISO-8859-1 or all UTF-8), you may have a mixture of encoded data that will have to be manually fixed to be consistent, before anything else. An English-only shop can run with Latin-1, and can handle most of Europe, but when you interface to payment systems that demand UTF-8 you're going to have trouble*. osC 2.3.3 can be set back to all Latin-1 (it has been done), so long as you clean up any accented characters that are in the wrong encoding, but it's usually easier in the long run to stay UTF-8.
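
One way to get a first impression of whether stored data is a mixture is to classify the raw bytes of each value. The sketch below is in Python and works on a made-up list of byte strings; fetching the real raw column values from the database is left out, and `classify` is a hypothetical helper.

```python
from collections import Counter

def classify(raw: bytes) -> str:
    """Rough guess at how a stored value is encoded."""
    if raw.isascii():
        return "ascii"        # identical in Latin-1 and UTF-8, nothing to fix
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not-utf-8"    # probably Latin-1 or Windows-1252
    return "utf-8"

# Hypothetical sample of raw values from an address column.
samples = [
    b"Main Street 1",
    "Storgatan 5, Malmö".encode("latin-1"),
    "Storgatan 5, Malmö".encode("utf-8"),
]

report = Counter(classify(value) for value in samples)
print(report)
```

If the report shows both "utf-8" and "not-utf-8" values in the same column, the data is mixed and a blanket conversion would damage one group or the other. The check is a heuristic only: a short Latin-1 string can occasionally happen to be valid UTF-8.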

 

* Perhaps you sent off customer name/address information that was actually Latin-1 accented characters, but osC 2.3.3 claimed by default it was UTF-8 (and thus invalid)? IOW, if your pages were all Latin-1 and the system told eWay it was Latin-1 data, might it have worked?
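
The footnote's scenario can be tried out at the XML level. This Python sketch uses invented element names (`payment`, `customerAddress`, `a`) that are not eWay's actual request format, and it shows only what a standard XML parser does, not what eWay accepts. It assumes Python 3.8 or later for the `xml_declaration` argument.

```python
import xml.etree.ElementTree as ET

def build_request(street: str) -> bytes:
    """Serialize a (hypothetical) request as UTF-8 with a matching declaration."""
    root = ET.Element("payment")
    ET.SubElement(root, "customerAddress").text = street
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)

request = build_request("Storgatan 5, Malmö")

# Latin-1 bytes placed in a document that claims to be UTF-8.
stored = "Storgatan 5, Malmö".encode("latin-1")
mislabeled = b"<?xml version='1.0' encoding='utf-8'?><a>" + stored + b"</a>"
try:
    ET.fromstring(mislabeled)
    parsed_mislabeled = True
except ET.ParseError:
    parsed_mislabeled = False   # the lone 0xF6 byte is not valid UTF-8

# The same bytes with an honest declaration parse without trouble.
relabeled = b"<?xml version='1.0' encoding='iso-8859-1'?><a>" + stored + b"</a>"
relabeled_text = ET.fromstring(relabeled).text
```

So a parser rejects Latin-1 bytes labeled as UTF-8 but reads them correctly when the declaration matches. Whether a gateway that asks for UTF-8 would tolerate a Latin-1 declaration is a separate question that only the gateway can answer.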

 

"to spit the dummy"

I presume that's a colorful Aussie phrase about rejecting something? I have a sister-in-law from Oz, and some of her slang and terminology throw me for a loop at times.
