Supporting translation in our CMS - Charsets are now all UTF-8

By Bruce Kirkpatrick
Mon, Feb 04, 2013 at 9:55PM

Characters from scripts such as Japanese, Korean and Arabic take more bytes to store in a database and send to the screen than plain English letters do. Behind the scenes, every document is stored in a standardized character encoding, which determines which characters the document can contain and how many bytes each one uses. See more about character encoding on Wikipedia.
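To make the size difference concrete, here is a minimal Python sketch (an illustration only, not code from our CMS; the helper names `utf8_size` and `fits_latin1` are made up for this example). It counts how many bytes a character needs in UTF-8 and checks whether Latin-1 can represent it at all.

```python
def utf8_size(text: str) -> int:
    """Return how many bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_latin1(text: str) -> bool:
    """Return True if every character in the text exists in Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    # One sample character per script: an English letter takes 1 byte in
    # UTF-8, an Arabic letter 2 bytes, and the Japanese and Korean
    # characters 3 bytes each.
    for label, char in [("English", "a"), ("Arabic", "م"),
                        ("Japanese", "語"), ("Korean", "웹")]:
        print(label, char, utf8_size(char), fits_latin1(char))
```

Only the English letter fits in Latin-1, which is why a Latin-1 database cannot hold the translated text at all.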

When adding internationalization features to software, most developers choose UTF-8, the most widely used encoding of the Unicode character set. Our CMS previously stored all its data in the Latin-1 charset, which covers only Western European characters. As of right now, our database and web pages fully support UTF-8.
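For a feel of what a Latin-1 to UTF-8 conversion involves at the byte level, here is a small Python sketch. It is an assumption-laden illustration, not the migration we actually ran (a real conversion is usually done with the database's own tools), and `latin1_to_utf8` is a hypothetical helper name.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes that were stored as Latin-1 into UTF-8."""
    # Latin-1 maps every byte value to exactly one character,
    # so this decode step can never fail.
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    stored = "café".encode("latin-1")      # 4 bytes: the é is the single byte 0xE9
    converted = latin1_to_utf8(stored)     # 5 bytes: the é becomes 0xC3 0xA9
    print(len(stored), len(converted))

    # Reading UTF-8 bytes as if they were Latin-1 produces the familiar
    # garbled text, which is why pages and database must agree on the charset.
    print(converted.decode("latin-1"))
```

The second print shows the kind of garbling you get when the database and the page disagree about the encoding.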

We still need to keep modifying our applications to take advantage of this and to support alternate versions of the text for pages, error messages, forms, etc.
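One common way to support alternate versions of text is a lookup table keyed by message and language, with a fallback to a default language. The Python sketch below is purely hypothetical (the `TRANSLATIONS` table, the `site.languages` key and the `translate` function are invented for illustration and are not how our CMS is built); it reuses the sample sentences from the demo in this post.

```python
# Hypothetical message table: message key -> language code -> text.
TRANSLATIONS = {
    "site.languages": {
        "en": "Our web sites support other languages.",
        "ja": "当社のウェブサイトは、他の言語をサポートしています。",
        "ko": "웹 사이트는 다른 언어를 지원합니다.",
    },
}

DEFAULT_LANGUAGE = "en"


def translate(key: str, language: str) -> str:
    """Return the text for key in the requested language.

    Falls back to the default language, and finally to the key itself,
    so a missing translation never breaks the page.
    """
    versions = TRANSLATIONS.get(key, {})
    return versions.get(language, versions.get(DEFAULT_LANGUAGE, key))


if __name__ == "__main__":
    print(translate("site.languages", "ko"))  # Korean version
    print(translate("site.languages", "fr"))  # no French yet, falls back to English
```

Falling back to the default language means pages can be translated gradually rather than all at once.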

For now, below is a live demo of this blog page (generated by our CMS) showing the same text in different languages. Translations courtesy of Google Translate.

English: 

Our web sites support other languages.

 

Japanese: 

当社のウェブサイトは、他の言語をサポートしています。

 

Korean: 

웹 사이트는 다른 언어를 지원합니다.

 

Arabic: 

مواقع على شبكة الإنترنت دعم لغات أخرى.

 

Chinese: 

我們的網站支持其他語言。

