By Trevor Baca, VP Software Engineering
1. Writing is older than we thought. Google it and you'll find that Egyptian hieroglyphs are some 5000 years old, while ancient Chinese dates back to 1500 BCE or even 4500 BCE, depending on the source. But even the earliest of these dates may not be early enough. Researchers reporting to China's Xinhua news agency in May of last year documented thousands of characters on cliffs in the northwest part of the country dating back 8000 years. A summary appears on BBC News online.
2. Unicode reserves plenty of space for dead symbols. Need classical Chinese characters that fell out of use centuries ago? They're in there. How about Sumerian cuneiform or the Phoenician alphabet? Check. Even Mayan hieroglyphs have been proposed for encoding. This seems crazy but there's nothing else to do -- writing is thoroughly historical and arbitrary, so information systems developed to model writing will always have to take historical fallout into account. Today's characters aren't guaranteed to last. And symbols from centuries past sometimes resurface.
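To see for yourself that historic scripts have code points, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` and the choice of sample characters are illustrative assumptions, not anything from a particular project; the names printed depend on the Unicode version bundled with your Python build.

```python
import unicodedata

# Sample characters from scripts no longer in everyday use, written as
# escapes so the example does not depend on font support in your editor.
# The selection is illustrative only.
HISTORIC_SAMPLES = [
    "\U00010900",  # a letter from the Phoenician block
    "\U00012000",  # a sign from the Cuneiform block
    "\U00010000",  # a syllable from the Linear B block
]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for one character, or a placeholder name
    if this Python build's Unicode database does not know it."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


for ch in HISTORIC_SAMPLES:
    print(describe(ch))
```

Note that all three samples live outside the 16-bit range, so code that assumes one character fits in two bytes will mishandle them.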
3. The Roman letters used in writing English and other western languages look different on the page from Greek and Cyrillic. But all three share a common ancestor in the Phoenician alphabet of 3000 years ago. The real kicker? Written Hebrew and Arabic are related in the same way. The Phoenicians spoke a Semitic language related to modern Hebrew and Arabic, and the script they exported on ships across the Mediterranean still shows this relationship today. The three dancing dots floating above Arabic shin are precisely the three points of the trident-shaped character in Hebrew and Cyrillic. All three represent the sh sound, for which we have no single letter in our Roman alphabet.
4. Internationalizing your apps for Chinese, Japanese and Korean is definitely a pain. But even alphabetic languages can trip you up. Remember that "IJ" in Dutch, "ch" in Czech and "ll" in traditional Spanish all count as individual letters -- or digraphs -- that alphabetize differently from what your String.sort() method might like. The Czech word "chléb" (bread) comes after both "holub" (pigeon) and "hora" (mountain) in the dictionary, because "ch" sorts after "h". Oh, and if you're a graphical interface designer, don't forget that Hebrew and Arabic will run right-to-left in your apps, even in your drop-downs, text boxes and radio buttons.
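The digraph problem can be sketched in a few lines of Python. This is a deliberately simplified illustration, not real Czech collation: the names `CZECH_ORDER`, `fold` and `czech_key` are made up for the example, and stripping diacritics is an assumption that is wrong for letters such as "č" or "ř", which Czech treats as letters in their own right. For production work, a full collation library such as ICU is the safer choice.

```python
import unicodedata

# Simplified alphabet: "ch" is a single letter that follows "h".
CZECH_ORDER = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
               "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
               "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(CZECH_ORDER)}


def fold(word: str) -> str:
    """Lowercase and strip combining accents (a simplification, see above)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def czech_key(word: str) -> tuple:
    """Map a word to a tuple of letter ranks, treating 'ch' as one letter."""
    text = fold(word)
    key = []
    i = 0
    while i < len(text):
        if text.startswith("ch", i):
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the table sort after the known alphabet.
            key.append(RANK.get(text[i], len(CZECH_ORDER)))
            i += 1
    return tuple(key)


words = ["hora", "chléb", "cesta", "jablko"]
print(sorted(words))                 # ['cesta', 'chléb', 'hora', 'jablko']
print(sorted(words, key=czech_key))  # ['cesta', 'hora', 'chléb', 'jablko']
```

The default sort compares code points, so "chléb" lands among the c-words; the digraph-aware key moves it to its dictionary position after "hora".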
5. The internet may not be all Roman forever. The Guardian reported last month that Russian officials may at some point push for the creation of a Cyrillic internet. Similar reports for Chinese circulate from time to time. Whether driven by politics or culture, the result for developers can only be more work in dealing with the intricacies of the written word.