Thai word nam (water) incorrectly rendered in Qt
(SuSE 8.2, 9.0) Qt (3.1 and 3.2)

The Thai word "nam" (water) is rendered incorrectly.

The rule for typing Thai characters that occupy one character cell is: consonant, vowel, tone mark.

The problem occurs with the vowel sara-am. Sara-am consists of two glyphs, nikhahit and sara-aa: nikhahit is written above the preceding character, and sara-aa after it. With a tone mark, the typing sequence here is: consonant, tone mark, sara-am.

In the following examples, the left side shows the sequence in which the characters were typed. ("อ" is only used as a placeholder and was not typed.) (Depending on the font used, the shifted tone mark may not be shown correctly.)

With 3 key-strokes (correct): (no-nu, mai-tho, sara-am)
น อ้ อำ --> น้ำ
Rendered incorrectly: mai-tho should be above the nikhahit part of sara-am. When sara-am is entered, the preceding mai-tho should be shifted to the top level.

With 4 key-strokes (workaround): (no-nu, nikhahit, mai-tho, sara-aa)
น อํ อ้ อา --> นํ้า (looks fine)

Both should look the same, but the preferred way to type this word is with 3 key-strokes.

Any suggestions?
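To make the two typing sequences concrete, here is a small Python sketch that lists the Unicode code points behind each one. It assumes the text is stored as Unicode (not a Thai 8-bit code page); the names `THREE_KEYS`, `FOUR_KEYS` and `describe` are only illustrative.

```python
import unicodedata

# The two typing sequences from the message, written as explicit code points.
# Assumption: the text is Unicode, not a Thai 8-bit encoding.
THREE_KEYS = "\u0e19\u0e49\u0e33"        # no-nu, mai-tho, sara-am
FOUR_KEYS = "\u0e19\u0e4d\u0e49\u0e32"   # no-nu, nikhahit, mai-tho, sara-aa


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, seq in (("3 key-strokes", THREE_KEYS), ("4 key-strokes", FOUR_KEYS)):
    print(label)
    for code_point, name in describe(seq):
        print(f"  {code_point}  {name}")
```

The two strings are different code point sequences, so a renderer has to treat them as equivalent explicitly if both are to look the same.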
Walter

On the MS Windows + OE system where I'm reading your message, the results of both sequences in your examples look identical: the combining mark is centred over the right-hand vertical stroke of the first glyph (which is modified). Since I don't know Thai, I can't tell whether or not this is correct.

If you are using an OpenType font, there could be a problem with the lookups in the particular font or with the order in which the lookups are applied.

There could also be a problem due to the order in which the characters are stored. This may depend on whether you are using a Thai "code page" or Unicode as your character set (and sometimes on whether or not the text has been "normalised"). Unicode always expects combining characters to be stored *after* base characters, while I understand that some 8-bit Thai encodings expect certain vowel characters *before* the consonant to which they apply.

Of course, font lookups usually assume a particular character order. In the case of OpenType fonts, I'd expect them to be built assuming the order specified in the Unicode and ISO 10646 standards.

If the vowel mark and tone mark characters have different "canonical combining classes" specified in the Unicode standard, they could also be getting reordered in any process of "normalisation" that may be going on. Typing an extra character may be overriding or preventing this.

- Chris

--
CJ Fynn

----- Original Message -----
From: "Walter Betschart" <wbpub@bluewin.ch>
To: <m17n@suse.com>
Sent: Tuesday, January 27, 2004 11:29 PM
Subject: [m17n] Thai word nam (water) incorrect rendered in QT

(SuSE 8.2, 9.0) QT (3.1 and 3.2)

The Thai word "nam" (water) is rendered incorrectly.

The rule for typing Thai characters that occupy one character cell is: consonant, vowel, tone mark.

The problem occurs with the vowel sara-am. Sara-am consists of two glyphs, nikhahit and sara-aa: nikhahit is written above the preceding character, and sara-aa after it.
With a tone mark, the typing sequence here is: consonant, tone mark, sara-am.

In the following examples, the left side shows the sequence in which the characters were typed. ("อ" is only used as a placeholder and was not typed.) (Depending on the font used, the shifted tone mark may not be shown correctly.)

With 3 key-strokes (correct): (no-nu, mai-tho, sara-am)
น อ้ อำ --> น้ำ
Rendered incorrectly: mai-tho should be above the nikhahit part of sara-am. When sara-am is entered, the preceding mai-tho should be shifted to the top level.

With 4 key-strokes (workaround): (no-nu, nikhahit, mai-tho, sara-aa)
น อํ อ้ อา --> นํ้า (looks fine)

Both should look the same, but the preferred way to type this word is with 3 key-strokes.

Any suggestions?
Hi Chris

On Wednesday 28 January 2004 01:27, C J Fynn wrote:
> Walter
>
> On the MS Windows + OE system where I'm reading your message, the results of both sequences in your examples look identical: the combining mark is centred over the right-hand vertical stroke of the first glyph (which is modified). Since I don't know Thai, I can't tell whether or not this is correct.

I attach a JPEG to show how it looks in kwrite, and a plain file that can be opened with other applications. The word should look the same with 3 and 4 key-strokes, but it is usually typed with 3 key-strokes. It is rendered correctly in Windows and OpenOffice. It is not rendered correctly in KDE.

> If you are using an OpenType font, there could be a problem with the lookups in the particular font or with the order in which the lookups are applied.

It doesn't matter which font I use: Norasi, Arial Unicode, Angsana New, Phaisarn, Garuda ...

> There could also be a problem due to the order in which the characters are stored. This may depend on whether you are using a Thai "code page" or Unicode as your character set (and sometimes on whether or not the text has been "normalised"). Unicode always expects combining characters to be stored *after* base characters, while I understand that some 8-bit Thai encodings expect certain vowel characters *before* the consonant to which they apply.
>
> Of course, font lookups usually assume a particular character order. In the case of OpenType fonts, I'd expect them to be built assuming the order specified in the Unicode and ISO 10646 standards.
>
> If the vowel mark and tone mark characters have different "canonical combining classes" specified in the Unicode standard, they could also be getting reordered in any process of "normalisation" that may be going on. Typing an extra character may be overriding or preventing this.
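The normalisation question can be checked directly against the Unicode character data. The sketch below uses Python's standard `unicodedata` module and assumes the 3 key-stroke sequence is stored as U+0E19 U+0E49 U+0E33; the helper `combining_classes` is only illustrative. With the Unicode data shipped with Python, the canonical forms (NFC, NFD) leave the sequence unchanged, while the compatibility forms split sara-am into nikhahit + sara-aa but keep the tone mark in front of nikhahit, which is not the 4 key-stroke order.

```python
import unicodedata

NO_NU, MAI_THO, SARA_AM = "\u0e19", "\u0e49", "\u0e33"
NIKHAHIT, SARA_AA = "\u0e4d", "\u0e32"


def combining_classes(text):
    """Return the canonical combining class of every character in text."""
    return [unicodedata.combining(ch) for ch in text]


three_keys = NO_NU + MAI_THO + SARA_AM

# Combining classes of no-nu, mai-tho, sara-am.
print(combining_classes(three_keys))

# Sara-am only has a compatibility decomposition (nikhahit + sara-aa).
print(unicodedata.decomposition(SARA_AM))

# Show what each normalisation form does to the 3 key-stroke sequence.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    result = unicodedata.normalize(form, three_keys)
    print(form, [f"U+{ord(ch):04X}" for ch in result])
```

If this holds, normalisation alone would not turn the 3 key-stroke sequence into the 4 key-stroke one; the reordering would have to happen in the input method or the text shaper.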
There are 4 levels in Thai writing. I call them: base level, above level, top level and below level.

The first consonant, no-nu, is written in the base level. A vowel that is written above the consonant should be in the above level. In the word "nam" this is the nikhahit part of the combined vowel sara-am.

If there is no vowel above the consonant, the tone mark is written in the above level. If there is a vowel in the above level, the tone mark should be written in the top level. A typewriter usually puts tone marks always in the top level. The same approach is taken by OpenOffice.

More information about this topic:
http://www.inet.co.th/cyberclub/trin/thairef/#ThaiEncodings
http://www.fedu.uec.ac.jp/ZzzThai/thailang/#type
http://www.nectec.or.th/it-standards/thaistd.pdf
http://www.nectec.or.th/it-standards/thaistd_tr.pdf
http://www.inet.co.th/cyberclub/trin/thairef/wtt2/char-class.pdf
http://www.unicode.org/charts/PDF/U0E00.pdf

Cheers
Walter
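The level rule above, together with the expected handling of sara-am, can be sketched in Python. This is only an illustration of the behaviour described in this thread, not the code of Qt, KDE or OpenOffice. The function names `split_sara_am` and `tone_level` are hypothetical, and the set of above-level vowels is a simplified assumption.

```python
SARA_AM = "\u0e33"
NIKHAHIT = "\u0e4d"
SARA_AA = "\u0e32"

# mai-ek, mai-tho, mai-tri, mai-chattawa
TONE_MARKS = {"\u0e48", "\u0e49", "\u0e4a", "\u0e4b"}

# Simplified assumption: marks that occupy the above level.
# mai han-akat, sara-i, sara-ii, sara-ue, sara-uee, nikhahit
ABOVE_VOWELS = {"\u0e31", "\u0e34", "\u0e35", "\u0e36", "\u0e37", NIKHAHIT}


def split_sara_am(text):
    """Rewrite [tone] + sara-am as nikhahit + [tone] + sara-aa."""
    out = []
    for ch in text:
        if ch == SARA_AM:
            # Lift any tone marks typed just before sara-am ...
            tones = []
            while out and out[-1] in TONE_MARKS:
                tones.append(out.pop())
            # ... and put them back after the nikhahit part.
            out.append(NIKHAHIT)
            out.extend(reversed(tones))
            out.append(SARA_AA)
        else:
            out.append(ch)
    return "".join(out)


def tone_level(cluster):
    """Return 'top' if the tone mark follows an above-level vowel,
    'above' if it does not, and None if the cluster has no tone mark."""
    above_seen = False
    for ch in cluster:
        if ch in ABOVE_VOWELS:
            above_seen = True
        elif ch in TONE_MARKS:
            return "top" if above_seen else "above"
    return None


three_keys = "\u0e19\u0e49\u0e33"   # no-nu, mai-tho, sara-am
shaped = split_sara_am(three_keys)  # no-nu, nikhahit, mai-tho, sara-aa
print([f"U+{ord(ch):04X}" for ch in shaped], tone_level(shaped))
```

Under these assumptions the 3 key-stroke input is rewritten to the same sequence as the 4 key-stroke input before the tone-mark level is chosen, so both would be drawn identically.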
On Wednesday 28 January 2004 16:26, Walter Betschart wrote:
> I attach a JPEG to show how it looks in kwrite, and a plain file that can be opened with other applications.

I'll try it with a PNG file this time. I don't know whether I can send attachments like this to the list.

On Wednesday 28 January 2004 17:12, Walter Betschart wrote:
> I'll try it with a PNG file this time. I don't know whether I can send attachments like this to the list.

Now I know: no problem. The following page has an example of how the word "nam" should look. Just search for "water".
http://www.inet.co.th/cyberclub/trin/thairef/#ThaiEncodings