Mailinglist Archive: opensuse-m17n (20 mails)
Re: [m17n] [Resend: Bad From: addr] Chinese pinyin phonetics input :
- From: Ulrich Ruess <utde@xxxxxxxxxxxxxx>
- Date: Wed, 10 Dec 2003 01:32:41 +0000 (UTC)
- Message-id: <200312100930.14138.utde@xxxxxxxxxxxxxx>
On Wednesday, December 10, 2003 2:12, Mike FABIAN wrote:
> Philip Amadeo Saeli <psaeli@xxxxxxxxxxxx> wrote:
> > * Mike FABIAN <mfabian@xxxxxxx> [031208 15:45]:
> >> Philip Amadeo Saeli <psaeli@xxxxxxxxxxxx> wrote:
> >> > I am having a little problem with the Chinese character input. The
> >> > problem is including the tone marks required by the pinyin
> >> > transcription. I've tried weird Latin vowels, but have not been able
> >> > to find the complete set needed. I've been using "Insert -> Special
> >> > Character" in OpenOffice.
> >>
> >> Maybe you are using an unsuitable font? You can choose a
> >> font in this "Insert -> Special Character" dialog in OpenOffice.
> >>
> >> Which characters do you need for PinYin?
> >
> > Specifically, I needed the vowels [aeiou] with the first and third tone
> > marks above them, which are basically a dash and an upside-down caret
> > respectively. The vowels with second and fourth tone marks could be
> > represented by existing latin-1 chars.
>
> I.e. you need:
>
> U+01CD: LATIN CAPITAL LETTER A WITH CARON
> U+01CE: LATIN SMALL LETTER A WITH CARON
> U+01CF: LATIN CAPITAL LETTER I WITH CARON
> U+01D0: LATIN SMALL LETTER I WITH CARON
> U+01D1: LATIN CAPITAL LETTER O WITH CARON
> U+01D2: LATIN SMALL LETTER O WITH CARON
> U+01D3: LATIN CAPITAL LETTER U WITH CARON
> U+01D4: LATIN SMALL LETTER U WITH CARON
> U+011A: LATIN CAPITAL LETTER E WITH CARON
> U+011B: LATIN SMALL LETTER E WITH CARON
>
> U+0100: LATIN CAPITAL LETTER A WITH MACRON
> U+0101: LATIN SMALL LETTER A WITH MACRON
> U+012A: LATIN CAPITAL LETTER I WITH MACRON
> U+012B: LATIN SMALL LETTER I WITH MACRON
> U+014C: LATIN CAPITAL LETTER O WITH MACRON
> U+014D: LATIN SMALL LETTER O WITH MACRON
> U+016A: LATIN CAPITAL LETTER U WITH MACRON
> U+016B: LATIN SMALL LETTER U WITH MACRON
> U+0112: LATIN CAPITAL LETTER E WITH MACRON
> U+0113: LATIN SMALL LETTER E WITH MACRON
>
> is that right?
If you want to do it the right way, you also need the 'ü' (the u with a
diaeresis). I know most people do not use it, since in most cases it is
obvious whether the sound is "u" or "ü" (in "xu" it can only be "ü", and in
"shu" it can only be "u"; only after "l" and "n", as in "lu" and "nu", can it
be either), but it is part of the character set needed for proper pinyin.
Since English keyboard layouts are common in the PRC, "ü" is usually mapped
to another, unused letter (the "Chinese Spring system" used "v", since "v"
is not used in pinyin; XIM uses "uu").
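For reference, the characters discussed so far can be listed by their Unicode names. The following is only a sketch in Python using the standard unicodedata module; the function name pinyin_tone_letters is made up for this illustration and is not part of SCIM, XIM, or OpenOffice.

```python
import unicodedata

VOWELS = "AEIOU"

def pinyin_tone_letters():
    """Return the macron and caron vowels from the list above, plus u-diaeresis.

    Hypothetical helper for illustration only.
    """
    chars = []
    for mark in ("MACRON", "CARON"):
        for vowel in VOWELS:
            for case in ("CAPITAL", "SMALL"):
                name = f"LATIN {case} LETTER {vowel} WITH {mark}"
                chars.append(unicodedata.lookup(name))
    # The extra letter needed for proper pinyin.
    chars.append(unicodedata.lookup("LATIN CAPITAL LETTER U WITH DIAERESIS"))
    chars.append(unicodedata.lookup("LATIN SMALL LETTER U WITH DIAERESIS"))
    return chars

# Print code point, glyph, and official name, one per line.
for ch in pinyin_tone_letters():
    print(f"U+{ord(ch):04X} {ch} {unicodedata.name(ch)}")
```

Pasting the printed glyphs into a document is a quick way to see which of them a given font actually covers.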
>
> I think it is quite easy to add a new input method to SCIM which
> enables you to input these characters. That would probably be the most
> efficient way for you to input these characters as you are using SCIM
> anyway. I have to update the SuSE SCIM packages anyway, I'll have a
> look whether I can add something like that.
Before you do this, please consider that for pinyin input you do not need
these characters, since the tone is already mapped to keys on
the keyboard (mostly the keys 1 to 5). The result of pinyin input on the
keyboard is a Chinese character, not its pinyin representation. If you
want the pinyin (with the tone marks) to appear on the screen, you use your
normal alphabet input. These accented characters are only for typesetting
pinyin in the Latin alphabet.
Tone input for pinyin is becoming less and less important as the quality of
sentence-analysis programs improves.
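If the goal is typeset pinyin rather than Chinese characters, the numeric tone convention (keys 1 to 5, with "v" standing in for "ü") can be turned into accented letters mechanically. The sketch below is an assumption-laden illustration in Python, not how SCIM or any other input method does it: the names mark_syllable and mark_text are hypothetical, the placement rule is the common textbook one (mark "a" or "e" if present, else the "o" of "ou", else the last vowel), and tone 5 is treated as the unmarked neutral tone.

```python
import unicodedata

# Combining marks for tones 1-4; tone 5 (neutral) carries no mark.
TONE_MARKS = {"1": "\u0304", "2": "\u0301", "3": "\u030c", "4": "\u0300"}

def mark_syllable(syllable):
    """Convert one numbered syllable such as 'lv4' to tone-marked pinyin."""
    base = syllable.rstrip("12345")
    digit = syllable[len(base):]
    # 'v' is the usual keyboard stand-in for u-diaeresis.
    base = base.replace("v", "\u00fc").replace("V", "\u00dc")
    if digit not in TONE_MARKS:
        return base  # neutral tone or no tone number given
    lowered = base.lower()
    for target in ("a", "e"):
        pos = lowered.find(target)
        if pos != -1:
            break
    else:
        pos = lowered.find("ou")  # in "ou" the mark goes on the "o"
        if pos == -1:
            # Otherwise mark the last vowel of the syllable.
            pos = max(lowered.rfind(v) for v in "iou\u00fc")
    if pos == -1:
        return base  # no vowel found, leave unchanged
    marked = base[:pos + 1] + TONE_MARKS[digit] + base[pos + 1:]
    # NFC composes letter + combining mark into a single code point.
    return unicodedata.normalize("NFC", marked)

def mark_text(text):
    """Apply mark_syllable to each whitespace-separated syllable."""
    return " ".join(mark_syllable(s) for s in text.split())

print(mark_text("ni3 hao3"))
print(mark_text("lv4 se4"))
```

Composing with NFC means the output uses the precomposed characters from the list above wherever they exist, so font coverage is the same question as before.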
>
> >> I guess "FreeSans" or "Luxi Sans" have all you need.
> >
> > Don't know about "FreeSans", but "Luxi Sans" did -not- have all the
> > needed chars.
>
> I just checked, you are right, "Luxi Sans" lacks a few glyphs for the
> above characters but "FreeSans" has them all. "FreeSans" is in the
> "freefont" rpm-package on SuSE Linux 9.0.
>
> > I finally found several fonts (in addition to the Arphic GB TTF fonts)
> > that included the needed chars, which solved part of my problem. The
> > Arphic font I was using ("AR PL KaitiM GB", in OpenOffice) had the
> > needed chars, but they were double wide and hence unsuitable, the rest
> > of the base Latin chars being standard width.
> >
> > For anyone who is interested, the fonts which included the needed chars
> > were (names from the OpenOffice font selection menu):
> >
> > "AR PL KaitiM GB" (full char cell width)
> > "AR PL SungtiL GB" (full char cell width)
> > "Caslon"
> > "Caslon RomanSmallcaps"
> > "Courier New"
> > "Gentium"
> > "Gentium Alt"
> > "New Century Schoolbook"
> > "Times New Roman"
>
> "Courier New", "Times New Roman" are commercial fonts (from the
> Microsoft "Webfonts")
>
> "New Century Schoolbook" is one of the classic X11 Bitmap fonts, you
> probably don't want to use a bitmap font in OpenOffice.
>
> "Gentium" is not distributable without asking the author. I wrote an
> e-mail to the author asking whether it is OK to distribute "Gentium"
> with SuSE Linux but never received a reply; therefore this font isn't
> included with SuSE Linux.
>
> "Caslon" is free and included with SuSE Linux.
>
> But "FreeSans" might be better; a serif version ("FreeSerif") and a
> monospaced version ("FreeMono") are available as well, all of them in
> regular, bold, oblique, and bold-oblique. These fonts are apparently
> actively developed; if some characters are missing, I suggest asking
> the author to add them. The home page of the freefont project is:
>
> http://savannah.gnu.org/projects/freefont/
>
> >> > How can I get the compose key to work together with a Chinese input
> >> > method?
> >>
> >> IIIMF is supposed to solve that problem in the long run.
> >
> > What is IIIMF?
>
> The designated successor of XIM (X Input Method).
>
> XIM has quite a few design limitations. One of them is that you cannot
> easily switch input methods on the fly: usually you have to decide
> which input method to use before starting an application and cannot
> change it later. This is the reason why you cannot use compose
> together with SCIM in OpenOffice.
>
> IIIMF (Internet Intranet Input Method Framework) is a redesign which
> supposedly does not have many of the limitations of XIM. Being able
> to switch between different input methods at will is one of the
> design features of IIIMF. IIIMF is not yet included in SuSE Linux but
> may be included in the future.
>
> --
> Mike FABIAN <mfabian@xxxxxxx> http://www.suse.de/~mfabian
> 睡眠不足はいい仕事の敵だ。