[m17n] SuSE 9.1 ISO-2022-JP, zh_CN.UTF-8?

2 Jul 2004

      Thomas Karsten wrote:
...
Recently I upgraded from SuSE 8.2 to SuSE 9.1. I would like to write
Japanese texts using ISO-2022-JP. To start kinput2 I use the following
script (the canna server is already running):
----- start -----
ENC=eucjp
#ENC=UTF-8
LANG=ja_JP.$ENC XMODIFIERS='@im=kinput2' mlterm 1>/dev/null 2>&1&
LANG=ja_JP.$ENC kinput2 -xim -kinput -canna 1>/dev/null 2>&1&
----- end -----
Why do you start kinput2 from the same script as mlterm?

If you start more mlterms using that script, you will have several
instances of kinput2 running. That is not necessary.
*One* instance of kinput2 is enough for your X-session.

It doesn't matter what locale you use to start kinput2.

I suggest putting

            export XMODIFIERS="@im=kinput2"
            kinput2 -xim -kinput -canna -cannaserver unix &

in your ~/.xim file so that a single kinput2 is started
automatically when you start your X session.
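
If ~/.xim could be read more than once in a session, the start can be
guarded so that a second kinput2 is never launched. This is only a
sketch: the helper name start_once is made up for illustration, and it
assumes that pgrep is installed.

```shell
# Hypothetical helper: start a command in the background only if no
# process with that name is already running for the current user.
start_once() {
    # pgrep -x matches the process name exactly; -u limits the search
    # to processes owned by the current user.
    if pgrep -x -u "$(id -u)" "$1" >/dev/null 2>&1; then
        return 0
    fi
    "$@" >/dev/null 2>&1 &
}

export XMODIFIERS="@im=kinput2"
start_once kinput2 -xim -kinput -canna -cannaserver unix
```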

You don't need any LANG=<something> in the line starting
kinput2; kinput2 ignores it anyway.

Then, in your X-session

     LC_CTYPE=ja_JP.UTF-8 mlterm

gives you an mlterm using UTF-8 and

     LC_CTYPE=ja_JP.eucJP mlterm

an mlterm using EUC-JP encoding. Both can talk to kinput2.
The mlterm using EUC-JP encoding can display your old ISO-2022-JP
encoded texts directly. The one using UTF-8 cannot, so for that one
you have to convert your texts.
...
Under SuSE 8.2 it was possible to use ISO-2022-JP and ja_JP.eucjp
together and it worked fine. mlterm correctly displayed Japanese texts
that used the ISO-2022-JP character set.
mlterm running in the ja_JP.eucJP locale can indeed display texts
encoded in ISO-2022-JP. But that is a special feature of mlterm.
Most other terminals don't do that.
...
After upgrading to SuSE 9.1 I used the same script, but I cannot read
the texts any longer. All that I get displayed is like this:
^[$B;d$O^[(B
This is ISO-2022-JP encoding. If you dump such a text as it is in an
mlterm running in UTF-8, the raw escape sequences are what you see.
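
To see what such a dump contains, the bytes can be recreated and passed
through iconv. A small sketch, assuming that your iconv supports
ISO-2022-JP (the glibc one does) and that the ^[ in the dump stands for
the ESC byte:

```shell
# The dump from the mail, with ^[ written as the ESC byte (octal 033).
# ESC $ B switches to JIS X 0208, ESC ( B switches back to ASCII.
decoded=$(printf '\033$B;d$O\033(B' | iconv -f ISO-2022-JP -t UTF-8)
printf '%s\n' "$decoded"
```

In a UTF-8 terminal this should print the two characters 私は.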
...
This is how the characters are stored, but why doesn't mlterm display
the Japanese characters using the configuration as shown above?
If you start mlterm in the ja_JP.eucJP locale, it does display the
ISO-2022-JP texts.
...
When I write some text and save it, then it is not stored in the
ISO-2022-JP format.
That depends on how you write that text, of course. If you use, for
example, Vim in an mlterm running in the ja_JP.UTF-8 locale, the
default encoding used for new files is UTF-8.
...
When I use UTF-8 in my script, everything works fine. Then I can read
and write Japanese in the UTF-8 format. But how do I configure my
system to use ISO-2022-JP?
I suggest converting your old files to UTF-8. See:

    http://www.suse.de/~mfabian/suse-cjk/encodings.html

If you want to keep your old files unchanged and just display
them in an mlterm running in the ja_JP.UTF-8 locale, you can use

    lv -Ij -Ou8 file

or

    iconv -f ISO-2022-JP -t UTF-8 < file | less
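
If you would rather convert the files for good, something like the
following could be used. It is only a sketch: the helper name to_utf8
and the .utf8 suffix are made up for illustration, and it assumes that
the input really is ISO-2022-JP. The original file is left untouched.

```shell
# Hypothetical helper: convert one ISO-2022-JP file to UTF-8,
# writing the result next to the original as <file>.utf8.
to_utf8() {
    src=$1
    dst=$src.utf8
    # Refuse to overwrite an existing converted copy.
    if [ -e "$dst" ]; then
        echo "skipping $src: $dst already exists" >&2
        return 1
    fi
    # If iconv fails, remove the partial output and report the failure.
    iconv -f ISO-2022-JP -t UTF-8 < "$src" > "$dst" || {
        rm -f "$dst"
        return 1
    }
}
```

For a directory of old texts it could be called in a loop, for example

    for f in *.txt; do to_utf8 "$f"; done

where *.txt is just a guess at how your files are named.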
...
My second question is: Is there a way to use xcin with the LOCALE set
to zh_CN.UTF-8 (read and write simplified Chinese, using UTF-8)? When
I set LOCALE to this value then xcin always exits with the error
message, that there is no such encoding available. I experienced this
problem under both SuSE 8.2 and SuSE 9.1.
    LANG=zh_CN LC_ALL=zh_CN xcin &
    export XMODIFIERS="@im=xcin-zh_CN"

    LC_CTYPE=zh_CN.UTF-8 mlterm &

Contrary to kinput2, xcin *does* care about the locale it is started
in. You cannot start xcin in zh_CN.UTF-8. But you can start the
clients that use xcin in the zh_CN.UTF-8 locale.

By the way, why don't you try SCIM? I believe it is much better
especially for simplified Chinese.

And SuSE 9.1 even includes the intelligent PinYin module of SCIM.  The
intelligent PinYin module used to be closed source, but the author,
Zhe SU, was kind enough to change the license to GPL recently to get
it included in SuSE Linux.

-- 
Mike FABIAN      http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.