[opensuse-m17n] scim in 10.3 producing ascii7 mojibake
Hi,

After (finally) installing openSUSE 10.3 on one machine, I found my scim input system impaired. I can still input normally into firefox, but not into xterm (or xterm-based mutt) or emacs. Whenever I try to input something there, I get a string of plain 7-bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".

I haven't researched this much yet and will continue trying to narrow the bug down. Has something like this been reported before?

--
PILCH Hartmut 裴寒牧 ピルヒ・ハルトムート http://a2e.de/phm
--
To unsubscribe, e-mail: opensuse-m17n+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-m17n+help@opensuse.org
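One possible reading of the reported output, not confirmed anywhere in this thread, is that the UTF-8 bytes of the committed text arrive with their high bit stripped. The minimal Python sketch below clears bit 7 of every UTF-8 byte; the function name `strip_high_bit` is made up for illustration. It reproduces the three umlaut examples exactly. For "かな" it yields "c", "*" and the control bytes 0x01 and 0x0B (Ctrl-A and Ctrl-K, the latter being kill-line in Emacs-style line editing), which is at least consistent with the reported "*c" plus line deletion.

```python
def strip_high_bit(text: str) -> bytes:
    """Encode text as UTF-8, then clear bit 7 of every byte,
    as a 7-bit-only channel would (an assumed failure mode)."""
    return bytes(b & 0x7F for b in text.encode("utf-8"))


# Latin-1 range characters: two UTF-8 bytes each.
for ch in ("ä", "ö", "ü"):
    print(ch, "->", strip_high_bit(ch))
# ä -> b'C$'
# ö -> b'C6'
# ü -> b'C<'

# CJK text: three UTF-8 bytes per character, so the stripped result
# mixes printable ASCII with control bytes (values below 0x20).
for word in ("睡眠", "かな"):
    print(word, "->", strip_high_bit(word))
```

If this reading is right, the input method itself produces correct text and the damage happens on the way to the application, which would fit the observation that only some applications are affected.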
> I can still input normally into firefox but not into xterm (or xterm-based mutt) nor into emacs.
> Whenever I try to input something there, I get a string of plain 7bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".
I should add that I'm working in the zh_CN.utf-8 locale, with almost all locale variables set to it, based on the selection made at the beginning of the installation, which I did with YaST2 in this locale this week.

I also notice that the display of characters in zh_CN.utf-8 is incomplete. Some members of the corresponding charset (the Unicode subset that corresponds to gb2312) are represented by boxes. I find it difficult to imagine that 10.3 could have been published for China with such flaws, and in fact I haven't found any Chinese reports about them with Google so far. But there is also nothing unusual in my configuration AFAICS.

--
Hartmut Pilch http://a2e.de/phm
> I can still input normally into firefox but not into xterm (or xterm-based mutt) nor into emacs.
> Whenever I try to input something there, I get a string of plain 7bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".
It is all right now. I can input multiple languages, including zh and ja, into xterm and emacs as well as everything else.

I had to remove the scim-bridge package.

Reports like

    "State in which input works without problems via scim-bridge"
    http://lists.opensuse.org/opensuse-ja/2007-10/msg00046.html

give the impression that some people may not be able to input Japanese without this package. The docs didn't tell me anything about what scim-bridge is good for, and I don't know how this package, which seems to be dangerous for some people and vital for others, found its way into my installation. I don't remember having selected it.

--
Hartmut Pilch http://a2e.de/phm
PILCH Hartmut <phm@a2e.de> wrote:

>> I can still input normally into firefox but not into xterm (or xterm-based mutt) nor into emacs.
>> Whenever I try to input something there, I get a string of plain 7bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".
> It is now alright. I can input multiple languages, including zh and ja, into xterm and emacs as well as everything else.
> I had to remove the scim-bridge package.

Strange, because I usually have the scim-bridge package installed and don’t run into that problem. Could you please check whether you can reproduce the problem after installing scim-bridge again?

> Reports like
> "State in which input works without problems via scim-bridge" http://lists.opensuse.org/opensuse-ja/2007-10/msg00046.html
> give the impression that some people may not be able to input Japanese without this package. The docs didn't tell me anything about what scim-bridge is good for, and I don't know how this package, which seems to be dangerous for some people and vital for others, found its way into my installation -- I don't remember having selected it.
It mainly solves problems with incompatibilities between different versions of libstdc++. See the section “What is this for?” in

    /usr/share/doc/packages/scim-bridge/doc/developer/introduction.html

or, for example, comment #18 in http://bugzilla.novell.com/show_bug.cgi?id=353251

Sometimes it appears to solve input problems in Firefox and OpenOffice as well, especially when 32-bit versions of these applications are used on a 64-bit system.

In the latest updates of the acroread 8.1.2 packages for STABLE/Factory and openSUSE 10.3, we have deleted the libstdc++ which comes with the acroread tarball. acroread then uses the system-wide libstdc++, i.e. the same one that scim uses, and the compatibility problem with scim disappears. I tested that acroread 8.1.2 with its libstdc++ deleted works fine with both the “scim” module and the “scim-bridge” module (GTK_IM_MODULE=scim or GTK_IM_MODULE=scim-bridge).

Only XIM still doesn’t work in acroread in many locales, but that can only be solved upstream by Adobe. See also http://bugzilla.novell.com/show_bug.cgi?id=353251 for the XIM problem in acroread.

--
Mike FABIAN <mfabian@suse.de> http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
I ❤ Unicode
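When comparing a working and a failing setup, it may help to look at the environment variables that select the input-method path. The following is only a rough Python sketch. GTK_IM_MODULE is the variable named above; XMODIFIERS, QT_IM_MODULE and the locale variables are listed on the assumption that the usual X11 and toolkit conventions apply on the system in question, and the function names are illustrative.

```python
import os

# GTK_IM_MODULE comes from the message above; the remaining names are
# assumptions based on common X11/toolkit conventions.
IM_VARIABLES = (
    "LANG", "LC_ALL", "LC_CTYPE",
    "GTK_IM_MODULE", "QT_IM_MODULE", "XMODIFIERS",
)


def im_settings(env):
    """Return the input-method related variables found in env."""
    return {name: env.get(name, "<unset>") for name in IM_VARIABLES}


def describe_gtk_path(env):
    """Say which path GTK applications would take, judged only from GTK_IM_MODULE."""
    module = env.get("GTK_IM_MODULE")
    if module in ("scim", "scim-bridge"):
        return f"GTK applications load the '{module}' input module"
    if module == "xim":
        return "GTK applications go through XIM"
    return "GTK_IM_MODULE is unset or set to something else"


if __name__ == "__main__":
    for name, value in im_settings(os.environ).items():
        print(f"{name}={value}")
    print(describe_gtk_path(os.environ))
```

Running this once in a session where input works and once where it does not gives two small listings to diff.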
>>> Whenever I try to input something there, I get a string of plain 7bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".
>> It is now alright. I can input multiple languages, including zh and ja, into xterm and emacs as well as everything else.
>> I had to remove the scim-bridge package.
> Strange because I usually have the scim-bridge package installed and don’t run into that problem.
> Can you please try if you can reproduce the problem if you install scim-bridge again?
I've been away from this list, busy installing SUSE 10.3 on several machines, mirroring the m17n repository, and building my own rpm-md repository and quite a few rpm packages. Meanwhile I wasn't able to investigate or to read the m17n mailing list at all, but this will change again.

Thanks for answering my questions and keeping up the work.

--
Hartmut Pilch http://a2e.de/phm
PILCH Hartmut <phm@a2e.de> wrote:

> I should add that I'm working in the zh_CN.utf-8 locale, with almost all locale variables set to this, based on selection at the beginning of the installation process, done with YaST2 in this locale this week.
> I notice also that the display of characters in zh_CN.utf-8 is incomplete.
Where? In xterm? What font are you using?
> Some members of the corresponding charset (the unicode subset that corresponds to gb2312) are represented by boxes. I find it difficult to imagine that 10.3 could have been published for China with such flaws and in fact haven't found any Chinese reports about such flaws with Google so far. But there's also nothing unusual in my configuration AFAICS.

--
Mike FABIAN <mfabian@suse.de> http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
I ❤ Unicode
>> I should add that I'm working in the zh_CN.utf-8 locale, with almost all locale variables set to this, based on selection at the beginning of the installation process, done with YaST2 in this locale this week.
>> I notice also that the display of characters in zh_CN.utf-8 is incomplete.
> Where? In xterm?
Yes. In mlterm everything works fine.
> What font are you using?

I have installed most of the SuSE 10.3 font packages as they came. I guess the fonts listed in /usr/share/X11/app-defaults/XTerm are being used.

How can I check which font XTerm is using for a given character?

--
Hartmut Pilch http://a2e.de/phm
PILCH Hartmut <phm@a2e.de> wrote:

>>> I should add that I'm working in the zh_CN.utf-8 locale, with almost all locale variables set to this, based on selection at the beginning of the installation process, done with YaST2 in this locale this week.
>>> I notice also that the display of characters in zh_CN.utf-8 is incomplete.
>> Where? In xterm?
> Yes.
> In mlterm everything works fine.
>> What font are you using?
> I have installed most of the SuSE 10.3 font packages as they came. I guess the fonts listed in
> /usr/share/X11/app-defaults/XTerm
> are being used.
Yes.
> How can I check which font XTerm is using for a given character?
xterm never uses more than two fonts at once, one for single-width and one for double-width characters (this is the same with mlterm, by the way). In /usr/share/X11/app-defaults/XTerm you see:

    *fontMenu*fontdefault*Label: Default
    *VT100.font: -misc-fixed-medium-r-semicondensed-*-13-120-75-75-c-60-iso10646-1
    *VT100.wideFont: -misc-fixed-medium-r-normal-*-13-120-75-75-c-120-iso10646-1

I.e. by default these two fonts are used: the one in the VT100.font resource for the single-width characters, and the one in the VT100.wideFont resource for the double-width characters.

The -misc-fixed-medium-r-normal-*-13-120-75-75-c-120-iso10646-1 font used for the double-width characters has enough characters for Japanese but unfortunately not for Chinese.

Because of this, I tried to change the default to a different font years ago, but so many people insisted on keeping exactly this font as the default that I could not do it.

I would have liked to change the other 6 fonts which can be selected with Control+RightMouse to fonts which cover Unicode well and have matching double-width fonts which cover Japanese and Chinese. But this was not possible either, because again some people strongly insisted on keeping exactly the same fonts as always.

As the people who insist on keeping the time-honoured default fonts for xterm usually care only about single-width fonts, it would be possible to make CJK work well by default if xterm did not require the double-width font to be *exactly* twice as wide and *exactly* as high as the single-width font, but could adapt to small differences by padding a few pixels. mlterm, urxvt, gnome-terminal, konsole, ... can do this. See

    http://bugzilla.novell.com/show_bug.cgi?id=49305

Then one could keep the well-known default fonts for single width and configure double-width fonts which are close to the optimal size and have good coverage.

But as the single-width fonts and the double-width fonts currently need to fit *exactly* in xterm, one needs to configure exactly matching pairs of fonts, and there are not many of these. With the additional requirement that the single-width fonts must stay the same as always, there is not much that can be done except adding Chinese characters to double-width fonts which match the "classic" xterm single-width fonts exactly.

The only small concession in the xterm default setup I could achieve was to replace the "unreadable" font, which, as the name says, was just some tiny dots too small to read, with fonts with good coverage. This is the font pair now called "Unicode Best" in the Control+RightMouse menu:

    *fontMenu*font1*Label: Unicode Best
    *VT100.font1: -misc-fixed-medium-r-normal-*-18-120-100-100-c-90-iso10646-1
    *VT100.wideFont1: -misc-fixed-medium-r-normal-*-18-120-100-100-c-180-iso10646-1

(But there were even users who objected to this change!)

I could, however, change the file /usr/share/X11/app-defaults/UXTerm to use fonts with good Unicode coverage without any complaints. Therefore, you get default fonts with much better Unicode coverage when you start xterm like this:

    xterm -class UXTerm

(or use the script /usr/bin/uxterm, which uses the "-class UXTerm" option).
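The exact-match requirement described above can be checked roughly from the XLFD names alone. The Python sketch below compares the pixel-size field and the average-width field (which XLFD expresses in tenths of a pixel) of a single-width and a double-width font. The helper names are made up for illustration, and equal pixel size is used only as a stand-in for "exactly as high", since the real cell height comes from the font's metrics, not from its name.

```python
from typing import NamedTuple


class FontMetrics(NamedTuple):
    pixel_size: int
    average_width: int  # XLFD AVERAGE_WIDTH field, in tenths of a pixel


def parse_xlfd(name: str) -> FontMetrics:
    """Extract pixel size and average width from an XLFD name.

    Only these two fields must be numeric; wildcards elsewhere are fine.
    """
    fields = name.split("-")
    # An XLFD name starts with "-" and has 14 fields after it.
    if len(fields) != 15 or fields[0] != "":
        raise ValueError(f"not a 14-field XLFD name: {name!r}")
    return FontMetrics(pixel_size=int(fields[7]), average_width=int(fields[12]))


def is_exact_pair(narrow: str, wide: str) -> bool:
    """True if wide has the same pixel size and exactly twice the width of narrow."""
    n, w = parse_xlfd(narrow), parse_xlfd(wide)
    return w.pixel_size == n.pixel_size and w.average_width == 2 * n.average_width


# The two font pairs quoted from app-defaults/XTerm above.
DEFAULT = ("-misc-fixed-medium-r-semicondensed-*-13-120-75-75-c-60-iso10646-1",
           "-misc-fixed-medium-r-normal-*-13-120-75-75-c-120-iso10646-1")
UNICODE_BEST = ("-misc-fixed-medium-r-normal-*-18-120-100-100-c-90-iso10646-1",
                "-misc-fixed-medium-r-normal-*-18-120-100-100-c-180-iso10646-1")

print(is_exact_pair(*DEFAULT))       # True
print(is_exact_pair(*UNICODE_BEST))  # True
# Mixing the 13-pixel narrow font with the 18-pixel wide font fails the check.
print(is_exact_pair(DEFAULT[0], UNICODE_BEST[1]))  # False
```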
>>> Some members of the corresponding charset (the unicode subset that corresponds to gb2312) are represented by boxes. I find it difficult to imagine that 10.3 could have been published for China with such flaws and in fact haven't found any Chinese reports about such flaws with Google so far.

I think the reason why nobody complains is that most people use the "default" terminals of the Gnome or KDE desktop environments, i.e. gnome-terminal or konsole. Both gnome-terminal and konsole handle CJK reasonably well. Only old UNIX hackers use xterm, and these usually know how to set up their fonts if they don’t suit their purpose.

Another very good terminal, which is unfortunately not so well known, is rxvt-unicode aka urxvt. Unlike xterm and mlterm, urxvt can use not only two fonts at the same time but any number you like. For example:

    urxvt -fn "xft:DejaVu Sans Mono:pixelsize=16,xft:IPAGothic,xft:FZSongTi,xft:Khmer OS System,xft:Code2000"

This command line uses the first font on the list, i.e. “DejaVu Sans Mono”, for all glyphs it supports, then falls back to the next one, “IPAGothic”. If “IPAGothic” still lacks a glyph, “FZSongTi” is used, and so on.

If you are more interested in Chinese than Japanese, move your favorite Chinese fonts before any Japanese font in the urxvt font list.

For further details of the urxvt font setup, please see the man page (“man urxvt”).

--
Mike FABIAN <mfabian@suse.de> http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
I ❤ Unicode
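The first-match fallback described for the urxvt font list can be sketched as a toy model in Python. This is not urxvt's actual implementation; the font names and coverage ranges below are hypothetical placeholders chosen only to show why the order of the list matters when two fonts both cover the same ideographs.

```python
def pick_font(char, fonts):
    """Return the name of the first font, in list order, that covers char."""
    code = ord(char)
    for name, ranges in fonts:
        if any(lo <= code <= hi for lo, hi in ranges):
            return name
    return None  # no font on the list has the glyph


# Hypothetical fonts with made-up coverage, given as code point ranges.
LATIN = ("LatinMono", [(0x0020, 0x024F)])
JAPANESE = ("JapaneseFont", [(0x3040, 0x30FF), (0x4E00, 0x9FFF)])
CHINESE = ("ChineseFont", [(0x4E00, 0x9FFF)])

japanese_first = [LATIN, JAPANESE, CHINESE]
chinese_first = [LATIN, CHINESE, JAPANESE]

print(pick_font("a", japanese_first))   # LatinMono
print(pick_font("か", japanese_first))  # JapaneseFont
print(pick_font("睡", japanese_first))  # JapaneseFont
print(pick_font("睡", chinese_first))   # ChineseFont
```

In this model, moving the Chinese entry ahead of the Japanese one changes which font supplies the shared ideographs, while kana still fall through to the Japanese font.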
PILCH Hartmut <phm@a2e.de> wrote:

> After (finally) installing opensuse 10.3 on one machine, I found my scim input system impaired.
> I can still input normally into firefox

The GTK scim module is usually used in that case.

> but not into xterm (or xterm-based mutt) nor into emacs.

These applications use scim via XIM.

> Whenever I try to input something there, I get a string of plain 7bit characters, including control characters, e.g. "C$" for "ä", "C6" for "ö", "C<" for "ü", "g" + bell sound for "睡眠", "*c" + line deletion for "かな".
> I still haven't researched much into this, will continue to try to narrow the bug down.
> Has something like this been reported before?

I think I saw that once a few months ago, but was unable to reproduce it after restarting the X session.

--
Mike FABIAN <mfabian@suse.de> http://www.suse.de/~mfabian
Lack of sleep is the enemy of good work.
I ❤ Unicode
participants (2)
- Mike FABIAN
- PILCH Hartmut