Hi,

I am trying to use UTF-8 as the locale environment under SuSE 8.0. There are some problem areas for which I have not found a solution yet, so I need assistance.

This is what I have under $HOME/.profile:

    export LANG="turkish"
    export LC_CTYPE="tr_TR.utf8"
    export LC_NUMERIC="tr_TR.utf8"
    export LC_TIME="POSIX"
    export LC_COLLATE="POSIX"
    export LC_MONETARY="tr_TR.utf8"
    export LC_MESSAGES="en_US"
    export LC_PAPER="tr_TR.utf8"
    export LC_NAME="tr_TR.utf8"
    export LC_ADDRESS="tr_TR.utf8"
    export LC_TELEPHONE="tr_TR.utf8"
    export LC_MEASUREMENT="tr_TR.utf8"
    export LC_IDENTIFICATION="tr_TR.utf8"

Under KDE -> Control Center -> Personalization -> Country and language I have the country set to Turkey and the language set to English. I did not install the kde-i18n related packages, as I only use the system in English; however, I do create documents in Turkish and I type using the Turkish keyboard.

1) When I type "locale charmap" I get UTF-8, so my understanding is that I am using UTF-8.

2) I use the "utf8xterm" shell script that comes with shtools-2002.01.30-39 and

   a) I can type all the Turkish characters in the xterm environment created by this script.

   b) I can use Vim and type the Turkish characters ş (scedilla) ğ (gbreve) ı (idotless) İ (Iabovedot), hence I can use Mutt and type and read these characters.

   c) The same is true if I just open a konsole under KDE.

   d) When I use "xev", pressing the keys scedilla, gbreve, idotless, Iabovedot produces nothing. So just to test I made a symbolic link under /usr/lib/locale for tr_TR.UTF-8 which points to tr_TR.utf8. When I export tr_TR.UTF-8 for LC_CTYPE and use "xev", this time I get the desired output. However, this causes Netscape 4.79 to behave strangely and I cannot use Netscape.

   e) If LC_CTYPE is tr_TR.utf8, then under Emacs I cannot type scedilla, idotless, gbreve, Iabovedot. I have the mule-ucs package installed but still no luck. I tried setting the Emacs language environment to Turkish, but no.

There must be something I need to tweak. However, I just use Emacs for DocBook editing and that's all, so I do not know the internals.

So in short, to rephrase: what do I have to do to get a proper UTF-8 environment which will enable me to type and read in Turkish but give me all the menus and system messages in English?

Thanks

--
Togan Muftuoglu
Unofficial SuSE FAQ Maintainer
http://dinamizm.ath.cx
Togan Muftuoglu <toganm@dinamizm.com> writes:
This is what I have under $HOME/.profile
export LANG="turkish"
This is just an alias to tr_TR.ISO-8859-9 (see /usr/share/locale/locale.alias). Better use tr_TR.UTF-8 here as well, just for consistency.
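To see what an alias such as "turkish" resolves to, the alias table can be searched from a shell. The following is only a sketch: `lookup_locale_alias` is a made-up helper name, and it assumes the file uses the usual layout of one "alias locale-name" pair per line with `#` starting a comment. The demonstration runs against a small sample file so that it does not depend on what is installed locally.

```shell
# lookup_locale_alias ALIAS [FILE]
# Hypothetical helper (not part of glibc): print the locale name that
# ALIAS maps to in a file laid out like locale.alias.
lookup_locale_alias() {
    alias_name=$1
    alias_file=${2:-/usr/share/locale/locale.alias}
    awk -v want="$alias_name" '
        $1 ~ /^#/  { next }              # skip comment lines
        $1 == want { print $2; exit }    # first match wins
    ' "$alias_file"
}

# Demonstration with a sample table (assumed contents, for illustration).
sample=$(mktemp)
printf '%s\n' '# sample alias table' 'turkish tr_TR.ISO-8859-9' > "$sample"
lookup_locale_alias turkish "$sample"    # prints tr_TR.ISO-8859-9
rm -f "$sample"
```

Called without a second argument, the helper falls back to the path mentioned above.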
export LC_CTYPE="tr_TR.utf8"
tr_TR.UTF-8 is the correct spelling. glibc 'normalizes' the encoding part of locale names, i.e. glibc doesn't care about upper or lower case, hyphens or underscores. But X11 does, so for X11 you should use the correct spelling tr_TR.UTF-8:

    mfabian@gregory:/tmp$ LC_ALL=tr_TR.utf8 ~mfabian/bin/XSupportsLocale
    False.
    mfabian@gregory:/tmp$ LC_ALL=tr_TR.UTF-8 ~mfabian/bin/XSupportsLocale
    True.
    mfabian@gregory:/tmp$

(~mfabian/bin/XSupportsLocale is just a tiny test program which does nothing but report the return value of XSupportsLocale().)

Many X11 programs will print a warning or error message if you use utf8 instead of UTF-8, for example:

    mfabian@gregory:/tmp$ LC_ALL=tr_TR.utf8 gedit
    Gdk-WARNING **: locale not supported by Xlib, locale set to C

If you don't see such a warning, it doesn't necessarily mean that there is no problem, so better always use UTF-8, not utf8.
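If existing startup files contain the lower-case spelling, a small shell function can rewrite it before the value is exported. This is a minimal sketch under stated assumptions: `to_x11_spelling` is an invented name, and it only handles names of the form `language_TERRITORY.codeset` without an `@modifier` suffix; anything else is passed through unchanged.

```shell
# to_x11_spelling LOCALE
# Hypothetical helper: print LOCALE with the codeset spelled "UTF-8"
# when it was written as utf8, UTF8, utf-8 and so on.
to_x11_spelling() {
    case $1 in
        *.[Uu][Tt][Ff]8|*.[Uu][Tt][Ff]-8)
            # ${1%.*} strips the shortest trailing ".codeset" part
            printf '%s.UTF-8\n' "${1%.*}" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

to_x11_spelling tr_TR.utf8     # prints tr_TR.UTF-8
to_x11_spelling tr_TR.UTF-8    # prints tr_TR.UTF-8
to_x11_spelling en_US          # prints en_US (no codeset, unchanged)
```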
export LC_NUMERIC="tr_TR.utf8"
export LC_TIME="POSIX"
export LC_COLLATE="POSIX"
export LC_MONETARY="tr_TR.utf8"
export LC_MESSAGES="en_US"
export LC_PAPER="tr_TR.utf8"
export LC_NAME="tr_TR.utf8"
export LC_ADDRESS="tr_TR.utf8"
export LC_TELEPHONE="tr_TR.utf8"
export LC_MEASUREMENT="tr_TR.utf8"
export LC_IDENTIFICATION="tr_TR.utf8"
Apart from the spelling issue with utf8 -> UTF-8, there is nothing wrong with that, although it is usually not necessary to set all of these variables. You could achieve the same effect by setting only:

    export LANG=tr_TR.UTF-8
    export LC_TIME=POSIX
    export LC_COLLATE=POSIX
    export LC_MESSAGES=en_US

Variables which you don't set will inherit their value from LANG. After exporting only these four, the 'locale' command will output:

    mfabian@gregory:~$ locale
    LANG=tr_TR.UTF-8
    LC_CTYPE="tr_TR.UTF-8"
    LC_NUMERIC="tr_TR.UTF-8"
    LC_TIME=POSIX
    LC_COLLATE=POSIX
    LC_MONETARY="tr_TR.UTF-8"
    LC_MESSAGES=en_US
    LC_PAPER="tr_TR.UTF-8"
    LC_NAME="tr_TR.UTF-8"
    LC_ADDRESS="tr_TR.UTF-8"
    LC_TELEPHONE="tr_TR.UTF-8"
    LC_MEASUREMENT="tr_TR.UTF-8"
    LC_IDENTIFICATION="tr_TR.UTF-8"
    LC_ALL=
    mfabian@gregory:~$

Variables which inherited their value from LANG have their values enclosed in "" in the output of 'locale'; variables which have been set individually don't have these "".
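The inheritance rule can be written down as a few lines of shell. This is a sketch of the lookup order only (LC_ALL first, then the individual LC_* variable, then LANG, then the POSIX default), not of how glibc implements it; `effective_category` is an invented name.

```shell
# effective_category NAME
# Hypothetical helper: print the value that category NAME (for example
# LC_CTYPE) ends up with, given the current shell variables.
effective_category() {
    category=$1
    # eval gives an indirect variable lookup in plain POSIX sh;
    # only pass trusted LC_* names as the argument.
    eval "category_value=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"            # LC_ALL overrides everything
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"    # set individually
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"              # inherited from LANG
    else
        printf 'POSIX\n'                   # nothing set at all
    fi
}

# The four-variable setup from above, tried out in a subshell so the
# calling environment is left alone.
(
    unset LC_ALL LC_CTYPE
    LANG=tr_TR.UTF-8 LC_TIME=POSIX LC_COLLATE=POSIX LC_MESSAGES=en_US
    effective_category LC_CTYPE       # prints tr_TR.UTF-8 (from LANG)
    effective_category LC_MESSAGES    # prints en_US (set individually)
)
```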
c) same is true if I just open a konsole with KDE d) When I use "xev" pressing the keys scedila gbreve idotless Iabovedot produces empty.
Yes, this is because of the utf8 -> UTF-8 spelling problem.
So just to test I made a symbolic link under /usr/lib/locale for tr_TR.UTF-8 which points to tr_TR.utf8.
You don't need to create that symlink. It doesn't hurt, of course, but it is useless. Remove it again and you will see that it still works if you specify tr_TR.UTF-8. /usr/lib/locale is only used by glibc, and glibc doesn't care about these spelling details. The locale data for X11 is in /usr/X11R6/lib/X11/locale/.
When I export tr_TR.UTF-8 for LC_CTYPE and use "xev" this time I get back the desired output. However this causes Netscape 4.79 to behave strange and I can not use Netscape.
Yes, Netscape 4.79 does not work right in UTF-8 locales. It only appears to work with tr_TR.utf8 because this locale is invalid for X11 and the 'C' locale is used instead:

    mfabian@gregory:~$ LC_ALL=tr_TR.utf8 netscape
    netscape: locale `tr_TR.utf8' not supported by Xlib; trying `C'.
    mfabian@gregory:~$

As Netscape 4.79 is old, unmaintained, binary-only software, it is unlikely that this will ever get fixed. Try more modern browsers like Mozilla or Konqueror instead or, if you still need to use Netscape 4.79 for some reason, start it by explicitly specifying a non-UTF-8 locale:

    LC_ALL=tr_TR netscape

Maybe make a script for that if you need it often.
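One possible shape for such a script is a generic wrapper that starts any command with LC_ALL overridden, so that only the child process sees the non-UTF-8 locale. This is a sketch; `run_in_locale` is an invented name, and the stand-in command below is used only because Netscape may not be installed.

```shell
# run_in_locale LOCALE COMMAND [ARGS...]
# Hypothetical wrapper: run COMMAND with LC_ALL set to LOCALE.
# env(1) is used so the override applies to the child process only.
run_in_locale() {
    wanted_locale=$1
    shift
    env LC_ALL="$wanted_locale" "$@"
}

# The call corresponding to the advice above would be:
#   run_in_locale tr_TR netscape
# Stand-in that just shows what the child process sees:
run_in_locale tr_TR env | grep '^LC_ALL='    # prints LC_ALL=tr_TR
```

Saved as a function in ~/.profile or as a tiny script, this keeps the rest of the session in tr_TR.UTF-8.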
e) if LC_CTYPE is tr_TR.utf8 then under emacs I can not type the scedilla, idotless, gbreve, Iabovedot. I have the mule-ucs package installed but still no hope. I tried with setting emacs language environment to turkish but no.
This will setup defaults in Emacs for ISO-8859-9 encoding, but not for UTF-8.
There must be something I need to tweak. However I just use emacs for Docbook editing and thats all so I do not know the internals . So in short tp rephrase what do I have to do to have a proper UTF-8 environment which will enable me to type read in Turkish but provide me all the menus and system messages in English Thanks
In Emacs use

    M-x set-keyboard-coding-system RET utf-8 RET

then it works. Add

    (set-keyboard-coding-system 'utf-8)

to your ~/.emacs if you want to use that always. Or, to do that only when Emacs is started in a locale which uses the UTF-8 charmap, you may prefer:

    (when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
      (set-keyboard-coding-system 'utf-8))

If you start Emacs in a UTF-8 locale, you probably also want to read and write files in UTF-8 encoding by default, i.e. you may even want to use

    (when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
      (set-default-coding-systems 'utf-8))

'set-default-coding-systems' already includes 'set-keyboard-coding-system' but does a bit more. According to its doc-string:

    set-default-coding-systems is a compiled Lisp function in `international/mule-cmds'.
    (set-default-coding-systems CODING-SYSTEM)

    Set default value of various coding systems to CODING-SYSTEM.
    This sets the following coding systems:
      o coding system of a newly created buffer
      o default coding system for subprocess I/O
    This also sets the following values:
      o default value used as file-name-coding-system for converting file names.
      o default value for the command `set-terminal-coding-system' (not on MSDOS)
      o default value for the command `set-keyboard-coding-system'.

--
Mike Fabian <mfabian@suse.de> http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
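The Lisp snippets above decide by looking at the output of "locale charmap". The same test can be made in a shell startup file, for instance to choose which terminal or editor to launch. A minimal sketch, with `charmap_is_utf8` as an invented name; the optional argument exists only so the function can be tried with a fixed value.

```shell
# charmap_is_utf8 [CHARMAP]
# Hypothetical helper: succeed when the charmap name contains "UTF-8".
# Without an argument it asks the `locale charmap` command.
charmap_is_utf8() {
    if [ $# -gt 0 ]; then
        charmap=$1
    else
        # An absent or failing locale command counts as "not UTF-8".
        charmap=$(locale charmap 2>/dev/null) || charmap=''
    fi
    case $charmap in
        *UTF-8*) return 0 ;;
        *)       return 1 ;;
    esac
}

if charmap_is_utf8; then
    echo "current locale uses UTF-8"
else
    echo "current locale does not use UTF-8"
fi
```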
Hi,

Thanks a lot, I have set up $HOME/.profile as you suggested.

* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
export LC_TIME=POSIX export LC_COLLATE=POSIX export LC_MESSAGES=en_US
LC_ALL=tr_TR netscape
Ok understood
In Emacs use M-x set-keyboard-coding-system RET utf-8 RET then it works. Add
(set-keyboard-coding-system 'utf-8)
to your ~/.emacs if you want to use that always. Or maybe, to do that only when Emacs is started in a locale which uses UTF-8 charmap, you may prefer:
(when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (set-keyboard-coding-system 'utf-8))
(when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (set-default-coding-systems 'utf-8))
OK, this I added to my .gnu-emacs-custom, as I read (or this is my interpretation) that during the creation of the user, $HOME/.emacs and $HOME/.gnu-emacs are copied from the /etc/skel directory. When I start Emacs, yes, I can type, read and save in UTF-8; however, I get some strange output on some of the keys in capital form, whether I use Shift or Caps Lock.

<begin> # this part is written under Emacs

The problematic keys are as follows:

Udiaresis (capital letter): prints out this ? and says "Mark Set". I have the same result whether I use Caps Lock or Shift.

Scedilla (capital letter): prints out Š, same with Caps Lock and with Shift.

Ccedilla (capital letter): I receive the Invalid character message 05600,2944,0xb80

Odiaresis (capital letter): I receive the Invalid character message 0340000, 114688, 0x1c000

</end>

I tried replacing the SuSE-created .emacs and .gnu-emacs by renaming my .gnu-emacs-custom to .emacs, but the same problem still happens. However, when I type these same characters in an xterm environment created by "utf8xterm" under Vim (that is what I use as Mutt's editor), here are the results:

Ü Ş Ç Ö

I can type these characters, I can see them on my display, and there are no error messages visible to me. Is this a bug in Emacs, or are there other customizations I have to tweak?

Thanks

--
Togan Muftuoglu
Unofficial SuSE FAQ Maintainer
http://dinamizm.ath.cx
Togan Muftuoglu <toganm@dinamizm.com> writes:
In Emacs use M-x set-keyboard-coding-system RET utf-8 RET then it works. Add
(set-keyboard-coding-system 'utf-8)
to your ~/.emacs if you want to use that always. Or maybe, to do that only when Emacs is started in a locale which uses UTF-8 charmap, you may prefer:
(when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (set-keyboard-coding-system 'utf-8))
(when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (set-default-coding-systems 'utf-8))
Ok this I added to my .gnu-emacs-custom, as I read ( or my interpretation ) during the creation of the user $HOME/.emacs and $HOME/.gnu-emacs are copied from /etc/skel directory.
Yes, if you use the default Emacs startup files from /etc/skel, it is probably best to add your own setup to the end of ~/.gnu-emacs. The SuSE default ~/.emacs contains

    (setq custom-file "~/.gnu-emacs-custom")
    (load "~/.gnu-emacs-custom" t t)

therefore ~/.gnu-emacs-custom will be used when you change settings in Emacs with 'M-x customize' and choose 'Save for Future Sessions'. Adding your own extensions to ~/.gnu-emacs-custom works as well, but the idea is to use ~/.gnu-emacs for things you set up by manually adding Lisp code and to use ~/.gnu-emacs-custom for the settings saved automatically by 'M-x customize'.
When I start emacs yes I can type read save in utf-8 however I have some strange output on some of the keys in Capital form whether I use shift or CapsLock.
<begin> # this part is written under Emacs
The problematic keys are as follows:
Udiaresis (Capital letter): prints out this ? and says Mark Set I have the same result when I use caps lock or Shift
Scedilla (Capital letter): prints out Š same with caps lock and using shift
Ccedilla (capital letter): I receive the Invalid character message 05600,2944,0xb80
Odiaresis (capital letter): I receive Invalid character message 0340000, 114688, 0x1c000
</end>
Yes, I can reproduce this.
I tried with replacing SuSE created .emacs and .gnu-emacs with renaming my .gnu-emacs-custom to .emacs but still the same problem happens. However when I type these same characters in an xterm environment created by "utf8xterm" under Vim ( that is what I use for Mutt's editor) here are the results
Ü Ş Ç Ö , I can type these characters I can see them on my display and there are no error messages that are visible to me. Is this a bug with emacs or are there other customizations I have to tweak.
It appears to be a bug in Emacs. I see the same problem with GNU Emacs 21.1.1 (which is the version distributed with SuSE Linux 8.0), but it works correctly with GNU Emacs 21.2.50.1, which is a current CVS version of Emacs.

It also already works with the XEmacs version distributed with SuSE Linux 8.0, i.e. you can use XEmacs instead of Emacs as a workaround. In the case of XEmacs, you add your personal setup, like

    (when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
      (set-default-coding-systems 'utf-8))

to ~/.xemacs/init.el

I'll try to find out what change between GNU Emacs 21.1.1 and GNU Emacs 21.2.50.1 fixed this bug.

--
Mike Fabian <mfabian@suse.de> http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
therefore ~/.gnu-emacs-custom will be used when you change settings in Emacs with 'M-x customize' and choose 'Save for Future Sessions'. Adding your own extensions to ~/.gnu-emacs-custom works as well, but the idea is to use ~/.gnu-emacs for setting up stuff by manually adding lisp code and to use ~/.gnu-emacs-custom for the stuff saved automatically by 'M-x customize'.
Hmm nice to know thanks
Yes, I can reproduce this.
Good , I mean its not only my setup then :-)
It also works already with the XEmacs version distributed with SuSE Linux 8.0, i.e. you can use XEmacs instead of Emacs as a workaround.
Ok I will try Xemacs as I have to get my Docbook Editing system back to work :-)
In case of XEmacs, you add your personal setup like
~/.xemacs/init.el
Ok hopefully all my Docbook setup works as well
I'll try whether I can find what change between GNU Emacs 21.1.1 and GNU Emacs 21.2.50.1 fixed this bug.
Much appreciated thanks -- Togan Muftuoglu Unofficial SuSE FAQ Maintainer http://dinamizm.ath.cx
Togan Muftuoglu <toganm@dinamizm.com> writes:
* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
It also works already with the XEmacs version distributed with SuSE Linux 8.0, i.e. you can use XEmacs instead of Emacs as a workaround.
Ok I will try Xemacs as I have to get my Docbook Editing system back to work :-)
In case of XEmacs, you add your personal setup like
~/.xemacs/init.el
Ok hopefully all my Docbook setup works as well
[...]
I'll try whether I can find what change between GNU Emacs 21.1.1 and GNU Emacs 21.2.50.1 fixed this bug.
I couldn't find anything obvious. Just one more suggestion for a workaround if you want to continue using GNU Emacs and don't want to upgrade to a CVS version: you can use one of the input methods built into Emacs, for example:

    M-x set-input-method RET turkish-postfix

    M-x describe-input-method RET RET

explains how to use it:

    [...]
    You can input characters by the following key sequences:
    key char  [type a key sequence to insert the corresponding character]
    --- ----  --- ----  --- ----  --- ----  --- ----  --- ----  --- ----  --- ----
    A^  Â     G^  Ğ     O"  Ö     U"  Ü     a^  â     g^  ğ     s,  ş     u^  û
    C,  Ç     I.  İ     S,  Ş     U^  Û     c,  ç     o"  ö     u"  ü

    key character(s)  [type a key (sequence) and select one from the list]
    --- ------------
    i   ı i
    [...]

I always input German with a similar input method because I am not used to the German keyboard layout. I can input German more efficiently using a US keyboard layout and such an input method than using a German keyboard layout. But that is a matter of taste.

--
Mike Fabian <mfabian@suse.de> http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
Togan Muftuoglu <toganm@dinamizm.com> writes:
In case of XEmacs, you add your personal setup like
~/.xemacs/init.el
Ok hopefully all my Docbook setup works as well
[...]
I am completely lost with XEmacs :-( Looks like I have to RTFM
I'll try whether I can find what change between GNU Emacs 21.1.1 and GNU Emacs 21.2.50.1 fixed this bug.
I couldn't find anything obvious.
Just one more suggestion for a workaround if you want to continue GNU Emacs and don't want to upgrade to a CVS version:
So I should not expect a newer version of Emacs which fixes my problem from SUSE in the short term. With Emacs I would not take a risk of working with self compiled CVS version as this is my work machine.
You can use one of the input methods built-in into Emacs, for example:
M-x set-input-method RET turkish-postfix
M-x describe-input-method RET RET
I had figured this one out while fighting with it earlier.

With XEmacs, I have tried your tip

    (when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
      (set-default-coding-systems 'utf-8))

I can type the characters with no problem and I can even save in UTF-8. When I reopen the file I saved, the Turkish characters appear in raw format. How do I fix this? In Emacs this does not happen.

Thanks

--
Togan Muftuoglu
Unofficial SuSE FAQ Maintainer
http://dinamizm.ath.cx
Togan Muftuoglu <toganm@dinamizm.com> writes:
* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
Togan Muftuoglu <toganm@dinamizm.com> writes:
[...]
Just one more suggestion for a workaround if you want to continue GNU Emacs and don't want to upgrade to a CVS version:
So I should not expect a newer version of Emacs which fixes my problem from SUSE in the short term. With Emacs I would not take a risk of working with self compiled CVS version as this is my work machine.
[...]
With the Xemacs. I have tried your tip (when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (set-default-coding-systems 'utf-8))
I can type the with no problem I can even save it in utf-8. When I reopen the file I saved I have the Turkish characters in raw format how do I fix this ? In Emacs this does not happen.
You can replace the above Lisp code in your ~/.xemacs/init.el by

    (when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
      (setq buffer-file-coding-system-for-read 'utf-8)
      (set-default-coding-systems 'utf-8))

i.e. also set the variable 'buffer-file-coding-system-for-read' to 'utf-8'. The default for this variable is 'undecided':

    `buffer-file-coding-system-for-read' is a variable declared in Lisp.
      -- loaded from "/usr/share/xemacs/21.4.6/lisp/code-files.elc"

    Value: undecided

    Documentation:
    Coding system used when reading a file.
    This provides coarse-grained control; for finer-grained control, use
    `file-coding-system-alist'. From a Lisp program, if you wish to
    unilaterally specify the coding system used for one particular
    operation, you should bind the variable `coding-system-for-read'
    rather than setting this variable, which is intended to be used for
    global environment specification.

'undecided' means that XEmacs will try to autodetect the coding system. It looks like GNU Emacs is better than XEmacs at autodetecting UTF-8 encoded files.

You can also add a 'coding system cookie' to the first line of your file. It looks like this:

    -*- coding: utf-8 -*-

It can be anywhere on the first line of the file. This will override the value of buffer-file-coding-system-for-read.

You can also specify the encoding manually when reading a file:

    C-x RET c utf-8 RET C-x C-f filename RET

This works for both Emacs and XEmacs. For XEmacs you also have the shortcut:

    C-u C-x C-f filename RET utf-8 RET

--
Mike Fabian <mfabian@suse.de> http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
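When many files are involved, it can help to check from the shell which of them already carry the coding cookie on their first line. The following is a rough sketch: `has_utf8_cookie` is an invented name, and the pattern is an assumption about how the cookie is typically written (it also accepts other settings between the two `-*-` markers); it is not the parser the editors themselves use.

```shell
# has_utf8_cookie FILE
# Hypothetical helper: succeed when the first line of FILE contains
# something like "-*- coding: utf-8 -*-".
has_utf8_cookie() {
    # -e is needed because the pattern itself starts with a dash
    head -n 1 "$1" | grep -q -e '-\*-.*coding: *utf-8.*-\*-'
}

# Demonstration on a throwaway file.
demo=$(mktemp)
printf '%s\n' '# -*- coding: utf-8 -*-' 'some text' > "$demo"
if has_utf8_cookie "$demo"; then
    echo "cookie found"
fi
rm -f "$demo"
```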
* Mike Fabian; <mfabian@suse.de> on 14 May, 2002 wrote:
I can type the with no problem I can even save it in utf-8. When I reopen the file I saved I have the Turkish characters in raw format how do I fix this ? In Emacs this does not happen.
You can replace the above lisp code in your ~/.xemacs/init.el by
(when (string-match "UTF-8" (shell-command-to-string "locale charmap")) (setq buffer-file-coding-system-for-read 'utf-8) (set-default-coding-systems 'utf-8))
done
-*- coding: utf-8 -*-
done :-)

Thanks a lot, you have saved my night. I can now work, and read about XEmacs tomorrow at lunch time.

--
Togan Muftuoglu
Unofficial SuSE FAQ Maintainer
http://dinamizm.ath.cx
participants (2)
- Mike Fabian
- Togan Muftuoglu