Mailinglist Archive: opensuse-m17n (69 mails)

Re: [m17n] UTF-8 and Turkish
  • From: Mike Fabian <mfabian@xxxxxxx>
  • Date: Tue, 14 May 2002 00:25:13 +0000 (UTC)
  • Message-id: <s3t8z6na848.fsf@xxxxxxxxxxxxxxx>
Togan Muftuoglu <toganm@xxxxxxxxxxxx> writes:

> This is what I have under $HOME/.profile
>
> export LANG="turkish"

This is just an alias to tr_TR.ISO-8859-9 (see
/usr/share/locale/locale.alias). Better use tr_TR.UTF-8 here as well,
just for consistency.

> export LC_CTYPE="tr_TR.utf8"

tr_TR.UTF-8 is the correct spelling. glibc 'normalizes' the encoding
part of locale names, i.e. glibc doesn't care about upper or lower case,
hyphens or underscores. But X11 does, so for X11 you should use
the correct spelling tr_TR.UTF-8:

mfabian@gregory:/tmp$ LC_ALL=tr_TR.utf8 ~mfabian/bin/XSupportsLocale
False.
mfabian@gregory:/tmp$ LC_ALL=tr_TR.UTF-8 ~mfabian/bin/XSupportsLocale
True.
mfabian@gregory:/tmp$

(~mfabian/bin/XSupportsLocale is just a tiny test program which does
nothing more than report the return value of XSupportsLocale().)
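
To illustrate what 'normalizes' means on the glibc side, here is a rough
shell sketch. It is only an approximation under my own assumptions: the
function name normalize_codeset is made up for this example, and it models
just the two points mentioned above (case and punctuation are ignored),
not everything glibc really does internally.

```shell
# Hypothetical helper: approximate how glibc compares codeset names.
# Assumption: drop everything that is not a letter or digit, then
# lowercase the rest. Other details of glibc's real rules are not modeled.
normalize_codeset() {
    printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:]' | LC_ALL=C tr '[:upper:]' '[:lower:]'
}

# Both spellings collapse to the same string, which is why glibc
# treats tr_TR.utf8 and tr_TR.UTF-8 alike.
normalize_codeset "UTF-8"; echo
normalize_codeset "utf8"; echo
```

X11 does no such folding, so only the exact spelling UTF-8 is safe there.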

Many X11 programs will output a warning or error message if you use
utf8 instead of UTF-8, for example:

mfabian@gregory:/tmp$ LC_ALL=tr_TR.utf8 gedit

Gdk-WARNING **: locale not supported by Xlib, locale set to C

If you don't see such a warning, it doesn't necessarily mean that
there will be no problem, so it is better to always use UTF-8, not utf8.
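
If you want to catch the utf8 spelling before it reaches X11, something
like the following could be put into a shell startup file. This is only a
sketch: fix_utf8_spelling is a name I made up, and it assumes the locale
name has no @modifier after the encoding part.

```shell
# Hypothetical helper: rewrite a locale name so that its encoding part
# is spelled exactly "UTF-8". Names without a UTF-8 encoding part are
# printed unchanged. Assumes there is no @modifier suffix.
fix_utf8_spelling() {
    case "$1" in
        *.[Uu][Tt][Ff]8|*.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]_8)
            # ${1%.*} strips the last ".something" from the name
            printf '%s.UTF-8\n' "${1%.*}" ;;
        *)
            printf '%s\n' "$1" ;;
    esac
}

# Example use with the setting quoted above:
LC_CTYPE=$(fix_utf8_spelling "tr_TR.utf8")
export LC_CTYPE
```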

> export LC_NUMERIC="tr_TR.utf8"
> export LC_TIME="POSIX"
> export LC_COLLATE="POSIX"
> export LC_MONETARY="tr_TR.utf8"
> export LC_MESSAGES="en_US"
> export LC_PAPER="tr_TR.utf8"
> export LC_NAME="tr_TR.utf8"
> export LC_ADDRESS="tr_TR.utf8"
> export LC_TELEPHONE="tr_TR.utf8"
> export LC_MEASUREMENT="tr_TR.utf8"
> export LC_IDENTIFICATION="tr_TR.utf8"

Apart from the spelling issue with utf8 -> UTF-8, there is nothing
wrong with that, although it is usually not necessary to set all of
these variables. You could achieve the same effect by
setting only:

export LANG=tr_TR.UTF-8
export LC_TIME=POSIX
export LC_COLLATE=POSIX
export LC_MESSAGES=en_US

Variables which you don't set will inherit their value from
LANG. After exporting only these four, the 'locale' command will output:

mfabian@gregory:~$ locale
LANG=tr_TR.UTF-8
LC_CTYPE="tr_TR.UTF-8"
LC_NUMERIC="tr_TR.UTF-8"
LC_TIME=POSIX
LC_COLLATE=POSIX
LC_MONETARY="tr_TR.UTF-8"
LC_MESSAGES=en_US
LC_PAPER="tr_TR.UTF-8"
LC_NAME="tr_TR.UTF-8"
LC_ADDRESS="tr_TR.UTF-8"
LC_TELEPHONE="tr_TR.UTF-8"
LC_MEASUREMENT="tr_TR.UTF-8"
LC_IDENTIFICATION="tr_TR.UTF-8"
LC_ALL=
mfabian@gregory:~$

Variables which inherited their value from LANG have their
values enclosed in "" in the output of 'locale'; variables
which have been set individually don't have these "".
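
The quoting convention can be used to check mechanically which variables
are really set. The following is a small sketch, not a polished tool; the
function name classify_locale_vars and the labels it prints are my own
invention. It reads 'locale' output on stdin and looks only at whether the
value is enclosed in "".

```shell
# Hypothetical helper: read NAME=VALUE lines as printed by 'locale'
# and report, per variable, whether the value is quoted (inherited),
# empty (unset), or unquoted (set individually).
classify_locale_vars() {
    while IFS='=' read -r name value; do
        case "$value" in
            \"*\") printf '%s inherited\n' "$name" ;;
            '')    printf '%s unset\n' "$name" ;;
            *)     printf '%s explicit\n' "$name" ;;
        esac
    done
}

# Typical use:
#   locale | classify_locale_vars
```

With the four exports from above, LANG, LC_TIME, LC_COLLATE and
LC_MESSAGES should be reported as explicit and LC_ALL as unset.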

> c) same is true if I just open a konsole with KDE d) When I use "xev"
> pressing the keys scedila gbreve idotless Iabovedot
> produces empty.

Yes, this is because of the utf8 -> UTF-8 spelling problem.

> So just to test I made a symbolink link under
> /usr/lib/locale for tr_TR.UTF-8 which is pointing to tr_TR.utf8.

You don't need to create that symlink. It doesn't hurt of course, but
it is useless. Remove it again and you will see that it still works if
you specify tr_TR.UTF-8. /usr/lib/locale is only used by glibc and
glibc doesn't care about these spelling details. The locale stuff for X11 is
in /usr/X11R6/lib/X11/locale/.

> When I export tr_TR.UTF-8 for LC_CTYPE and use "xev" this time I get
> back the desired output. However this causes Netscape 4.79 to behave
> strange and I can not use Netscape.

Yes, Netscape 4.79 does not work right in UTF-8 locales. That it
appears to work with tr_TR.utf8 is only because this locale is invalid
for X11 and the 'C' locale is used instead:

mfabian@gregory:~$ LC_ALL=tr_TR.utf8 netscape
netscape: locale `tr_TR.utf8' not supported by Xlib; trying `C'.
mfabian@gregory:~$

As Netscape 4.79 is old, unmaintained, binary-only software, it
is unlikely that this will ever get fixed. Try more modern
browsers like Mozilla or Konqueror instead or, if you still
need to use Netscape 4.79 for some reason, start it by explicitly
specifying a non-UTF-8 locale:

LC_ALL=tr_TR netscape

Maybe make a script for that if you need it often.
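
Such a script could look roughly like this. It is just a sketch: the
function name run_with_locale is made up, and it assumes 'netscape' is in
your PATH.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with LC_ALL forced to the given
# locale, without changing the locale of the calling shell.
run_with_locale() {
    locale_name=$1
    shift
    env LC_ALL="$locale_name" "$@"
}

# In a wrapper script for Netscape, the last line would be:
#   run_with_locale tr_TR netscape "$@"
```

Using 'env' keeps the override limited to the started program, so the
rest of the session stays in tr_TR.UTF-8.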

> e) if LC_CTYPE is tr_TR.utf8 then under emacs I can not type the
> scedilla, idotless, gbreve, Iabovedot. I have the mule-ucs package
> installed but still no hope. I tried with setting emacs language
> environment to turkish but no.

This will set up defaults in Emacs for the ISO-8859-9 encoding, but not for
UTF-8.

> There must be something I need to tweak.
> However I just use emacs for Docbook editing and thats all so I do not
> know the internals . So in short tp rephrase what do I have to do to
> have a proper UTF-8
> environment which will enable me to type read in Turkish but provide me
> all the menus and system messages in English Thanks

In Emacs use

M-x set-keyboard-coding-system RET utf-8 RET

then it works. Add

(set-keyboard-coding-system 'utf-8)

to your ~/.emacs if you want to use that always. Or, to do that
only when Emacs is started in a locale which uses the UTF-8 charmap, you
may prefer:

(when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
  (set-keyboard-coding-system 'utf-8))

If you start Emacs in a UTF-8 locale, you probably also want to
read and write files in UTF-8 encoding by default, i.e. you
may even want to use

(when (string-match "UTF-8" (shell-command-to-string "locale charmap"))
  (set-default-coding-systems 'utf-8))

'set-default-coding-systems' already includes
'set-keyboard-coding-system' but does a bit more. According to its
doc-string:

set-default-coding-systems is a compiled Lisp function in `international/mule-cmds'.
(set-default-coding-systems CODING-SYSTEM)

Set default value of various coding systems to CODING-SYSTEM.
This sets the following coding systems:
o coding system of a newly created buffer
o default coding system for subprocess I/O
This also sets the following values:
o default value used as file-name-coding-system for converting file names.
o default value for the command `set-terminal-coding-system' (not on MSDOS)
o default value for the command `set-keyboard-coding-system'.

--
Mike Fabian <mfabian@xxxxxxx> http://www.suse.de/~mfabian
睡眠不足はいい仕事の敵だ。
