[opensuse-support] "bad" locale -> bad/no box characters
These are from 42.3 installations. On both, /etc/sysconfig/language is identical, and 'DroidSansMono.ttf: "Droid Sans Mono" "Regular"' results from 'fc-match monospace'. These show good vs. bad box drawing characters in Konsole3:

Good: http://fm.no-ip.com/SS/Suse/lc423goodP5BSE.jpg
Bad: http://fm.no-ip.com/SS/Suse/lc423bad00srv.jpg

Locale on the good host, all POSIX except CTYPE and ALL, same in Konsole3 as in XTerm:

# locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

On the bad host, all en_US.UTF-8 except LANG, and XTerm is OK, but Konsole is not:

LANG=POSIX
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

The difference between good and bad has root in .bashrc:

export LC_ALL=en_US.UTF-8

The question is, why does Xterm get box characters right, but Konsole not. Bug?

--
Evolution as taught in public schools is religion, not science.
Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!
Felix Miata *** http://fm.no-ip.com/
--
To unsubscribe, e-mail: opensuse-support+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-support+owner@opensuse.org
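The two listings differ in the way the standard locale precedence predicts: a non-empty LC_ALL overrides every LC_* category, an individual LC_* variable overrides LANG, and LANG is the fallback. The following is only a minimal sketch of that lookup order; the function name effective_locale is made up for illustration and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# effective_locale CATEGORY
# Hypothetical helper: print the locale name that applies to one category,
# using the usual precedence LC_ALL > LC_<category> > LANG > POSIX.
effective_locale() {
    # Read the variable whose name is passed in $1 (e.g. LC_TIME).
    eval "category_value=\${$1:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'POSIX\n'
    fi
}

# Show the effective value of a few categories in the current shell.
for category in LC_CTYPE LC_TIME LC_MESSAGES; do
    printf '%s=%s\n' "$category" "$(effective_locale "$category")"
done
```

With 'export LC_ALL=en_US.UTF-8' in .bashrc, every category resolves to en_US.UTF-8 regardless of LANG, which matches the second listing.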
On 26/12/2018 01.07, Felix Miata wrote:
These are from 42.3 installations. On both, /etc/sysconfig/language is identical, and 'DroidSansMono.ttf: "Droid Sans Mono" "Regular"' results from 'fc-match monospace'. These show good vs. bad box drawing characters in Konsole3: Good: http://fm.no-ip.com/SS/Suse/lc423goodP5BSE.jpg Bad http://fm.no-ip.com/SS/Suse/lc423bad00srv.jpg
Locale on the good host, all POSIX except CTYPE and ALL, same in Konsole3 as in XTerm:

# locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
This is typical of running locale as root. You have to do it as the user that has the session. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
Carlos E. R. composed on 2018-12-26 03:54 (UTC+0100):
This is typical of running locale as root. You have to do it as the user that has the session.
I guess I didn't remember to write as much as I needed to. I know root and users are normally different, but not /why/ they are by default configured differently. Here, except for LC_ALL, all are [en_US|en_US.UTF-8] (by default) for ordinary users (which usually have 'export LC_TIME=en_DK' in .bashrc). (What, if anything as a practical matter, makes en_US and en_US.UTF-8 differ I don't know either, but I'd really rather not see displayed CJK and other alphabets' characters I don't read.) -- Evolution as taught in public schools is religion, not science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 26/12/2018 04.20, Felix Miata wrote:
Carlos E. R. composed on 2018-12-26 03:54 (UTC+0100):
This is typical of running locale as root. You have to do it as the user that has the session.
I guess I didn't remember to write as much as I needed to. I know root and users are normally different, but not /why/ they are by default configured differently. Here, except for LC_ALL, all are [en_US|en_US.UTF-8] (by default) for ordinary users (which usually have 'export LC_TIME=en_DK' in .bashrc).
I recognize I did not look at the photos, just at your pasted text. And on the second locale posting you omitted the command prompt, which confused me into thinking that it was done as user. Please, always post the prompt and the command; it gives necessary information.

Now that I look at the photos, the answer is slightly different. Machine 00srv does not use posix in the locale configuration.

/etc/sysconfig/language

## Type: string(ctype)
## Default: ctype
#
# This defines if the user "root" should use the locale settings
# which are defined here.
# Value "ctype" means that root uses just LC_CTYPE.
#
ROOT_USES_LANG="ctype"

I never change this.
(What, if anything as a practical matter, makes en_US and en_US.UTF-8 differ I don't know either, but I'd really rather not see displayed CJK and other alphabets' characters I don't read.)
Non UTF-8 charset, go back more than a decade. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
Carlos E. R. composed on 2018-12-26 08:57 (UTC+0100):
Felix Miata wrote:
Carlos E. R. composed on 2018-12-26 03:54 (UTC+0100):
Felix Miata wrote:
These are from 42.3 installations. On both, /etc/sysconfig/language is identical, ... Now that I look at the photos, the answer is slightly different. Machine 00srv does not use posix in the locale configuration.
/etc/sysconfig/language
## Type: string(ctype)
## Default: ctype
#
# This defines if the user "root" should use the locale settings
# which are defined here.
# Value "ctype" means that root uses just LC_CTYPE.
#
ROOT_USES_LANG="ctype"
I never change this.
I don't either, and since both PCs are using the same /etc/sysconfig/language file, with that same specification, what is this supposed to be telling me, and particularly, with regard to Konsole output?
(What, if anything as a practical matter, makes en_US and en_US.UTF-8 differ I don't know either, but I'd really rather not see displayed CJK and other alphabets' characters I don't read.)
Non UTF-8 charset, go back more than a decade.
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades. -- Evolution as taught in public schools is religion, not science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
On 26/12/2018 10.01, Felix Miata wrote:
Carlos E. R. composed on 2018-12-26 08:57 (UTC+0100):
Felix Miata wrote:
Carlos E. R. composed on 2018-12-26 03:54 (UTC+0100):
/etc/sysconfig/language
## Type: string(ctype)
## Default: ctype
#
# This defines if the user "root" should use the locale settings
# which are defined here.
# Value "ctype" means that root uses just LC_CTYPE.
#
ROOT_USES_LANG="ctype"
I never change this.
I don't either, and since both PCs are using the same /etc/sysconfig/language file, with that same specification, what is this supposed to be telling me, and particularly, with regard to Konsole output?
(What, if anything as a practical matter, makes en_US and en_US.UTF-8 differ I don't know either, but I'd really rather not see displayed CJK and other alphabets' characters I don't read.)
Non UTF-8 charset, go back more than a decade.
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades.
There is no /usr/share/X11/locale/en_US/ directory, so beware. en_US.UTF-8 might be wrong. There is "/usr/lib/locale/en_US.utf8", lowercase. en_US, en_US.iso885915, and en_US.utf8 all exist, and all three are different actual directories, not links:

en_US:
-rw-r--r-- 73 root root 237196 Dec 5 12:14 LC_CTYPE
en_US.utf8:
-rw-r--r-- 199 root root 278308 Dec 5 12:14 LC_CTYPE
en_US.iso885915:
-rw-r--r-- 26 root root 237448 Dec 5 12:14 LC_CTYPE
Box drawing characters have been available on PCs for more than three decades.
In which charset? 8-bit charsets have different interpretations of what goes above code 127. You cannot use that today, except that some people do. TomTom, for instance :-/ Yes, the IBM-PC had them, but this is Linux, inheriting from Unix, so no IBM-PC charset here. Then there are many variants: Spain has one, France has another... -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
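One way to see how en_US and en_US.UTF-8 differ in practice is to ask for the character map behind each locale name with 'locale charmap'. This is a small sketch, assuming a glibc system where these locales are installed; the helper name charmap_of is invented for the example.

```shell
#!/bin/sh
# charmap_of LOCALE_NAME
# Hypothetical helper: print the character map used when LC_ALL is set
# to the given locale name. Warnings about missing locales are discarded.
charmap_of() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

# Compare the names discussed in this thread. On glibc systems plain en_US
# typically maps to an ISO-8859 charset, not UTF-8; a locale that is not
# installed falls back to the C/POSIX charmap.
for name in POSIX en_US en_US.UTF-8; do
    printf '%-12s -> %s\n' "$name" "$(charmap_of "$name")"
done
```

Running 'locale -a' lists the locale names that are actually installed, which helps tell a real result from a fallback.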
26.12.2018 12:01, Felix Miata wrote:
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades.
lsblk does not use line drawing characters available on PC; it is using UTF-8 line drawing characters if locale is UTF-8 or plain ASCII equivalent if locale is not UTF-8. It does not use terminal database (be it terminfo or termcap) at all.
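The behaviour described above, choosing the tree symbols from the locale's character set alone, can be sketched in a few lines of shell. This is not lsblk's actual code, only an illustration of the same decision; tree_branch is a made-up name and "sda1" is just sample text.

```shell
#!/bin/sh
# tree_branch CHARMAP
# Hypothetical helper: print a tree branch marker suited to the charmap.
# UTF-8 locales get U+251C U+2500 (written here as raw UTF-8 bytes in octal);
# anything else gets a plain ASCII stand-in.
tree_branch() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) printf '\342\224\234\342\224\200' ;;
        *) printf '|-' ;;
    esac
}

# Decide from the current locale only; no terminfo or termcap lookup.
charmap=$(locale charmap 2>/dev/null)
printf '%ssda1\n' "$(tree_branch "$charmap")"
```

A program that decides this way emits UTF-8 bytes whenever its own locale says UTF-8, whether or not the terminal displaying them decodes UTF-8.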
On 26/12/2018 17.25, Andrei Borzenkov wrote:
26.12.2018 12:01, Felix Miata wrote:
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades.
lsblk does not use line drawing characters available on PC; it is using UTF-8 line drawing characters if locale is UTF-8 or plain ASCII equivalent if locale is not UTF-8. It does not use terminal database (be it terminfo or termcap) at all.
Look at this photo he posted:
Looks like konsole is printing thinking that it has UTF but has not. If it knew it would be using other chars, not garbage. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
26.12.2018 19:45, Carlos E. R. wrote:
On 26/12/2018 17.25, Andrei Borzenkov wrote:
26.12.2018 12:01, Felix Miata wrote:
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades.
lsblk does not use line drawing characters available on PC; it is using UTF-8 line drawing characters if locale is UTF-8 or plain ASCII equivalent if locale is not UTF-8. It does not use terminal database (be it terminfo or termcap) at all.
Look at this photo he posted:
What makes you believe I did not?
Looks like konsole is printing thinking that it has UTF but has not. If it knew it would be using other chars, not garbage.
On 26/12/2018 18.21, Andrei Borzenkov wrote:
26.12.2018 19:45, Carlos E. R. wrote:
On 26/12/2018 17.25, Andrei Borzenkov wrote:
26.12.2018 12:01, Felix Miata wrote:
Plain en_US is by definition non-UTF-8? I thought it was an alias to something else that may or may not be UTF-8 equivalent? Box drawing characters have been available on PCs for more than three decades.
lsblk does not use line drawing characters available on PC; it is using UTF-8 line drawing characters if locale is UTF-8 or plain ASCII equivalent if locale is not UTF-8. It does not use terminal database (be it terminfo or termcap) at all.
Look at this photo he posted:
What makes you believe I did not?
Sorry, I realized that on the next post, but then this mail account had a hiccup and I could not say so. I had not even read your other post. Anyway, in the konsole in the photo, you can see it prints garbage instead of some ASCII line emulation. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
26.12.2018 3:07, Felix Miata wrote:
These are from 42.3 installations. On both, /etc/sysconfig/language is identical, and 'DroidSansMono.ttf: "Droid Sans Mono" "Regular"' results from 'fc-match monospace'. These show good vs. bad box drawing characters in Konsole3: Good: http://fm.no-ip.com/SS/Suse/lc423goodP5BSE.jpg Bad http://fm.no-ip.com/SS/Suse/lc423bad00srv.jpg
Konsole treats program output as non-UTF-8
Locale on the good host, all POSIX except CTYPE and ALL, same in Konsole3 as in XTerm:

# locale
LANG=POSIX
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

On the bad host, all en_US.UTF-8 except LANG, and XTerm is OK, but Konsole is not:

LANG=POSIX
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
Locale in shell session is irrelevant. What is relevant is the locale in which konsole was started, which determines how input is interpreted. Unfortunately I do not know any good way to find it out. You may try something like

tr '\0' '\n' < /proc/${PID_OF_KONSOLE}/environ

but I won't be surprised if KDE takes this information from somewhere else. When I was using KDE I could not find any way to launch a specific KDE application in a specific locale without changing global user settings.
The difference between good and bad has root in .bashrc:
export LC_ALL=en_US.UTF-8
The question is, why does Xterm get box characters right, but Konsole not. Bug?
On 26/12/2018 17.22, Andrei Borzenkov wrote:
26.12.2018 3:07, Felix Miata wrote:
Locale in shell session is irrelevant. What is relevant is the locale in which konsole was started, which determines how input is interpreted. Unfortunately I do not know any good way to find it out. You may try something like
tr '\0' '\n' < /proc/${PID_OF_KONSOLE}/environ
but I won't be surprised if KDE takes this information from somewhere else. When I was using KDE I could not find any way to launch a specific KDE application in a specific locale without changing global user settings.
I think KDE ignores the locale; instead it creates one from the KDE own configuration. But that konsole is running "su" or "su -", so the locale can be altered, while the actual charset it uses to display is not. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
Andrei Borzenkov composed on 2018-12-26 19:22 (UTC+0300):
Locale in shell session is irrelevant. What is relevant is the locale in which konsole was started, which determines how input is interpreted. Unfortunately I do not know any good way to find it out. You may try something like
tr '\0' '\n' < /proc/${PID_OF_KONSOLE}/environ
# ps -A | grep -i sole
1737 ? 00:00:55 konsole
1740 ? 00:00:20 konsole
# tr '\0' '\n' < /proc/${PID_OF_KONSOLE}/environ
-bash: /proc//environ: No such file or directory
# tr '\0' '\n' < /proc/${1740}/environ
-bash: /proc//environ: No such file or directory
# tr '\0' '\n' < /proc/1740/environ | wc -l
82
# tr '\0' '\n' < /proc/1740/environ | egrep 'LC_|LOCALE|LANG|locale'
LANG=en_US
G_FILENAME_ENCODING=@locale,UTF-8,ISO-8859-15,CP1252
LC_TIME=en_DK
but I won't be surprised if KDE takes this information from somewhere else. When I was using KDE I could not find any way to launch a specific KDE application in a specific locale without changing global user settings.
The difference between good and bad has root in .bashrc:
export LC_ALL=en_US.UTF-8
The question is, why does Xterm get box characters right, but Konsole not. Bug?
-- Evolution as taught in public schools is religion, not science. Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
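The empty /proc//environ paths in the transcript come from ${PID_OF_KONSOLE} being a placeholder that was never set, and from ${1740} being read as a (nonexistent) positional parameter. A minimal sketch that avoids typing PIDs by hand follows; it assumes pgrep is available and that the process is named exactly "konsole", and the function names are made up for the example.

```shell
#!/bin/sh
# filter_locale_vars
# Hypothetical helper: read a NUL-separated environment block on stdin and
# print only the locale-related variables, one per line.
filter_locale_vars() {
    tr '\0' '\n' | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)='
}

# show_locale_env PID
# Print the locale variables a process was started with. Reading another
# user's /proc/PID/environ requires root.
show_locale_env() {
    filter_locale_vars < "/proc/$1/environ"
}

# Walk over every running konsole process, if any.
for pid in $(pgrep -x konsole 2>/dev/null); do
    echo "== konsole PID $pid =="
    show_locale_env "$pid"
done
```

The anchored pattern matches variable names only, so entries such as G_FILENAME_ENCODING, which matched the looser egrep above, are left out.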
On 26/12/2018 22.59, Felix Miata wrote:
Andrei Borzenkov composed on 2018-12-26 19:22 (UTC+0300):
tr '\0' '\n' < /proc/${PID_OF_KONSOLE}/environ
# ps -A | grep -i sole
1737 ? 00:00:55 konsole
1740 ? 00:00:20 konsole
# tr '\0' '\n' < /proc/1740/environ | wc -l
82
# tr '\0' '\n' < /proc/1740/environ | egrep 'LC_|LOCALE|LANG|locale'
LANG=en_US
There you have, no UTF. -- Cheers / Saludos, Carlos E. R. (from 42.3 x86_64 "Malachite" at Telcontar)
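If konsole was started with LANG=en_US and therefore decodes program output as a single-byte charset, each UTF-8 box drawing character arrives as three separate bytes and is shown as three unrelated characters. The sketch below reproduces that effect with iconv; it assumes ISO-8859-1 as the single-byte charset (the thread does not confirm which one konsole actually used), and misdecode is a made-up name.

```shell
#!/bin/sh
# misdecode
# Hypothetical helper: treat the bytes on stdin as ISO-8859-1 and re-encode
# the result as UTF-8, i.e. what a non-UTF-8 terminal would make of them.
misdecode() {
    iconv -f ISO-8859-1 -t UTF-8
}

# U+251C (box drawings light vertical and right) is the bytes E2 94 9C in UTF-8.
# Dump the mis-decoded result as hex so control characters stay visible.
printf '\342\224\234' | misdecode | od -An -tx1
```

The three input bytes come back as U+00E2 ("â") followed by two C1 control characters, which would look like garbage rather than either a box character or an ASCII substitute.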
participants (3)
- Andrei Borzenkov
- Carlos E. R.
- Felix Miata