[Bug 470921] New: man pages gets truncated if LANG is set to ko_KR.UTF-8
https://bugzilla.novell.com/show_bug.cgi?id=470921

Summary: man pages gets truncated if LANG is set to ko_KR.UTF-8
Classification: openSUSE
Product: openSUSE 11.1
Version: Final
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Documentation
AssignedTo: ke@novell.com
ReportedBy: teheo@novell.com
QAContact: ke@novell.com
CC: mfabian@novell.com
Found By: ---

English man pages get truncated if LANG is ko_KR.UTF-8.

$ LANG=ko_KR.UTF-8 man 1 man | wc
iconv: 24779 [garbled Korean error message]
    406    2872   24779
$ LANG=en_US.UTF-8 man 1 man | wc
    599    4032   35518

iconv seems to fail on certain character(s). This happens on a lot of man pages. Fabian, this seems to be related to the Korean man page fix, right?

Thanks.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

https://bugzilla.novell.com/show_bug.cgi?id=470921

Mike Fabian <mfabian@novell.com> changed:

What          |Removed          |Added
----------------------------------------------------------------------------
Status        |NEW              |ASSIGNED

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c1

Mike Fabian <mfabian@novell.com> changed:

What          |Removed          |Added
----------------------------------------------------------------------------
Status        |ASSIGNED         |NEEDINFO
Info Provider |                 |teheo@novell.com

--- Comment #1 from Mike Fabian <mfabian@novell.com> 2009-01-30 03:01:37 MST ---

Is the Korean man-page package "man-pages-ko" installed? I.e. is there a Korean man page of "man" available?

I have this on my machine

mfabian@magellan:~$ rpm -qf /usr/share/man/ko/man1/man.1.gz
man-pages-ko-20050219-81.2
mfabian@magellan:~$

and the command "LANG=ko_KR.UTF-8 man 1 man" displays a Korean man page, not an English one:

mfabian@magellan:~$ LANG=ko_KR.UTF-8 man 1 man |wc
    172     824    9830
mfabian@magellan:~$ LANG=en_US.UTF-8 man 1 man |wc
    599    4032   35518
mfabian@magellan:~$ LANG=ko_KR.UTF-8 LC_MESSAGES=en_US.UTF-8 man 1 man | wc
    599    4032   35518
mfabian@magellan:~$

None of these commands shows an error message. The last command pipes the English man-page to "wc".

User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c2

Tejun Heo <teheo@novell.com> changed:

What          |Removed          |Added
----------------------------------------------------------------------------
Status        |NEEDINFO         |ASSIGNED
Info Provider |teheo@novell.com |

--- Comment #2 from Tejun Heo <teheo@novell.com> 2009-01-30 03:10:10 MST ---

No, it's not installed. I'll try to reproduce the problem with Korean manpages installed.

Okay, hmm... weird.

$ LANG=ko_KR.UTF-8 man git-ls-files | wc
iconv: 1484 [garbled Korean error message]
     45     133    1484
$ LANG=ko_KR.UTF-8 LC_MESSAGES=en_US.UTF-8 man git-ls-files | wc
    192     803    7111
$ LANG=en_US.UTF-8 man git-ls-files | wc
    192     803    7111
$ LANG=ko_KR.UTF-8 LC_MESSAGES=en_US.UTF-8 man 1 man | wc
    599    4032   35518

git-ls-files doesn't have a Korean translation and, weirdly, setting LC_MESSAGES explicitly avoids the iconv failure.

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c3

--- Comment #3 from Mike Fabian <mfabian@novell.com> 2009-01-30 04:48:50 MST ---

OK, thank you, I'll check again ...

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c4

--- Comment #4 from Mike Fabian <mfabian@novell.com> 2009-01-30 04:50:29 MST ---

Yes, same here, with man-pages-ko removed I can reproduce the problem and explicitly setting LC_MESSAGES avoids it.

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c5

--- Comment #5 from Mike Fabian <mfabian@novell.com> 2009-01-30 05:02:40 MST ---

OK, the problem is here in /usr/bin/nroff:

    case "${LANGUAGE-${LC_ALL-${LC_MESSAGES-${LANG}}}}" in
        ja*)
        [...]
            T=-Tlatin1
            export LC_ALL=ko_KR.EUC-KR
            ICONV="iconv -f EUC-KR -t UTF-8"
            ;;

The iconv fails when the man-page isn't really Korean (fallback to English) and happens to contain latin1. Conversion from EUC-KR to UTF-8 fails if the input is really latin1.

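A minimal sketch of the two effects described in this comment, assuming a POSIX shell and an iconv that knows EUC-KR (glibc's does). The helper names nroff_locale and latin1_sample and the sample text are made up for illustration; only the parameter expansion is copied from the /usr/bin/nroff excerpt. It is meant to show that LANG=ko_KR.UTF-8 alone selects the Korean branch while an explicitly set LC_MESSAGES takes precedence, and that a Latin1 soft hyphen at the end of a line is not valid EUC-KR input.

```shell
# Hypothetical helper: evaluate the expression quoted from /usr/bin/nroff in a
# clean environment where only the VAR=value pairs given as arguments are set.
# The first variable that is set wins, in the order
# LANGUAGE, LC_ALL, LC_MESSAGES, LANG.
nroff_locale() {
    env -i "$@" /bin/sh -c 'printf "%s\n" "${LANGUAGE-${LC_ALL-${LC_MESSAGES-${LANG}}}}"'
}

nroff_locale LANG=ko_KR.UTF-8                          # LANG decides: Korean branch
nroff_locale LANG=ko_KR.UTF-8 LC_MESSAGES=en_US.UTF-8  # LC_MESSAGES decides

# Illustrative sample: a Latin1 soft hyphen (octal 255) at a line break.
# In EUC-KR, byte 0xAD must start a two-byte character, and a newline is not
# a valid second byte.
latin1_sample() { printf 'hy\255\nphenation\n'; }

if latin1_sample | iconv -f EUC-KR -t UTF-8 >/dev/null 2>&1; then
    euckr_status=ok
else
    euckr_status=failed
fi
echo "EUC-KR decode: $euckr_status"

# Decoding the same bytes as ISO-8859-1 has no invalid sequences to trip over.
latin1_out=$(latin1_sample | iconv -f ISO-8859-1 -t UTF-8)
```

The second nroff_locale call mirrors the LC_MESSAGES=en_US.UTF-8 runs from comment #2, which never reach the EUC-KR conversion.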
User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c6

--- Comment #6 from Mike Fabian <mfabian@novell.com> 2009-02-02 10:27:28 MST ---

Doesn't seem to be easy to fix.

For the man-page reported here, I can work around the problem by testing whether the input is ASCII only and then using

    ICONV="iconv -f ISO-8859-1 -t UTF-8"

instead of converting from EUC-KR.

The source of the English man-page of "man" is ASCII only; it contains:

    Description@Octal@latin1@ascii
    _
    continuation hyphen@255@\*[softhyphen]@-

i.e. the Latin1 character is created from the groff macro \*[softhyphen] because the -Tlatin1 device is used for Korean in /usr/bin/nroff.

But this doesn't work for man-pages like /usr/share/man/man7/iso_8859-1.7.gz which already contain Latin1 characters, because I cannot distinguish Latin1 from EUC-KR with a simple iconv test.

The only workaround I can think of at the moment is to use the above

    ICONV="iconv -f ISO-8859-1 -t UTF-8"

for the man-pages where the input is purely ASCII and

    ICONV="iconv -c -f EUC-KR -t UTF-8"

for the man-pages which are not. The "-c" will throw away all characters it cannot convert. For Korean man-pages this should not throw away anything at all, because the Korean man-pages (currently) most likely contain only characters which are either already EUC-KR or are convertible into EUC-KR if the source of the man-page is UTF-8 encoded.

With the "-c" option the /usr/share/man/man7/iso_8859-1.7.gz will not be truncated but all Latin1 characters will be omitted. This should be good enough.

I think better fixes are not possible until we can finally update to a more modern groff version.

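The selection logic described in this comment could look roughly like the following. This is only a sketch under assumptions and not the nroff.patch from comment #7, whose contents are not shown in this thread: choose_iconv and the sample files are hypothetical, and counting non-ASCII bytes with tr is just one possible way to do the "ASCII only" test.

```shell
# Hypothetical helper: print the iconv command to use for the formatted
# output of the man-page source file given as $1.
choose_iconv() {
    # Delete every 7-bit ASCII byte and count what is left.
    nonascii=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c)
    if [ "$nonascii" -eq 0 ]; then
        # Pure ASCII source: any 8-bit bytes in the output come from the
        # -Tlatin1 device (e.g. the soft hyphen), so decode as Latin1.
        echo "iconv -f ISO-8859-1 -t UTF-8"
    else
        # Could be EUC-KR or Latin1; -c drops bytes that do not convert
        # instead of aborting and truncating the page.
        echo "iconv -c -f EUC-KR -t UTF-8"
    fi
}

# Illustrative inputs, not real man pages.
tmpdir=$(mktemp -d)
printf 'plain ASCII page\n' > "$tmpdir/ascii.txt"
printf 'caf\351 in Latin1\n' > "$tmpdir/latin1.txt"

ascii_choice=$(choose_iconv "$tmpdir/ascii.txt")
other_choice=$(choose_iconv "$tmpdir/latin1.txt")
echo "ascii.txt:  $ascii_choice"
echo "latin1.txt: $other_choice"

# Run the chosen command on the Latin1 sample. The unconvertible byte is
# lost, but the text after it is still there. iconv may still report a
# non-zero exit status with -c, hence the "|| true".
survivor=$($other_choice < "$tmpdir/latin1.txt") || true
echo "$survivor"

rm -rf "$tmpdir"
```

How much text around an unconvertible byte gets dropped depends on the iconv implementation, which fits the "good enough" trade-off described above: nothing is truncated, but Latin1 characters are lost.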
User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c7

--- Comment #7 from Mike Fabian <mfabian@novell.com> 2009-02-02 11:18:58 MST ---

Created an attachment (id=269342)
 --> (https://bugzilla.novell.com/attachment.cgi?id=269342)
/usr/bin/nroff.patch

I "fixed" it with the attached patch.

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c8

Mike Fabian <mfabian@novell.com> changed:

What          |Removed          |Added
----------------------------------------------------------------------------
Status        |ASSIGNED         |RESOLVED
Resolution    |                 |FIXED

--- Comment #8 from Mike Fabian <mfabian@novell.com> 2009-02-02 11:20:50 MST ---

Fixed package submitted to Factory and the M17N project in the openSUSE build service. Closing as FIXED.

Tejun, can you please test the new package? Should be available here soon:

http://download.opensuse.org/repositories/M17N

User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c9

--- Comment #9 from Tejun Heo <teheo@novell.com> 2009-02-03 20:17:32 MST ---

Eh... Which package am I supposed to try? The man package in the 11.1 directory is older than the one I have installed?

Thanks.

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c10

--- Comment #10 from Mike Fabian <mfabian@novell.com> 2009-02-04 03:50:24 MST ---

Maybe the build service had not finished building them yet? They appear to be there now, the following packages are from 2009-02-02:

http://download.opensuse.org/repositories/M17N/openSUSE_11.1/i586/groff-1.18...
http://download.opensuse.org/repositories/M17N/openSUSE_11.1/x86_64/groff-11...
http://download.opensuse.org/repositories/M17N/openSUSE_11.1/src/groff-1.181...

and have the right changelog:

* Mon Feb 02 2009 mfabian@suse.de
- bnc#470921: add more workarounds for Korean to fix the truncation of some non-Korean man-pages in ko_KR.UTF-8 locale.

User teheo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c11

--- Comment #11 from Tejun Heo <teheo@novell.com> 2009-02-04 19:52:52 MST ---

Verified. Both man and git manpages look fine now.

Thanks.