[Bug 470921] man pages gets truncated if LANG is set to ko_KR.UTF-8

2 Feb 2009

      https://bugzilla.novell.com/show_bug.cgi?id=470921

User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c6

--- Comment #6 from Mike Fabian <mfabian@novell.com>  2009-02-02 10:27:28 MST ---
Doesn't seem to be easy to fix. For the man-page reported here,
I can work around the problem by testing whether the input is ASCII
only and, if so, using

    ICONV="iconv -f ISO-8859-1 -t UTF-8"

instead of converting from EUC-KR. The source of the English man-page
of "man" is ASCII only; it contains:

    Description@Octal@latin1@ascii
    _
    continuation hyphen@255@\*[softhyphen]@-

i.e. the Latin1 character is created from the groff macro
\*[softhyphen] because the -Tlatin1 device is used for Korean
in /usr/bin/nroff.

But this doesn't work for man-pages like
/usr/share/man/man7/iso_8859-1.7.gz which already contain Latin1
characters because I cannot distinguish Latin1 from EUC-KR
with a simple iconv test.

The only workaround I can think of at the moment is
to use the above ICONV="iconv -f ISO-8859-1 -t UTF-8" for
the man-pages where the input is purely ASCII
and ICONV="iconv -c -f EUC-KR -t UTF-8" for the man-pages
which are not.
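
A minimal sketch of that selection, assuming the man-page source has
already been uncompressed to a plain file. The function name
choose_iconv is hypothetical, and the ASCII check uses tr rather
than iconv; this is not the actual patch:

```shell
#!/bin/sh
# choose_iconv FILE
# Hypothetical helper: print the iconv command line to use for one
# uncompressed man-page source file.
choose_iconv() {
    # Delete every 7-bit byte (octal 000-177). If anything is left,
    # the file contains non-ASCII bytes.
    if [ -z "$(LC_ALL=C tr -d '\000-\177' < "$1")" ]; then
        # Pure ASCII input: any 8-bit output comes from the
        # -Tlatin1 device, so convert from Latin1.
        echo "iconv -f ISO-8859-1 -t UTF-8"
    else
        # Non-ASCII input: assume EUC-KR and drop whatever
        # cannot be converted.
        echo "iconv -c -f EUC-KR -t UTF-8"
    fi
}
```

It could then be used as ICONV=$(choose_iconv "$file"), where $file
is whatever uncompressed copy of the page the caller has at hand.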

The "-c" option makes iconv discard all characters it cannot convert.
For Korean man-pages this should not discard anything at
all because the Korean man-pages (currently) most likely contain
only characters which are either already EUC-KR
or are convertible into EUC-KR if the source of the man-page
is UTF-8 encoded.

With the "-c" option, /usr/share/man/man7/iso_8859-1.7.gz will
no longer be truncated, but all of its Latin1 characters will be omitted.

This should be good enough. I think better fixes are not
possible until we can finally update to a more modern groff version.
