https://bugzilla.novell.com/show_bug.cgi?id=470921
User mfabian@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=470921#c6
--- Comment #6 from Mike Fabian 2009-02-02 10:27:28 MST ---
This doesn't seem to be easy to fix. For the man-page reported here,
I can work around the problem by testing whether the input is ASCII
only and, if so, using
ICONV="iconv -f ISO-8859-1 -t UTF-8"
instead of converting from EUC-KR. The source of the English man-page
of "man" is ASCII only; it contains:
Description@Octal@latin1@ascii
_
continuation hyphen@255@\*[softhyphen]@-
i.e. the Latin1 character is created from the groff macro
\*[softhyphen] because the -Tlatin1 device is used for Korean
in /usr/bin/nroff.
But this doesn't work for man-pages like
/usr/share/man/man7/iso_8859-1.7.gz which already contain Latin1
characters because I cannot distinguish Latin1 from EUC-KR
with a simple iconv test.
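
To illustrate why a simple iconv test cannot tell the two encodings apart,
here is a minimal sketch. The helper names decodes_as and sample are
hypothetical, and the byte pair 0xC0 0xC1 is just one example chosen for
illustration: it reads as two accented letters in Latin1, yet it is also a
well-formed two-byte sequence in EUC-KR, so both conversions succeed.

```shell
# Hypothetical helper: succeed if the bytes on stdin decode cleanly
# from the given charset.
decodes_as() {
    iconv -f "$1" -t UTF-8 >/dev/null 2>&1
}

# Example input (an assumption for illustration): bytes 0xC0 0xC1.
sample() {
    printf '\300\301\n'
}

# Both checks pass, so the exit status of iconv alone cannot decide
# which encoding the file really uses.
sample | decodes_as ISO-8859-1 && echo "valid as ISO-8859-1"
sample | decodes_as EUC-KR && echo "valid as EUC-KR"
```
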
The only workaround I can think of at the moment is
to use the above ICONV="iconv -f ISO-8859-1 -t UTF-8" for
the man-pages where the input is purely ASCII
and ICONV="iconv -c -f EUC-KR -t UTF-8" for the man-pages
which are not.
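
The selection described above could be sketched roughly like this.
choose_iconv is a hypothetical helper name, the sketch assumes the man-page
source has already been decompressed into a plain file, and the ASCII check
via "iconv -f ASCII" is only one possible way to do the test, not
necessarily the one used in the actual script.

```shell
# Hypothetical helper: print the iconv command to use for the
# (already decompressed) man-page source file given as $1.
choose_iconv() {
    # Converting from ASCII fails on any byte above 0x7F,
    # so success here means the input is purely ASCII.
    if iconv -f ASCII -t UTF-8 "$1" >/dev/null 2>&1; then
        printf '%s\n' 'iconv -f ISO-8859-1 -t UTF-8'
    else
        printf '%s\n' 'iconv -c -f EUC-KR -t UTF-8'
    fi
}
```

A caller could then set ICONV=$(choose_iconv "$file") before formatting the
page, where $file is again just a placeholder name.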
The "-c" will throw away all characters it cannot convert.
For Korean man-pages this should not throw away anything at
all because the Korean man-pages (currently) most likely contain
only characters which are either already EUC-KR
or are convertible into EUC-KR if the source of the man-page
is UTF-8 encoded.
With the "-c" option the /usr/share/man/man7/iso_8859-1.7.gz will
not be truncated but all Latin1 characters will be omitted.
This should be good enough. I think better fixes are not
possible until we can finally update to a more modern groff version.