[Bug 683857] New: man: new Unicode characters in use
https://bugzilla.novell.com/show_bug.cgi?id=683857
https://bugzilla.novell.com/show_bug.cgi?id=683857#c0

Summary: man: new Unicode characters in use
Classification: openSUSE
Product: openSUSE 11.4
Version: Final
Platform: All
OS/Version: Linux
Status: NEW
Severity: Minor
Priority: P5 - None
Component: Basesystem
AssignedTo: bnc-team-screening@forge.provo.novell.com
ReportedBy: jengelh@medozas.de
QAContact: qa@suse.de
Found By: Beta-Customer
Blocker: ---

Starting with openSUSE 11.4, /usr/bin/man outputs the character U+2010 when it breaks a word, where it previously used U+002D. Since many fonts do not have the U+2010 character (including terminus on xterm, and especially the text console), a replacement graphic such as a rectangle is displayed instead. The soft hyphen at U+00AD could be used instead, or man could switch back to plain ASCII hyphens.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
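For reference, the three code points named in the report can be compared with a short Python sketch. It relies only on the standard library; the helper name `describe` is made up for this illustration and is not part of man or groff.

```python
# Code points named in the report, with their Unicode names and UTF-8 bytes.
import unicodedata

# U+002D (ASCII hyphen-minus), U+00AD (soft hyphen), U+2010 (hyphen)
CANDIDATES = ["\u002d", "\u00ad", "\u2010"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME: hex bytes' for a single character."""
    code = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch, "<unnamed>")
    encoded = ch.encode("utf-8").hex(" ")  # bytes.hex(sep) needs Python 3.8+
    return f"{code} {name}: {encoded}"


if __name__ == "__main__":
    for ch in CANDIDATES:
        print(describe(ch))
```

Only U+002D is a single ASCII byte; the other two are multi-byte sequences that a font or console without the matching glyph cannot render.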
https://bugzilla.novell.com/show_bug.cgi?id=683857

zj jia <zjjia@novell.com> changed:

What        |Removed                                    |Added
----------------------------------------------------------------------------
CC          |                                           |zjjia@novell.com
AssignedTo  |bnc-team-screening@forge.provo.novell.com  |werner@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=683857#c1

Dr. Werner Fink <werner@novell.com> changed:

What        |Removed            |Added
----------------------------------------------------------------------------
CC          |                   |mvyskocil@novell.com, werner@novell.com
AssignedTo  |werner@novell.com  |mvyskocil@novell.com

--- Comment #1 from Dr. Werner Fink <werner@novell.com> 2011-03-31 08:37:59 UTC ---

man uses groff for character mapping and less for output on the terminal.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c2

Michal Vyskocil <mvyskocil@novell.com> changed:

What          |Removed  |Added
----------------------------------------------------------------------------
Status        |NEW      |NEEDINFO
InfoProvider  |         |werner@novell.com

--- Comment #2 from Michal Vyskocil <mvyskocil@novell.com> 2011-04-22 08:53:09 UTC ---

That seems to be a regression caused by the dropped bnc446710.patch - see bug 446710. However, it seems that fonts/devutf8/R is not the place for it anymore. With

    u2010 24 0 0x002D

in that file I get

    echo "\[u2010]" | nroff -mandoc -Tutf8 | head -n 1 | od -x
    0000000 80e2 0a90
    0000004

which is the hyphen in UTF-8. Only ascii seems to produce the proper replacement

    echo "\[u2010]" | nroff -mandoc -Tascii | head -n 1 | od -x
    0000000 0a2d
    0000004

although I was not able to find out which .tmac file contains this mapping. There is no big difference in the loaded tmac files between devascii and devutf8. Only in the latter case are unicode.tmac and latin.tmac called after tty.tmac.

The only solution I'm aware of is to reverse the logic of unicode.tmac. Instead of the current mapping of 0x2d to 0x2010 et al.

    .\" unicode.tmac
    .\"
    .char - \[hy]
    .char ` \[oq]
    .char ' \[cq]
    .\" EOF

use

    .\" unicode.tmac
    .\"
    .char \[hy] -
    .char \[oq] `
    .char \[cq] '
    .\" EOF

but that might cause unwanted side effects if someone else uses non-tty output. So maybe we can name it deunicode.tmac and call it in tty.tmac instead of the unicode one.

Werner: what do you think?
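The `od -x` dumps in comment #2 print 16-bit words in host byte order, so on a little-endian machine the bytes appear swapped within each word. The Python sketch below converts such words back into the byte stream; it assumes a little-endian host (such as x86) and an even number of input bytes, and the function name `od_words_to_bytes` is invented for this note.

```python
# Turn the 16-bit words printed by `od -x` back into the original bytes.
# Assumption: the dump was produced on a little-endian host, so each
# printed word has its two bytes swapped relative to the stream order.


def od_words_to_bytes(words: str, byteorder: str = "little") -> bytes:
    """Convert space-separated 4-digit hex words into a byte string."""
    return b"".join(int(w, 16).to_bytes(2, byteorder) for w in words.split())


if __name__ == "__main__":
    utf8_dump = od_words_to_bytes("80e2 0a90")   # from the -Tutf8 run
    ascii_dump = od_words_to_bytes("0a2d")       # from the -Tascii run
    print(utf8_dump.hex(" "), repr(utf8_dump.decode("utf-8")))
    print(ascii_dump.hex(" "), repr(ascii_dump.decode("ascii")))
```

Read this way, the `-Tutf8` run emitted e2 80 90 0a (U+2010 followed by a newline), while the `-Tascii` run emitted 2d 0a (a plain hyphen-minus followed by a newline).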
https://bugzilla.novell.com/show_bug.cgi?id=683857#c3

Michal Vyskocil <mvyskocil@novell.com> changed:

What          |Removed            |Added
----------------------------------------------------------------------------
Priority      |P5 - None          |P3 - Medium
Status        |NEEDINFO           |ASSIGNED
InfoProvider  |werner@novell.com  |

--- Comment #3 from Michal Vyskocil <mvyskocil@novell.com> 2011-04-28 12:21:30 UTC ---

Uh, forget that - I patched tty.tmac to not include unicode.tmac, which changes the 0x2d to 0x2010. I don't think we need to change it back. I'm going to send a fix to M17N soon.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c4

--- Comment #4 from Michal Vyskocil <mvyskocil@novell.com> 2011-04-28 14:51:09 UTC ---

The problem has been fixed in M17N [1] groff by commit 12 [2]. tty.tmac no longer includes unicode.tmac, so ASCII characters will not be replaced. Feel free to test it before I submit it to Factory from the M17N repository [1].

[1] http://download.opensuse.org/repositories/M17N/openSUSE_11.4/
[2] https://build.opensuse.org/package/rdiff?commit=12&linkrev=base&package=groff&project=M17N
https://bugzilla.novell.com/show_bug.cgi?id=683857#c5

--- Comment #5 from Jan Engelhardt <jengelh@medozas.de> 2011-04-28 15:29:54 UTC ---

I have updated to that package, but still see U+2010 used for word breaks.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c6

Michal Vyskocil <mvyskocil@novell.com> changed:

What          |Removed   |Added
----------------------------------------------------------------------------
Status        |ASSIGNED  |NEEDINFO
InfoProvider  |          |jengelh@medozas.de

--- Comment #6 from Michal Vyskocil <mvyskocil@novell.com> 2011-05-02 14:35:54 UTC ---

Can you give me an example? Which man page, and under which conditions? Thanks.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c7

Jan Engelhardt <jengelh@medozas.de> changed:

What          |Removed             |Added
----------------------------------------------------------------------------
Status        |NEEDINFO            |ASSIGNED
InfoProvider  |jengelh@medozas.de  |

--- Comment #7 from Jan Engelhardt <jengelh@medozas.de> 2011-05-02 15:02:02 UTC ---

Created an attachment (id=427556)
 --> (http://bugzilla.novell.com/attachment.cgi?id=427556)
Test manpage

groff-1.20.1-183.1.x86_64.rpm from M17N/openSUSE_11.4.

    $ locale
    LANG=en_US.UTF-8
    LC_CTYPE=de_DE.UTF-8
    LC_NUMERIC=POSIX
    LC_TIME=POSIX
    LC_COLLATE=POSIX
    LC_MONETARY=POSIX
    LC_MESSAGES=nb_NO.UTF-8
    LC_PAPER=de_DE.UTF-8
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=

Running inside xterm-268:

    $ man -l test.1 | pcregrep -o '[^\w]+' | sort -u
    .. ‐

When adding | hexdump -C, this will produce "e2 80 90", which is a sign of U+2010.
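A check similar in spirit to the pcregrep pipeline above can be sketched in Python. This is not the reporter's method, only an illustrative alternative: it lists every distinct non-ASCII character in a saved copy of the formatted page. The function names and the idea of saving the output to a file first (for example with `man -l test.1 > out.txt`) are assumptions made for this note.

```python
# List the distinct non-ASCII characters found in formatted man output.
# Hypothetical usage: man -l test.1 > out.txt && python3 scan.py out.txt
import sys
import unicodedata


def non_ascii_chars(text: str) -> list:
    """Return the sorted distinct characters above U+007F in text."""
    return sorted({ch for ch in text if ord(ch) > 0x7F})


def report(text: str) -> list:
    """Describe each non-ASCII character: code point, name, UTF-8 bytes."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}: "
        f"{ch.encode('utf-8').hex(' ')}"
        for ch in non_ascii_chars(text)
    ]


if __name__ == "__main__":
    # Each command-line argument is treated as a UTF-8 text file to scan.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for line in report(handle.read()):
                print(f"{path}: {line}")
```

A page that still hyphenates with U+2010 would show a line containing "U+2010 HYPHEN: e2 80 90", the same byte sequence that hexdump -C reveals.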
https://bugzilla.novell.com/show_bug.cgi?id=683857#c8

--- Comment #8 from Michal Vyskocil <mvyskocil@novell.com> 2011-06-06 11:00:40 UTC ---

The updated patch adds deunicode.tmac, which turns this unicodization off on tty. hexdump -C then returns

    00000000  2d 0a  |-.|
    00000002

Committed as revision 13 to M17N/groff.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c9

Michal Vyskocil <mvyskocil@novell.com> changed:

What        |Removed   |Added
----------------------------------------------------------------------------
Status      |ASSIGNED  |RESOLVED
Resolution  |          |FIXED

--- Comment #9 from Michal Vyskocil <mvyskocil@novell.com> 2011-06-06 11:10:17 UTC ---

Submitted to openSUSE:Factory by request 72760. I assume you can use the version from M17N, so no maintenance update is requested; closing.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c10

--- Comment #10 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-06-06 18:00:27 CEST ---

This is an autogenerated message for OBS integration:
This bug (683857) was mentioned in
https://build.opensuse.org/request/show/72760 Factory / groff
https://bugzilla.novell.com/show_bug.cgi?id=683857#c11

Dave Plater <davejplater@gmail.com> changed:

What  |Removed  |Added
----------------------------------------------------------------------------
CC    |         |davejplater@gmail.com

--- Comment #11 from Dave Plater <davejplater@gmail.com> 2011-06-07 20:32:24 UTC ---

(In reply to comment #2)
> That seems to be a regression caused by the dropped bnc446710.patch - see bug 446710. However, it seems that fonts/devutf8/R is not the place for it anymore. With
>
>     u2010 24 0 0x002D
>
> in that file I get
>
>     echo "\[u2010]" | nroff -mandoc -Tutf8 | head -n 1 | od -x
>     0000000 80e2 0a90
>     0000004
>
> which is the hyphen in UTF-8. Only ascii seems to produce the proper replacement
>
>     echo "\[u2010]" | nroff -mandoc -Tascii | head -n 1 | od -x
>     0000000 0a2d
>     0000004
>
> although I was not able to find out which .tmac file contains this mapping. There is no big difference in the loaded tmac files between devascii and devutf8. Only in the latter case are unicode.tmac and latin.tmac called after tty.tmac.
>
> The only solution I'm aware of is to reverse the logic of unicode.tmac. Instead of the current mapping of 0x2d to 0x2010 et al.
>
>     .\" unicode.tmac
>     .\"
>     .char - \[hy]
>     .char ` \[oq]
>     .char ' \[cq]
>     .\" EOF
>
> use
>
>     .\" unicode.tmac
>     .\"
>     .char \[hy] -
>     .char \[oq] `
>     .char \[cq] '
>     .\" EOF
>
> but that might cause unwanted side effects if someone else uses non-tty output. So maybe we can name it deunicode.tmac and call it in tty.tmac instead of the unicode one.
>
> Werner: what do you think?
I came upon this bug while googling deunicode.tmac because of a new rpmlint warning for a few packages' man pages. This is from lilv, a package I'm preparing for Factory:

    lilv.x86_64: W: manual-page-warning /usr/share/man/man1/lv2jack.1.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man1/serdi.1.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/lilv.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/SerdURI.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/SerdNode.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man1/sordi.1.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/serd.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/SerdChunk.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man3/sord.3.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man1/lv2ls.1.gz 69: can't find macro file `deunicode.tmac'
    lilv.x86_64: W: manual-page-warning /usr/share/man/man1/lv2info.1.gz 69: can't find macro file `deunicode.tmac'
    This man page may contain problems that can cause it not to be formatted as intended.

Is there a package that provides deunicode.tmac?
https://bugzilla.novell.com/show_bug.cgi?id=683857#c12

--- Comment #12 from Jan Engelhardt <jengelh@medozas.de> 2011-06-07 20:41:28 UTC ---

As of

    * Mon Jun 06 2011 mvyskocil@suse.cz
    - fix bnc#682913: device X100 is missing
      * create new groff-devx package containing all devX devices, as they need X for build
    - fix bnc#683857: Unicode characters in use
      * groff-1.20.1-deunicode.patch adds deunicode.tmac to tty.tmac
        removes all unecessary unicode characters in tty output

I still get 0x2010 as a dash separator.
https://bugzilla.novell.com/show_bug.cgi?id=683857#c13

Jan Engelhardt <jengelh@medozas.de> changed:

What        |Removed   |Added
----------------------------------------------------------------------------
Status      |RESOLVED  |REOPENED
Resolution  |FIXED     |

--- Comment #13 from Jan Engelhardt <jengelh@medozas.de> 2011-06-07 20:41:48 UTC ---

-
https://bugzilla.novell.com/show_bug.cgi?id=683857#c14

--- Comment #14 from Michal Vyskocil <mvyskocil@novell.com> 2011-06-08 09:29:20 UTC ---

Sorry, I accidentally tested the groff from 11.3. However, deunicode.tmac is not the proper solution. The working one is simple: change the soft-hyphenation char to -

That is what the new version does:

    # To be sure I'm testing the right version!
    $ rpm -q --changelog groff | head -n 4
    * Wed Jun 08 2011 mvyskocil@suse.cz
    - fix bnc#683857: Unicode characters in use properly
      * change the soft hyphenation char to - in tty.tmac

    $ man -l test.1 | pcregrep -o '[^\w]+' | sort -u | grep -- '-' | hexdump -C
    00000000  2d 0a  |-.|
    00000002

Committed as revision 17 to M17N/groff.
https://bugzilla.novell.com/show_bug.cgi?id=683857

Michal Vyskocil <mvyskocil@novell.com> changed:

What    |Removed  |Added
----------------------------------------------------------------------------
Blocks  |         |698290
https://bugzilla.novell.com/show_bug.cgi?id=683857#c15

Jan Engelhardt <jengelh@medozas.de> changed:

What        |Removed   |Added
----------------------------------------------------------------------------
Status      |REOPENED  |RESOLVED
Resolution  |          |FIXED

--- Comment #15 from Jan Engelhardt <jengelh@medozas.de> 2011-06-08 14:01:48 UTC ---

Now does what was wanted.
https://bugzilla.novell.com/show_bug.cgi?id=683857

Jan Engelhardt <jengelh@medozas.de> changed:

What    |Removed   |Added
----------------------------------------------------------------------------
Status  |RESOLVED  |VERIFIED
https://bugzilla.novell.com/show_bug.cgi?id=683857#c16

--- Comment #16 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-06-09 12:00:15 CEST ---

This is an autogenerated message for OBS integration:
This bug (683857) was mentioned in
https://build.opensuse.org/request/show/73067 11.4 / groff
https://build.opensuse.org/request/show/73070 Factory / groff
https://bugzilla.novell.com/show_bug.cgi?id=683857#c17

Swamp Workflow Management <swamp@suse.com> changed:

What               |Removed  |Added
----------------------------------------------------------------------------
Status Whiteboard  |         |maint:released:11.4:41461

--- Comment #17 from Swamp Workflow Management <swamp@suse.com> 2011-06-16 07:36:02 UTC ---

Update released for: groff, groff-debuginfo, groff-doc
Products:
openSUSE 11.4 (debug, i586, x86_64)