[Bug 397244] New: utf-8 breaks sed
https://bugzilla.novell.com/show_bug.cgi?id=397244

           Summary: utf-8 breaks sed
           Product: openSUSE 11.0
           Version: RC 1
          Platform: i586
        OS/Version: Other
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: koenig@linux.de
         QAContact: qa@suse.de
          Found By: ---

The sed expression below should strip off everything up to and including the last space, but with a UTF-8 locale this doesn't work:

echo -e 'Intel\xae Core 2 Duo T7300' | LC_ALL=C sed 's/.* //'
T7300

echo -e 'Intel\xae Core 2 Duo T7300' | LC_ALL=de_DE sed 's/.* //'
T7300

BUT:

echo -e 'Intel\xae Core 2 Duo T7300' | LC_ALL=de_DE.utf-8 sed 's/.* //'
Intel�T7300

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

https://bugzilla.novell.com/show_bug.cgi?id=397244

User koenig@linux.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=397244#c1

--- Comment #1 from Harald Koenig <koenig@linux.de> 2008-06-04 13:47:52 MDT ---
Forgot the version information (I've updated to Factory, just in case...):

# rpm -q sed
sed-4.1.5-101

https://bugzilla.novell.com/show_bug.cgi?id=397244

Marcus Meissner <meissner@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|bnc-team-screening@forge.provo.novell.com |mkoenig@novell.com

https://bugzilla.novell.com/show_bug.cgi?id=397244

User coolo@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=397244#c2

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Critical                    |Normal

--- Comment #2 from Stephan Kulow <coolo@novell.com> 2008-06-04 23:36:55 MDT ---
looks pretty INVALID to me:

echo -e 'Intel\xae Core 2 Duo T7300' | iconv -flatin1 -tutf8 | LC_ALL=de_DE.utf-8 sed 's/.* //'
T7300

Sure, sed's behaviour might be undefined with non-utf8 input in utf-8 locales, but that's pretty much expected to me.

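A minimal sketch of the point made in comment #2: check whether the input is valid UTF-8 before handing it to sed in a UTF-8 locale. The helper name `is_valid_utf8` is hypothetical and not part of the report. It assumes an `iconv` that exits non-zero on an illegal input sequence, as glibc's does. The octal escape `\256` is the same byte as `\xae` and is used only because POSIX `printf` guarantees octal escapes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if stdin is a valid UTF-8 byte stream.
# Assumption: iconv exits non-zero when it meets an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The raw string from the report: a lone 0xae byte (octal 256).
if printf 'Intel\256 Core 2 Duo T7300\n' | is_valid_utf8; then
    echo "raw input: valid UTF-8"
else
    echo "raw input: not valid UTF-8"
fi

# The same string after conversion from Latin-1, as in comment #2.
if printf 'Intel\256 Core 2 Duo T7300\n' | iconv -f ISO-8859-1 -t UTF-8 | is_valid_utf8; then
    echo "converted input: valid UTF-8"
else
    echo "converted input: not valid UTF-8"
fi
```

A check like this does not depend on which locales are installed, so it can be run before deciding whether to convert the data or to fall back to `LC_ALL=C`.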
https://bugzilla.novell.com/show_bug.cgi?id=397244

User mkoenig@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=397244#c3

Matthias Koenig <mkoenig@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #3 from Matthias Koenig <mkoenig@novell.com> 2008-06-05 02:42:51 MDT ---
Of course, a lone 0xae byte is invalid UTF-8 input. The correct UTF-8 sequence for the Latin-1 character 0xae (U+00AE) is 0xc2 0xae; with that, it works as expected:

# echo -e 'Intel\xc2\xae Core 2 Duo T7300' | LC_ALL=de_DE.utf-8 sed 's/.* //'
T7300

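The byte sequence named in comment #3 can be checked with standard tools. This is only a sketch: the helper name `latin1_to_utf8_hex` is made up for illustration, and it assumes `iconv`, `od` and `tr` are available.

```shell
#!/bin/sh
# Hypothetical helper: read Latin-1 bytes on stdin and print the hex of
# their UTF-8 encoding, with od's spacing and newlines removed.
latin1_to_utf8_hex() {
    iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n'
}

# Octal 256 is the byte 0xae from the report.
printf '\256' | latin1_to_utf8_hex
echo
# prints: c2ae
```

U+00AE lies above U+007F, so UTF-8 encodes it in two bytes: a lead byte 0xc2 followed by the continuation byte 0xae. A 0xae byte with no lead byte in front of it is therefore not a valid sequence.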