https://bugzilla.novell.com/show_bug.cgi?id=355757

           Summary: recode generates nonstandard UCS-2 encoding.
           Product: openSUSE 10.3
           Version: Final
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: jw@novell.com
         QAContact: qa@suse.de
        Depends on: 355755
          Found By: ---


echo Hello World > test.txt
recode ..UCS-2 test.txt > test.ucs2
xxd test.txt
0000000: 0048 0065 006c 006c 006f 0020 0057 006f  .H.e.l.l.o. .W.o
0000010: 0072 006c 0064 000a                      .r.l.d..

This hexdump shows big-endian UCS-2, but the BOM is missing.
The BOM is defined in RFC 2781 and explained as follows in
http://en.wikipedia.org/wiki/UTF-16:

  The UTF-16 (and UCS-2) encoding scheme allows either endian
  representation to be used, but mandates that the byte order should be
  explicitly indicated by prepending a Byte Order Mark before the first
  serialized character. This BOM is the encoded version of the Zero-Width
  No-Break Space (ZWNBSP) character, codepoint U+FEFF, chosen because it
  should never legitimately appear at the beginning of any character data.
  This results in the byte sequence FE FF (in hexadecimal) for big-endian
  architectures, or FF FE for little-endian. The BOM at the beginning of a
  UTF-16 or UCS-2 encoded data is considered to be a signature separate
  from the text itself; it is for the benefit of the decoder.
  [...]
  The BOM is not optional in the UCS-2 scheme.
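
For illustration only, here is a minimal Python sketch of the check described above: it tests whether a byte string starts with either of the two BOM sequences (FE FF or FF FE) and prepends the big-endian one if it is absent. The helper names has_utf16_bom and with_be_bom are hypothetical, and the sample bytes are regenerated with Python's "utf-16-be" codec to match the hexdump rather than being taken from recode itself.

```python
import codecs


def has_utf16_bom(data: bytes) -> bool:
    """Return True if data starts with FE FF (big-endian) or FF FE (little-endian)."""
    return data[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


def with_be_bom(data: bytes) -> bytes:
    """Prepend the big-endian BOM (FE FF) unless a BOM is already present.

    Assumes data is big-endian UCS-2/UTF-16, as in the hexdump above.
    """
    return data if has_utf16_bom(data) else codecs.BOM_UTF16_BE + data


# Same 24 bytes as the hexdump: big-endian code units with no leading BOM.
sample = "Hello World\n".encode("utf-16-be")

print(has_utf16_bom(sample))           # BOM check on the raw bytes
print(with_be_bom(sample)[:4].hex())   # first four bytes after prepending the BOM
```

Running the same check on the real recode output (for example, the contents of the file shown by xxd) would indicate whether the BOM is present there.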