Re: [opensuse] uncompressing zip files and accented characters

13 Jul 2010

      Philipp Thomas wrote:
...
* Dave Howorth (dhoworth@mrc-lmb.cam.ac.uk) [20100713 10:50]:
...
They already are, I think.
No they aren't.
Another cryptic posting! What do you mean by this?

My understanding is that unzip does not alter the binary octets in the
filenames, so in so far as a filename contains characters at all they
are preserved in whatever character set and encoding was used to create
them.

Is that wrong? What has unzip transformed the filenames into if it
hasn't preserved them?
...
The zip format neither specifies an encoding to use nor does
it offer a field that identifies the encoding. Thus unzip in its original
form can't handle different encodings, and you can't tell it which
encoding to use either. And as stated, upstream has rejected all patches up
till now, stating that UTF-8 should be used. Right, as if any Win* user would be able
to do so.
To remedy the situation a bit I've accepted a patch to openSUSE's unzip that
will decode Russian and Czech encodings. As librcc is extensible, maybe it
could be extended to also handle Hungarian file names sometime in the
future.
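
For illustration only, and not the librcc-based patch mentioned above: a
minimal Python sketch of what "decoding" such names amounts to. It assumes
you already know (or guess) the encoding the archive was created with.
Python's zipfile module decodes names that are not flagged as UTF-8 as
cp437, so encoding the name back to cp437 should recover the original
octets, which can then be decoded with the guessed encoding. The function
name decoded_names and its guessed_encoding parameter are hypothetical.

```python
import io
import zipfile

# General purpose bit 11: the entry name is stored as UTF-8.
UTF8_FLAG = 0x800


def decoded_names(zip_bytes, guessed_encoding):
    """Return entry names, re-decoding unflagged names with guessed_encoding.

    guessed_encoding is an assumption supplied by the caller (for example
    "cp1251" or "cp852"); the archive itself does not tell us.
    """
    names = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.flag_bits & UTF8_FLAG:
                # Already decoded correctly by zipfile.
                names.append(info.filename)
            else:
                # zipfile decoded the raw octets as cp437; undo that step
                # to get the octets back, then apply our own guess.
                raw = info.filename.encode("cp437")
                names.append(raw.decode(guessed_encoding))
    return names
```

A wrong guess will either raise UnicodeDecodeError or silently produce
different garbage, so this is only as good as the guess.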
Philipp