Philipp Thomas wrote:
* Dave Howorth (dhoworth@mrc-lmb.cam.ac.uk) [20100713 12:50]:
Exactly, as far as I know filenames are stored in the filesystem as octets.
Correct so far.
There's no notion of characters or encodings.
That's not correct.
OK, I hold my hand up. Please just give me a reference to the place where the encodings are defined so I can learn.
Neither does the kernel care what the octet sequence represents.
Wrong! File system drivers like ntfs or vfat explicitly use specific encodings.
Hmm, so Microsoft break software layering abstractions. Why does that not surprise me? Isn't ntfs-3g a user-level driver though?
Talk of encoding in the filenames themselves is muddled thinking.
I tend to disagree.
These articles are quite old, so I thought at first my beliefs were just out of date:

http://lwn.net/Articles/71472/
http://www.win.tue.nl/~aeb/linux/lk/lk-6.html

But this is 2010-05-23:

http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html

"Yet because you can’t know the character encoding of a given filename, in theory you can’t display filenames at all today. Why? Because then you don’t know how to translate the bytes of a filename into displayable characters (!)."
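To make the quoted point concrete, here is a minimal Python sketch (not taken from any of the linked articles). The byte string raw_name and the helper decode_candidates are made up for illustration; the list of encodings to try is likewise just an assumption. It shows one octet sequence turning into different characters, or failing to decode at all, depending on which encoding the reader assumes.

```python
# Hypothetical file name as raw octets: "caf" + 0xE9 + ".txt".
# Nothing in the bytes themselves says which encoding was intended.
raw_name = b"caf\xe9.txt"


def decode_candidates(name, encodings=("utf-8", "iso-8859-1", "cp437")):
    """Try each assumed encoding; map it to the decoded text, or None on failure."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = name.decode(enc)
        except UnicodeDecodeError:
            # The octets are not valid in this encoding.
            results[enc] = None
    return results


candidates = decode_candidates(raw_name)
for enc, text in candidates.items():
    print(enc, "->", repr(text))
```

The same octets are invalid as UTF-8, read as "café.txt" under ISO-8859-1, and read as something else again under cp437, which is why a program has to get the encoding from somewhere outside the name itself.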
As has been suggested, convmv is a way to do that.
Convmv may be able to convert the file name on disk, but it won't change unzip's display.
Indeed. Setting the appropriate environment, specifically locale, in which to run unzip is the way to do that.

Cheers, Dave

PS I'm not trying to argue that the filename architecture is the best way to design it, just what it is.
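As a rough sketch of what "setting the locale in which to run unzip" could look like from a script: the helper run_with_locale below is hypothetical, and the archive name and locale string in the usage comment are placeholders. Whether a given unzip build actually honours the locale when displaying names is an assumption that would need checking on the system in question.

```python
import os
import subprocess


def run_with_locale(cmd, locale_name):
    """Run cmd with LC_ALL overridden, leaving the parent environment untouched."""
    env = os.environ.copy()
    env["LC_ALL"] = locale_name  # LC_ALL takes precedence over LANG and other LC_* variables
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Hypothetical usage (archive name and locale are placeholders):
# result = run_with_locale(["unzip", "-l", "archive.zip"], "de_DE.ISO-8859-1")
# print(result.stdout)
```

The override applies only to the child process, so the rest of the session keeps its usual locale.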