Hi,

I'm trying to back up a load of files with French and German characters in their names to a udf filesystem on a CD-RW. Every time I try to add a file with an accented character, though, I just get "udf: bad UTF-8 character" recorded in the system log, and no file is created.

Short of using ext2 instead of udf for the CD, is there any way of fixing this? I've tried specifying --u16 or --u8 when running mkudffs, but it makes no difference. I'm currently running stock kernel 2.4.20.

Stephen

"Stephen" == Stephen Mollett writes:

Stephen> Hi, I'm trying to back up a load of files with French and
Stephen> German characters in their names to a udf filesystem on a
Stephen> CD-RW. Every time I try to add a file with an accented
Stephen> character, though, I just get "udf: bad UTF-8 character"
Stephen> recorded in the system log, and no file is created.

Are the source filenames in utf8 or in something like latin1?

If the latter, I'll bet they are not being translated; a latin1 octet-stream will tend to be invalid utf8....

-JimC
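One way to answer the utf8-or-latin1 question is to ask iconv whether a name survives a utf8-to-utf8 round trip. The following is only a sketch: the function name is_utf8 is made up for illustration, and it assumes an iconv(1) that exits non-zero on invalid input (as glibc's does).

```shell
#!/bin/sh
# is_utf8 NAME: succeed if NAME is a valid utf8 byte sequence.
# (Hypothetical helper; relies on iconv failing on invalid input.)
is_utf8() {
    printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}
```

Calling it on the base name of each file to be copied (for example, is_utf8 "${f##*/}" inside a for loop) should pick out the names the udf driver is likely to reject. A latin1 e-acute is the single octet 0xE9, which is not valid utf8 on its own.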
On Sunday 15 Jun 2003 21:39, James H. Cloos Jr. wrote:
> "Stephen" == Stephen Mollett writes:
>
> Stephen> ... Every time I try to add a file with an accented
> Stephen> character, though, I just get "udf: bad UTF-8 character"
>
> Are the source filenames in utf8 or in something like latin1?
I'm not sure. I'm just copying them directly from my hard disk, which is reiserfs. I've tried using both plain cp from the command line and using drag-and-drop with konqueror (in case one translates the filenames in some way) but both fail. Attempting to create a new file or directory on the CD with an accented character in its name fails similarly.

The system's configured with ISO 8859-1 as its default character map, if that information helps in any way.

Stephen
Stephen> I'm not sure. I'm just copying them directly from my hard
Stephen> disk, which is reiserfs.

Stephen> The system's configured with ISO 8859-1 as its default
Stephen> character map, if that information helps in any way.

Yes, that does suggest that the filenames are in latin1 on the reiser filesystem. The error you get suggests the udf filesystem requires that non-ascii filenames be in utf8. As such, you need to somehow convert the names from latin1 to utf8 when copying the files over.

Without changing your locale, something like this *might* work:

    #!/bin/bash
    # Usage: SCRIPT DESTDIR FILE...
    dest=$1; shift
    for ij in "$@"; do
        # Re-encode the file's base name from latin1 to utf8.
        kl=$(printf '%s' "${ij##*/}" | iconv -f latin1 -t utf8)
        cp "$ij" "${dest}/${kl}"
    done

But it is untested, off the top of my head, and does not duplicate the syntax of cp(1). (The target dir is the first arg of this pseudo-code script, unlike cp(1) where it is the last arg.)

The downside of something like the above is that the filenames on the udf fs won't look right in a non-utf8 locale.

On most posix filesystems -- including ffs, ext2/3 and reiser -- the filenames are just an octet-stream with only NUL and / disallowed. UDF, OTOH, stipulates utf16 or utf8, IIRC.

So perhaps the better answer to your problem is to tar up the files and copy the tar archives to the udf fs, rather than the individual files.

-JimC
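The tar suggestion could look roughly like the sketch below. The function name archive_dir and the paths in the usage note are assumptions for illustration. The idea is that only the archive's own ASCII name has to be acceptable to the udf filesystem; as far as I know, GNU tar's default format keeps member names as the raw bytes it was given.

```shell
#!/bin/sh
# archive_dir SRCDIR ARCHIVE: pack the contents of SRCDIR into ARCHIVE.
# (Hypothetical helper.) -C makes member names relative to SRCDIR.
archive_dir() {
    tar -cf "$2" -C "$1" .
}
```

For example, archive_dir "$HOME/docs" /mnt/cdrw/docs.tar would write a single archive to a CD-RW mounted at /mnt/cdrw (both paths hypothetical). The latin1 names come back unchanged when the archive is later unpacked onto reiserfs or ext2.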
You can just mount with -o iocharset=<whatever> (default, iso8859-1, etc.).

Linux UDF defaults to utf8 if you don't specify the character set.

Ben
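The iocharset option mentioned in this thread might be used as in the sketch below. The function name mount_udf_latin1, the MOUNT dry-run hook, and the device and mount point in the usage note are all assumptions for illustration; only -t udf and -o iocharset= come from the discussion.

```shell
#!/bin/sh
# mount_udf_latin1 DEVICE MOUNTPOINT [CHARSET]
# Mount a udf volume so that filenames are translated to and from
# CHARSET (iso8859-1 if omitted). Normally needs root.
# Hypothetical hook: set MOUNT='echo mount' to print the command
# instead of running it.
mount_udf_latin1() {
    ${MOUNT:-mount} -t udf -o "iocharset=${3:-iso8859-1}" "$1" "$2"
}
```

Something like mount_udf_latin1 /dev/scd0 /mnt/cdrw (device and mount point hypothetical) should then let plain cp work with latin1 names, with no renaming step.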
participants (3)
- Ben Fennema
- James H. Cloos Jr.
- Stephen Mollett