Various problems with packet writing
For a long time I have been looking for a technology that would let me use high-capacity rewritable and removable media. For several years I have used IOmega ZIP drives, which I find very fast and dependable, but their capacity is a bit too small and the price of media too high. Recently I decided to give CD packet writing a try.
When I tried to use packet-writing software like DirectCD or InCD under Windows, I came across two problems:
1. Sometimes the medium got corrupted to such an extent that I was unable to read it at all (i.e. complete loss of data).
2. The machine was very unresponsive during the process of writing.
For these reasons I decided to give it a try under Unix/Linux. I compiled the CD packet-writing patch into the SuSE 2.4.19 kernel and the generic 2.4.20 kernel. I also bought a new IDE CD writer (Plextor 48/24/48) to make sure that the CD-RW drive would support all necessary features. Everything basically went well; however, I ran into several minor problems I would like to ask about:
1. I tried to copy the complete directory /usr/X11R6 onto the CD (around 100 MByte in 8000 files) using "cp -r". I followed the progress using "df -k". At the beginning the progress was really fast (1 MByte per second?), but at a certain point it slowed down tremendously. I suppose that some internal buffers got full and the copying process had to wait until the packet driver flushed the data onto the CD. However, at this point the machine became very, very unresponsive. The "top" command showed CPU utilization below 5%. I have no idea what kind of activity the machine was engaged in (besides writing data to the CD, which should not put such a load on it). Perhaps a scheduling problem?
2. After the copy process had finished, I tried to unmount the media. The unmount command took some 30 minutes to finish. The command "df -k" kept showing the CD as mounted throughout the whole process of unmounting, but both the used space and the free space were reported incorrectly. However, the whole process finished without errors and I was able to mount the CD again and access everything I had written to it.
3. I was unable to rename a directory. The UDF driver wrote an error message to the syslog mentioning some unsupported feature. According to a message posted to this mailing list the problem should have been fixed, but I encountered it despite using the most recent kernel and patch.
4. I had trouble with filenames using special characters in my national alphabet (Slovak). Without any special mount options I was unable to use certain characters in filenames. When I used kernel version 2.4.19, I was able to use the mount option "iocharset=ISO-8859-2", which partially solved the problem. However, with kernel version 2.4.20 this option was refused by the mount command as illegal. According to the man pages, the option is legal but ignored. Neither was true in my case.
Another problem appeared when I tried to read the CD under MS Windows 2000 (built-in UDF reader). Filenames with special characters were mangled and I was unable to open the files. I understand that for the Slovak language Windows uses code page 852 while Unix uses ISO-8859-2, but is UDF not supposed to use UTF-16 encoding (with any filename written to a UDF disk converted appropriately)?
Next I plan to try out Mt Rainier. I expect that the behavior in cases 1 and 2 may be completely different, but 3 and 4 concern the UDF filesystem itself more than packet writing.
Any feedback on these issues will be very welcome.
Robert Szelepcsenyi
On Sun, Dec 29, 2002 at 08:04:38PM +0100, Robert Szelepcsenyi wrote:
1. I tried to copy the complete directory /usr/X11R6 onto the CD (around 100MByte in 8000 files) using "cp -r". I followed the progress using "df -k". At the beginning the progress was really fast (1Mbyte per second?), but at a certain point it slowed down tremendously. I suppose that some internal buffers got full and the copying process had to wait until the packet driver flushed the data onto the CD. However, at this point the machine became very, very unresponsive. The "top" command showed CPU utilization below 5%. I have no idea what kind of activity the machine was engaged in (besides writing data to the CD, which should not put such a load on it). Perhaps a scheduling problem?
2.4 behaves very poorly when writing to a very slow device (such as a CD-RW).
2. After the copy process had finished, I tried to unmount the media. The unmount command took some 30 minutes to finish. The command "df -k" kept showing the CD mounted throughout the whole process of unmounting, but both the used space and free space were reported incorrectly. However, the whole process finished without errors and I was able to mount the CD again and access everything I had written to it.
The copy can return before all the data is written to disc. Unmounting just forces a sync, so all the rest of the data has to be written out before the disc unmounts. BTW, if you run a diff with the data on the CD-RW and the original data, are there any differences?
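The point that a copy can return before the data has reached the disc can be illustrated with a small sketch. It assumes Python, and `copy_and_flush` is a hypothetical helper name, not part of any packet-writing tool. The explicit `os.fsync` call forces the write-out to happen before the function returns, which is roughly the work that would otherwise be left for unmount time.

```python
import os
import shutil
import tempfile


def copy_and_flush(src, dst):
    """Copy src to dst and block until the kernel has written dst out.

    Hypothetical helper for illustration only. Without the fsync call the
    function could return while the data is still buffered in memory.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's own buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the file to the device
    return os.path.getsize(dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "source.bin")
        target = os.path.join(tmp, "target.bin")
        with open(source, "wb") as handle:
            handle.write(b"x" * 4096)
        print(copy_and_flush(source, target), "bytes written and flushed")
```

On a slow device such a call may simply take a long time; it moves the waiting from unmount to the copy itself rather than removing it.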
3. I was unable to rename a directory. The UDF driver wrote an error message to the syslog mentioning some unsupported feature. According to a message posted to this mailing list the problem should have been fixed, but I encountered it despite having used the most recent kernel and patch.
Were you using UDF from CVS? If not, you weren't using the latest version.
4. I had trouble with filenames using special characters in my national alphabet (Slovak). Without any special mount options I was unable to use certain characters in filenames. When I used kernel version 2.4.19, I was able to use the mount option "iocharset=ISO-8859-2", which partially solved the problem. However, with kernel version 2.4.20 this option was refused by the mount command as illegal. According to man pages, the option was legal, but ignored. In neither case this was true.
Make sure CONFIG_NLS was defined. If not, iocharset doesn't exist as an option.
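One way to check whether NLS support was enabled is to look for the option in the kernel configuration. The sketch below assumes Python and parses configuration text in the usual `.config` style; `config_option_state` is a hypothetical helper, and the sample text is invented for illustration rather than taken from any kernel discussed in this thread.

```python
def config_option_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...) or None.

    None means the option is absent or marked "is not set".
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == "# " + option + " is not set":
            return None
    return None


# Invented sample configuration, for illustration only.
SAMPLE_CONFIG = """\
CONFIG_NLS=y
CONFIG_NLS_ISO8859_2=m
# CONFIG_NLS_UTF8 is not set
"""

if __name__ == "__main__":
    for name in ("CONFIG_NLS", "CONFIG_NLS_ISO8859_2", "CONFIG_NLS_UTF8"):
        print(name, "->", config_option_state(SAMPLE_CONFIG, name))
```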
Another problem was when I tried to read the CD under MS Windows 2000 (built in UDF reader). Filenames with special characters were mangled and I was unable to open the files. I understand that for the Slovak language Windows uses a 852 codepage while Unix uses ISO-8859-2, but is not UDF supposed to use UTF-16 encoding (and any filename written to a UDF disk appropriately converted)?
UDF uses Unicode (8 or 16 bits, depending). The only testing with foreign languages I've done is using UTF-8 and Japanese, and it seemed to work right. No clue about iocharset. I think I just copied the code from VFAT =)
Ben
On December 29, 2002 10:39 PM, Ben Fennema wrote:
2.4 behaves very poorly when writing to a very slow device (such as a CD-RW).
I recompiled the kernel with the preemptible patch Damjan Bole mentioned on this list (thanks). The performance has improved a lot. Although it is still rather slow while writing to a CD-RW, the machine is at least not completely frozen. I hope for further improvement in this respect in the future, but for now I consider this problem solved to a sufficient extent.
The copy can return before all the data is written to disc. Unmounting just forces a sync, so all the rest of the data has to be written out before the disc unmounts.
BTW, if you run a diff with the data on the CD-RW and the original data, are there any differences?
I tried to put three huge files (tar archives) on the disc. After unmounting and remounting I tried to diff the files. It went through without any errors. I have never experienced any data inconsistency or loss unless I made a mistake mounting the disc or the disc developed bad blocks.
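For a tree with many files, the comparison suggested above can be automated. The following is a minimal sketch, assuming Python; `differing_files` is a hypothetical helper that compares file contents byte by byte and reports anything that differs or is missing in the copy.

```python
import filecmp
import os


def differing_files(original_root, copy_root):
    """Return relative paths that differ from, or are missing in, the copy."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(original_root):
        rel_dir = os.path.relpath(dirpath, original_root)
        for name in filenames:
            rel_path = os.path.normpath(os.path.join(rel_dir, name))
            original = os.path.join(original_root, rel_path)
            copy = os.path.join(copy_root, rel_path)
            if not os.path.isfile(copy):
                problems.append(rel_path)   # file never made it to the copy
            elif not filecmp.cmp(original, copy, shallow=False):
                problems.append(rel_path)   # contents differ
    return sorted(problems)
```

As in the test described above, the disc should be unmounted and remounted before comparing, so that the data is read back from the medium rather than from memory.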
Were you using UDF from CVS? If not, you weren't using the latest version.
You are right. My mistake.
Make sure CONFIG_NLS was defined. If not, iocharset doesn't exist as an option.
Again, you were right. When I recompiled the 2.4.20 kernel with NLS support, I could use the option again.
Another problem was when I tried to read the CD under MS Windows 2000 (built in UDF reader). Filenames with special characters were mangled and I was unable to open the files. I understand that for the Slovak language Windows uses a 852 codepage while Unix uses ISO-8859-2, but is not UDF supposed to use UTF-16 encoding (and any filename written to a UDF disk appropriately converted)?
UDF uses Unicode (8 or 16 bits, depending). The only testing with foreign languages I've done is using UTF-8 and Japanese, and it seemed to work right.
This is the only problem I have not been able to make any progress on at all. I resorted to experimenting on a hard disk partition, which I formatted as UDF, to make sure the problem had nothing to do with packet writing (and to spare my burner and CD-RW media). I did the following test:
I set the Samba server to code page 852 (client side) and the ISO-8859-2 charset (server side). I created files with various national characters in their names in my public_html directory. I listed the contents of the directory on my Windows machine, with the web browser set to ISO-8859-2 encoding. All filenames were shown correctly. In this way I verified that the Samba server had done all the translations correctly and that the files had correct names in the ISO-8859-2 encoding.
Next I mounted a UDF volume with iocharset set to ISO-8859-2 and copied the files there. When I listed the contents of the directory, I got back rather different filenames. It seemed to me that accented characters that also exist in West European languages were correct, but most of the other characters were mangled. However, without the iocharset option I was not able to create some of the files at all.
Another problem was that I was not able to use UTF-8 and iocharset at the same time.
I would like to know what takes care of the ISO-8859-x <-> UTF-16 transcoding. It seems to me that the system lacks something for ISO-8859-2 and falls back to ISO-8859-1.
Thanks,
Robert Szelepcsenyi
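The suspicion that ISO-8859-2 names were being handled as ISO-8859-1 can be checked against the symptom described: characters shared with Western European languages survive, the others do not. The sketch below assumes Python and only illustrates that hypothesis; it says nothing about what the UDF driver actually did. `reinterpret` is a hypothetical helper.

```python
def reinterpret(name, written_as="iso8859_2", read_as="iso8859_1"):
    """Encode a name in one charset and decode the same bytes in another."""
    return name.encode(written_as).decode(read_as)


if __name__ == "__main__":
    # Sample characters chosen for illustration: the first two sit at the
    # same byte value in both charsets, the others do not.
    for ch in ("\u00e9", "\u00e1", "\u0161", "\u010d", "\u017e"):
        print(repr(ch), "->", repr(reinterpret(ch)))
```

Characters such as "é" and "á" come back unchanged, while "š", "č" and "ž" turn into unrelated symbols, which matches the pattern of partly correct, partly mangled names.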
On Mon, 6 Jan 2003, Robert Szelepcsenyi wrote:
On December 29, 2002 10:39 PM, Ben Fennema wrote:
BTW, if you run a diff with the data on the CD-RW and the original data, are there any differences?
I tried to put three huge files (tar archives) on the disc. After unmounting and remounting I tried to diff the files. It went through without any errors. I have never experienced any data inconsistency or loss unless I made a mistake mounting the disc or the disc developed bad blocks.
A warning though: quite a few people have seen data corruption, but I think it is more likely to happen with many small files than with a few large ones.
--
Peter Osterlund - petero2@telia.com
http://w1.894.telia.com/~u89404340
Robert Szelepcsenyi wrote:
The only testing with foreign languages I've done is using UTF-8 and Japanese, and it seemed to work right.
Oops, major bug when dealing with a file name that starts out needing only 8 bits per character and later on needs 16 bits. It starts over, but doesn't reset the location where it's writing the characters. I haven't tested the fix (it's in CVS), but it should be simple enough that even I couldn't screw it up (famous last words). Give it a try and let me know if it behaves any differently for you.
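The restart described above can be sketched as follows. This is not the kernel code; it is a minimal illustration in Python, assuming the usual layout of UDF compressed Unicode (a leading ID byte of 8 or 16, followed by one byte or two big-endian bytes per character). `encode_compressed_unicode` is a hypothetical name. The important detail is that the output buffer is created afresh on every pass, so the write position is reset when the encoder starts over in 16-bit mode.

```python
def encode_compressed_unicode(name):
    """Encode a name with 8 bits per character if possible, else 16."""
    width = 8
    while True:
        out = bytearray([width])  # fresh buffer: the write position starts over
        for ch in name:
            code = ord(ch)
            if code > 0xFFFF:
                raise ValueError("character outside the 16-bit range")
            if width == 8 and code > 0xFF:
                width = 16
                break             # start over with 16 bits per character
            if width == 16:
                out.append(code >> 8)   # high byte first
            out.append(code & 0xFF)
        else:
            return bytes(out)     # loop finished without a restart
```

If the buffer were kept across the restart, the 16-bit output would be appended after the partial 8-bit output, which would be consistent with the garbage seen in directory listings.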
Another problem was that I was not able to use UTF-8 and iocharset at the same time.
They're mutually exclusive. UTF-8 maps directly into 8- or 16-bit Unicode.
I would like to know, what takes care for the ISO-8859-x <-> UTF16 transcoding. It seems to me that the system lacks something for ISO-8859-2 and falls back to ISO-8859-1.
The NLS translation tables handle the ISO-8859-x <-> Unicode transcoding. (Which should work exactly the same as VFAT and NTFS do, assuming no more bugs of mine =])
Ben
Thanks for your effort. I updated my kernel source from CVS. The problem with renaming a directory has disappeared. Using the ISO-8859-2 charset still does not work.
Before I applied the patch, I sometimes even got a lot of nonsense when listing the contents of a directory (it looked like chunks of binary data dumped right onto the terminal). Now the situation seems to have improved a little, but the problem with the ISO-8859-2 charset persists. Basically:
When I mount without the iocharset=iso8859-2 option, I can't use certain filenames at all. I get the error "Filename too long" and the syslog says: "Bad UTF-8 character". I consider this normal.
When I mount with the iocharset=iso8859-2 option, I can create anything I want, but I don't get it back correctly. Certain characters are even transformed into control symbols like "^O" or "^M".
Another problem: when I mount without the iocharset=iso8859-2 option, create some files with national characters, unmount, and remount with the iocharset=iso8859-2 option, I get filenames of completely different length. I would expect filenames to stay the same length even if the charset or code page is incorrect.
If you need it, I can send you a tar file with these "ill" filenames. If you untar it onto a UDF volume, you will get something quite different from what you get when you untar it onto an ext2 or reiserfs volume.
Robert Szelepcsenyi
-----Original Message-----
From: Ben Fennema [mailto:bfennema@attbi.com]
Sent: Tuesday, January 07, 2003 6:56 AM
To: Robert Szelepcsenyi
Cc: packet-writing@suse.com
Subject: Re: Various problems with packet writing
Robert Szelepcsenyi wrote:
The only testing with foreign languages I've done is using UTF-8 and Japanese, and it seemed to work right.
Oops, major bug when dealing with a file name that starts out needing only 8 bits per character and later on needs 16 bits. It starts over, but doesn't reset the location where it's writing the characters. I haven't tested the fix (it's in CVS), but it should be simple enough that even I couldn't screw it up (famous last words). Give it a try and let me know if it behaves any differently for you.
Another problem was that I was not able to use UTF-8 and iocharset at the same time.
They're mutually exclusive. UTF-8 maps directly into 8- or 16-bit Unicode.
I would like to know, what takes care for the ISO-8859-x <-> UTF16 transcoding. It seems to me that the system lacks something for ISO-8859-2 and falls back to ISO-8859-1.
The NLS translation tables handle the ISO-8859-x <-> Unicode transcoding. (Which should work exactly the same as VFAT and NTFS do, assuming no more bugs of mine =])
Ben
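The "Bad UTF-8 character" message and the changing filename lengths reported above are at least consistent with how the two encodings differ at the byte level. The sketch below assumes Python and is only one plausible reading of the symptoms, not a description of the driver; `decode_as_utf8` is a hypothetical helper and the sample word is chosen for illustration.

```python
def decode_as_utf8(raw_name):
    """Return the decoded name, or None if the bytes are not valid UTF-8."""
    try:
        return raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    name = "\u0161kola"  # sample Slovak word, used only for illustration
    latin2_bytes = name.encode("iso8859_2")
    utf8_bytes = name.encode("utf-8")
    # ISO-8859-2 bytes are rejected when something expects UTF-8.
    print(decode_as_utf8(latin2_bytes))
    # The same name has a different byte length in the two encodings.
    print(len(latin2_bytes), len(utf8_bytes))
```

A name typed on an ISO-8859-2 terminal is therefore not valid input for a mount that expects UTF-8, and the same name occupies a different number of bytes depending on which encoding the mount presents.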
Ok, this should be fixed in CVS.. *crosses fingers*
Ben
Robert Szelepcsenyi wrote:
This is the only problem I have not been able to make any progress on at all. I resorted to experimenting on a hard disk partition, which I formatted as UDF, to make sure the problem had nothing to do with packet writing (and to spare my burner and CD-RW media). I did the following test:
I set the Samba server to code page 852 (client side) and the ISO-8859-2 charset (server side). I created files with various national characters in their names in my public_html directory. I listed the contents of the directory on my Windows machine, with the web browser set to ISO-8859-2 encoding. All filenames were shown correctly. In this way I verified that the Samba server had done all the translations correctly and that the files had correct names in the ISO-8859-2 encoding.
Next I mounted a UDF volume with iocharset set to ISO-8859-2 and copied the files there. When I listed the contents of the directory, I got back rather different filenames. It seemed to me, that accented characters (that also exist in West European countries) were correct, but most of the other characters were mangled. However, without using the iocharset option I was not able to create some files at all.
Another problem was that I was not able to use UTF-8 and iocharset at the same time.
I would like to know, what takes care for the ISO-8859-x <-> UTF16 transcoding. It seems to me that the system lacks something for ISO-8859-2 and falls back to ISO-8859-1.
Thanks,
Robert Szelepcsenyi
I have tried all possible characters. It seems to be working fine now. Thank you for your prompt help, of a kind a Microsoft customer could only dream of. :-)
At first I could not read the disc on a Windows machine. It was probably formatted as UDF version 2.01, which Windows seems not to support. Even installing the Roxio UDF reader did not solve the problem. However, after I had reformatted the disc using cdrwtool, which probably defaults to 1.5, I was able to read the disc without any problems. Even all the diacritics came out perfectly correct.
Now all the problems I mentioned in my first posting to this list have been more or less solved. I would also like to try Mt Rainier. I have bought a CD burner that supports this functionality. However, the most recent patch I have been able to find is:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.4/2.4.19-pre4/cd-mrw-2.gz
which failed to apply to both my 2.4.20 kernel and the SuSE 2.4.19 kernel. Is there a patch for one of these kernels?
Robert Szelepcsenyi
-----Original Message-----
From: Ben Fennema [mailto:bfennema@attbi.com]
Sent: Thursday, January 09, 2003 6:05 AM
To: Robert Szelepcsenyi
Cc: packet-writing@suse.com
Subject: Re: Various problems with packet writing
Ok, this should be fixed in CVS.. *crosses fingers*
Ben
Robert Szelepcsenyi wrote:
This is the only problem I have not been able to make any progress on at all. I resorted to experimenting on a hard disk partition, which I formatted as UDF, to make sure the problem had nothing to do with packet writing (and to spare my burner and CD-RW media). I did the following test:
I set the Samba server to code page 852 (client side) and the ISO-8859-2 charset (server side). I created files with various national characters in their names in my public_html directory. I listed the contents of the directory on my Windows machine, with the web browser set to ISO-8859-2 encoding. All filenames were shown correctly. In this way I verified that the Samba server had done all the translations correctly and that the files had correct names in the ISO-8859-2 encoding.
Next I mounted a UDF volume with iocharset set to ISO-8859-2 and copied the files there. When I listed the contents of the directory, I got back rather different filenames. It seemed to me, that accented characters (that also exist in West European countries) were correct, but most of the other characters were mangled. However, without using the iocharset option I was not able to create some files at all.
Another problem was that I was not able to use UTF-8 and iocharset at the same time.
I would like to know, what takes care for the ISO-8859-x <-> UTF16 transcoding. It seems to me that the system lacks something for ISO-8859-2 and falls back to ISO-8859-1.
Thanks,
Robert Szelepcsenyi
I'd also like to try Mt Rainier. AFAIK Jens will merge it into 2.5.x when things stabilize, but I'm not sure. Does anyone have more info about that?
Damjan
On Thu, 9 Jan 2003 23:11:03 +0100, "Robert Szelepcsenyi" wrote:
I have tried all possible characters. It seems to be working fine now. Thank you for your prompt help a Microsoft customer could only dream of. :-)
At first I could not read the disc on a Windows machine. It was probably formatted as UDF version 2.01, which Windows seems not to support. Even installing the Roxio UDF reader did not solve the problem. However, after I had reformatted the disc using cdrwtool, which probably defaults to 1.5, I was able to read the disc without any problems. Even all the diacritics came out perfectly correct.
Now all the problems I mentioned in my first posting to this list have been more or less solved. I would also like to try Mt Rainier. I have bought a CD burner that supports this functionality. However, the most recent patch I have been able to find is:
http://www.kernel.org/pub/linux/kernel/people/axboe/patches/v2.4/2.4.19-pre4/cd-mrw-2.gz
which failed to apply to both my 2.4.20 kernel and the SuSE 2.4.19 kernel. Is there a patch for one of these kernels?
Robert Szelepcsenyi
participants (4)
- Ben Fennema
- Damjan Bole
- Peter Osterlund
- Robert Szelepcsenyi