On 2022-10-08 01:01, Lew Wolfgang wrote:
Hi Folks,
I'm anticipating a requirement where I'll have lots of data on a large RAID-6 partition. The data consists of many 7-GB files that were uploaded over several Ethernet channels.
The files then need to be copied to individual "JBOD" disks plugged into the same chassis as the RAID-6 array. Alas, there is more data on the RAID-6 than will fit on one JBOD disk, so I've got to come up with a process that will copy files to a disk, stop when the disk is full, prompt the operator to unmount and remove the JBOD, plug in another one, and mount it to continue the process. Repeat until all the data are copied to as many JBODs as it takes.
I think I can convince tar to write to multiple volumes and prompt for a media change, but I don't want to treat the JBODs as character devices, i.e. like a tape drive. I'd like to be able to take any one disk from a series, mount it on another host, and read its files without having to load the other disks in that series.
I'm thinking a fancy shell script is called for here. I bet rsync could be leveraged to allow restarting an aborted copy.
Any ideas? It seems like a common requirement; maybe someone's done it already?
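A minimal sketch of such a copy loop might look like the following. The directory names, the use of GNU df/stat options, and the one-file-at-a-time flow are all assumptions for illustration, not anything from the thread:

```shell
#!/bin/sh
# Sketch: fill one JBOD disk at a time with whole files, prompting the
# operator to swap disks when the next file will not fit.

copy_to_jbods() {
    src=$1   # RAID-6 directory holding the large files
    dst=$2   # mount point of the current JBOD disk

    for f in "$src"/*; do
        need=$(stat -c %s "$f")          # file size in bytes (GNU stat)
        # Wait until a disk with enough free space is mounted at $dst.
        while avail=$(df -B1 --output=avail "$dst" | tail -n1 | tr -d ' ')
              [ "$avail" -lt "$need" ]; do
            printf 'Disk full: unmount %s, mount the next JBOD, press Enter\n' "$dst"
            read _
        done
        # rsync --partial lets an aborted copy be restarted where it left off.
        rsync -a --partial "$f" "$dst"/
    done
}

# Interactive use would be something like:
# copy_to_jbods /raid/data /mnt/jbod
```

Because each disk holds whole files on an ordinary filesystem, any single disk can be mounted and read on another host, which was the stated goal.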
I'm not familiar with JBOD, but I would write as many whole files as fit on a single disk, leaving a chunk of less than 7 GB of free space "wasted" at the end (because the files are all 7 GB), then move on to the next disk. I understand this is archival, so independent disks and whole files have a better chance of survival. And if the data compress well, I would use compressed BTRFS partitions, which is transparent.

If you don't want to waste that last chunk of up to 7 GB, I would perhaps use dd to split the last file to size, and script it. But then you need some sort of catalog, and scripting, to extract the whole file again. I would not use tar, though, I think. Maybe I'm reinventing the wheel here.

-- 
Cheers / Saludos,
Carlos E. R. (from 15.3 x86_64 at Telcontar)
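The split-the-last-file idea from the reply can be sketched with dd, as suggested. The helper name, the paths, and the catalog convention below are illustrative assumptions, not an existing tool:

```shell
#!/bin/sh
# Sketch: split a file across two disks with dd, so the tail end of one
# disk's free space is not wasted.

split_file() {
    # $1 = source file, $2 = bytes free on the current disk,
    # $3 = piece written to the current disk, $4 = piece for the next disk.
    dd if="$1" of="$3" bs="$2" count=1 2>/dev/null   # first $2 bytes
    dd if="$1" of="$4" bs="$2" skip=1  2>/dev/null   # everything after
}

# A catalog entry allows the pieces to be found and verified later, e.g.:
#   sha256sum /raid/data/file042 >> catalog.txt
#   split_file /raid/data/file042 3221225472 \
#       /mnt/jbod1/file042.p1 /mnt/jbod2/file042.p2
# and on another host, once both pieces are gathered:
#   cat file042.p1 file042.p2 > file042 && sha256sum -c catalog.txt
```

Note the trade-off Carlos points out: a split file is only readable once every disk carrying one of its pieces is available, which is why whole files per disk are safer for archival.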