[opensuse] Tar file Error - Urgent data recovery
I am having problems with extracting a tar file for a colleague. It was created with tar/gzip on openSUSE 42.1, and an md5 checksum was generated and stored on a NAS. The 42.1 box had a disk failure (which is why I keep advising everyone to use RAID, but they still do not listen until it goes wrong). So 42.3 has been installed on the new disk(s), which are now in RAID (which would have avoided this problem had I been listened to in the first place).

So, first thing, I verified the md5 sum against the tar file, which came back successful. I am using

  tar -xvzf /mnt/BACKUP/file.tar.gz

to extract the file, but I keep getting the following error before extraction completes:

  gzip: stdin: Input/output error
  tar: Unexpected EOF in archive
  tar: Error is not recoverable: exiting now

The odd thing is it isn't happening on one file; it happens on a different one every time (but never gets to the end). How can I recover the data in this tar file? Is there a way to ignore this error and continue anyway? It is quite important to get this data back, so I would really appreciate some help.
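The "gzip: stdin: Input/output error" message is a read error from whatever device the archive is being read from, which is a different failure from a corrupt gzip stream (a CRC failure is reported differently). A minimal first check, assuming the archive sits on a network mount at /mnt/BACKUP (paths here are illustrative), is to copy it to local disk, re-verify the checksum on the copy, and test the gzip stream without extracting anything:

  # work on a local copy and compare checksums
  cp /mnt/BACKUP/file.tar.gz /tmp/file.tar.gz
  md5sum /mnt/BACKUP/file.tar.gz /tmp/file.tar.gz   # both should match the stored md5
  # test the whole compressed stream without writing any files
  gzip -t /tmp/file.tar.gz && echo "gzip stream OK"
  # look for read errors on the mount or the underlying device
  dmesg | tail -n 50

If gzip -t passes on the local copy, the archive itself is probably intact and the error was an I/O problem reading from the NAS; if it fails, the damage is in the compressed data.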
On 2017-08-26 13:26, Paul Groves wrote:
I am having problems with extracting a tar file for a colleague.
It was created with tar / gzip on opensuse 42.1 and a md5 checksum generated and stored on a NAS.
...
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
That's why I never do tar.gz to archive/backup data. Not reliable.
The odd thing is it isn't happening on one file, it happens on a different one every time (but never gets to the end).
How can I recover the data in this tar file? Is there a way to ignore this error and continue anyway?
I believe it is impossible, sorry. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 26/08/17 07:48 AM, Carlos E. R. wrote:
That's why I never do tar.gz to archive/backup data. Not reliable.
Sadly, many people disagree with you, and I don't mean people on this list, I mean service providers. For example, I changed providers and had them dump all my email, and they did it to tar.gz format. Many others use this format for 'automatic archiving' or for distribution.
Yes, it is not robust. There just aren't the file markers & metadata in TAR that there are in CPIO, for example.
So what do you recommend instead? Using TAR with any other compression format doesn't address the problems with TAR.
On 2017-08-26 14:32, Anton Aylward wrote:
On 26/08/17 07:48 AM, Carlos E. R. wrote:
That's why I never do tar.gz to archive/backup data. Not reliable.
Sadly, many people disagree with you, and I don't mean people on this list, I mean service providers.
For example, I changed providers and had them dump all my email and they did it to tar.gz format. Many others use this format for 'automatic archiving' or for distribution
Yes it is not robust. There just isn't the file markers & metadata in TAR that there is in CPIO, for example.
So what do you recommend instead? Using TAR with any other compression format doesn't address the problems with TAR.
Uncompressed tar, if you must use tar. Perhaps rar with recovery data (the commercial version). -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Paul Groves wrote:
I am having problems with extracting a tar file for a colleague.
It was created with tar / gzip on opensuse 42.1 and a md5 checksum generated and stored on a NAS.
The 42.1 box had a disk failure (which is why I keep advising everyone to use raid but they still do not listen until it goes wrong).
So 42.3 has been installed on the new disk(s) which are now in raid (which would have avoided this problem had I been listened to in the first place).
So first thing, I verified the md5 sum against the tar file which came back successful.
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
The odd thing is it isn't happening on one file, it happens on a different one every time (but never gets to the end).
How can I recover the data in this tar file? Is there a way to ignore this error and continue anyway?
It is quite important to get this data back so I would really appreciate some help.
It's odd that the checksum works but tar doesn't. You could try copying it somewhere first. Maybe the disk it is on is causing errors. Or you could try gunzip first to create a .tar, and then tar xvf that. Maybe it's not compressed, or it's compressed with something else, like bzip2 or xz? Did you create it? Maybe the version of tar is different or faulty; you could try an older version.
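Two of those checks are easy to script. A sketch with illustrative paths, confirming what the file actually is and decompressing it to a plain .tar in a separate step, so that a gzip failure can be told apart from a tar failure:

  file /mnt/BACKUP/file.tar.gz           # should report "gzip compressed data"; bzip2 or xz would show up here
  cp /mnt/BACKUP/file.tar.gz /tmp/       # work on a local copy in case the source disk is the problem
  gunzip -k /tmp/file.tar.gz             # -k keeps the .gz; on success this leaves /tmp/file.tar
  mkdir -p /tmp/restore
  tar -xvf /tmp/file.tar -C /tmp/restore # extract from the plain tar as a second step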
On 26/08/17 13:24, Richmond wrote:
It's odd that the checksum works but tar doesn't. You could try copying it somewhere first. Maybe the disk it is on is causing errors. Or you could try gunzip first to create a .tar, and then tar xvf that.
It never crossed my mind to try that. I will give it a go. (Not something I have ever done before, though.)
Maybe it's not compressed, or it's compressed with something else, like bzip2, or xz? Did you create it?
It was created with
  sudo tar -cvzf /path/to/archive.tar.gz /files/here
Maybe the version of tar is different or faulty, you could try an older version.
Maybe. Seems unlikely though.
I have a spare box. I will install 42.1 on it and try to extract it on there with the older version of tar.
Paul Groves wrote:
I have a spare box. I will install 42.1 on it and try to extract it on there with the older version of tar.
I was brain-storming. It is unlikely. That seems a lot of work to go to; it would be easier to get the source and compile it. But looking at the 42.1 repo, tar is version 1.27.1, which is the same version as I have on 42.3. The change log shows it has been patched though.
On 26/08/17 14:34, Richmond wrote:
Paul Groves wrote:
I have a spare box. I will install 42.1 on it and try to extract it on there with the older version of tar.
I was brain-storming. It is unlikely. That seems a lot of work to go to; it would be easier to get the source and compile it.
I don't really want to mess with the production server more than necessary.
But looking at the 42.1 repo, tar is version 1.27.1, which is the same version as I have on 42.3. The change log shows it has been patched though.
1.29 on my 42.3 (but then I did upgrade everything when I re-installed the OS).
Paul Groves wrote:
I am having problems with extracting a tar file for a colleague.
I just thought of something else. Some time ago I transferred a huge tar using sftp. I got errors trying to extract it, but a checksum on the original revealed that the errors were in the transfer. So if you have transferred it by sftp, use a network file system instead.
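A quick way to rule a transfer in or out as the culprit (a sketch; host name and paths are illustrative) is to hash the file on both ends and compare the digests before doing anything else:

  ssh user@nas-box md5sum /export/backup/file.tar.gz
  md5sum /mnt/BACKUP/file.tar.gz
  # the two digests, and the checksum stored when the archive was created, should all agree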
On 26/08/17 07:26 AM, Paul Groves wrote:
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
Have you tried uncompressing the file to a regular TAR file first?
Hi Paul, [...]
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
The odd thing is it isn't happening on one file, it happens on a different one every time (but never gets to the end).
How can I recover the data in this tar file? Is there a way to ignore this error and continue anyway?
You could try to unzip it first to a pure tar file. Besides that, http://www.gzip.org/recover.txt gives a few hints on how to handle corrupt gzip files.
It is quite important to get this data back so I would really appreciate some help.
Bye. Michael. -- Michael Hirmke
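If the gzip stream really is damaged, a partial salvage is often still possible: decompress as far as gzip can get, keep the truncated .tar, and pull out everything that was stored before the bad spot. A sketch with illustrative paths (the recover.txt notes above describe more involved techniques):

  # gunzip -c writes whatever it managed to decompress, even if it exits with an error
  gunzip -c /tmp/file.tar.gz > /tmp/partial.tar
  tar -tvf /tmp/partial.tar                    # list what is recoverable
  mkdir -p /tmp/restore
  tar -xvf /tmp/partial.tar -C /tmp/restore    # tar will still complain at the truncation point,
                                               # but members before it are extracted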
On 26/08/17 08:34 AM, Michael Hirmke wrote:
you could try to unzip it first to a pure tar file.
+1. If that fails, you know the problem is with the compression, not the TAR.
Besides that, http://www.gzip.org/recover.txt gives a few hints how to handle corrupt gzip files.
My experience is that TAR is not that reliable. The bigger the file, the more likely it is to be a problem. But then ALL software seems to have problems, more so for some people than others.
I recall a time in my life when merely my presence seemed to be a jinx for electronic devices, and some of them 'blew up', that is, they went bang and produced smoke and copious amounts of heat and damaged at least the circuitry. One time it was a heating element, and the retort exploded and fragments of glass went flying. For some reason I had bent down below bench level, to tie a shoelace or pick something up, and escaped it. Someone later quoted part of "Child in Time" to me.
These days electronics works but plant life fails.
On 26/08/2017 13:26, Paul Groves wrote:
I am having problems with extracting a tar file for a colleague.
[...]
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[...]
How can I recover the data in this tar file? Is there a way to ignore this error and continue anyway?
Try running bznew on the file; it re-compresses gzip files to bzip2 (.bz2) files. It also has a verify mode. Make a copy of the original first. Regards Dave P
On Sat, 26 Aug 2017 12:26:51 +0100 Paul Groves wrote: 8< - - - - - snipped - - - - - >8
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
8< - - - - - snipped - - - - - >8
Hi Paul, Your post triggered a memory of my having saved an excerpt some time ago from a thread discussing similar unexpected tar errors and behaviors:
This old way of writing `tar' options can surprise even experienced users. For example, the two commands:
tar cfz archive.tar.gz file
tar -cfz archive.tar.gz file
are quite different. The first example uses `archive.tar.gz' as the value for option `f' and recognizes the option `z'. The second (old form) example, however, uses `z' as the value for option `f' -- probably not what was intended.
Old options are kept for compatibility with old versions of `tar'.
This second example could be corrected in many ways, among which the following are equivalent:
tar -czf archive.tar.gz file
tar -cf archive.tar.gz -z file
tar cf archive.tar.gz -z file
As far as we know, all `tar' programs, GNU and non-GNU, support old options.
I've been using the 'old form' (no hyphen preceding the flags) for a few years since reading this and have not encountered any problems. Could this be somehow related?
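For anyone who wants to see the difference the excerpt describes, a harmless demo in an empty directory (file names are illustrative; behaviour is as documented for GNU tar):

  mkdir /tmp/tardemo && cd /tmp/tardemo && echo hello > file
  tar cfz good.tar.gz file    # old style: 'f' takes the next argument, so this writes good.tar.gz
  tar -cfz oops.tar.gz file   # short-option cluster: 'f' takes 'z' as its value, so this writes an
                              # archive literally named "z" and treats oops.tar.gz as a member to add
                              # (tar warns that oops.tar.gz does not exist, but "z" is still created)
  ls                          # expect: file  good.tar.gz  z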
Paul Groves composed on 2017-08-26 12:26 (UTC+0100):
I am having problems with extracting a tar file for a colleague.
[...]
so I am using tar -xvzf /mnt/BACKUP/file.tar.gz to extract the file.
But I keep getting the following error before extraction completes:
gzip: stdin: Input/output error
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
[...]
Maybe try star and/or pax instead. Also try entering those archives with mc and copying out their content. -- "The wise are known for their understanding, and pleasant words are persuasive." Proverbs 16:21 (New Living Translation) Team OS/2 ** Reg. Linux User #211409 ** a11y rocks! Felix Miata *** http://fm.no-ip.com/
OK, I never managed to get the data back from the tar file, so I would assume it was corrupted during creation.

However, all of the data has been recovered through several hours of detective work on my part, going through everyone's computer and copying their data back to the new server. A few older files were lost; however, I have a tape from a couple of weeks ago with those older files already on it.

I have spent the last hour saying 'I told you so' to the colleague who thought RAID was a waste of time, and he has now supplied me with lots of beer to leave him alone. :D And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.

So it all ended up OK :)
On 2017-08-29 19:16, Paul Groves wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older file were lost, however I have a tape from a couple of weeks ago with those older files already on.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-) -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 29/08/17 19:54, Carlos E. R. wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older file were lost, however I have a tape from a couple of weeks ago with those older files already on.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
I have always wondered how to do this, because it appears it is not possible to use tar with -W (verify) and -M (multi-volume) at the same time. I will start a new thread on this topic.
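For a single-volume archive on disk there is a common workaround worth noting here: GNU tar's -d/--compare re-reads an existing archive and reports differences against the filesystem, which gives much of what -W provides. A sketch with illustrative paths (it does not solve the multi-volume tape case, which is what the new thread is about):

  tar -cf /backup/home.tar -C / home/paul   # write the archive, storing names relative to /
  tar -df /backup/home.tar -C /             # --compare: re-read it and diff each member against the filesystem
  # for multi-volume or compressed sets, an alternative is to checksum the source files beforehand
  # and verify that list again after a test restore to scratch space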
On 2017-08-29 22:30, Paul Groves wrote:
On 29/08/17 19:54, Carlos E. R. wrote:
[...]
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
I have always wondered how to do this because it appears it is not possible to use tar with -W (Verify) and -M (multi volume) at the same time.
I don't use tar.gz for backups. It is not reliable, you should know that by now ;-) -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Hi,
On 2017-08-29 19:16, Paul Groves wrote:
[...]
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
real [tm] paranoids are doing a backup, comparing it, duplicating it to another medium and comparing it again. I should know that, cause I *am* a real [tm] paranoid ;)
Bye. Michael. -- Michael Hirmke
On 2017-08-29 22:40, Michael Hirmke wrote:
Hi,
[...]
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
real [tm] paranoids are doing a backup, comparing it, duplicating it to another medium and comparing it again. I should know that, cause I *am* a real [tm] paranoid ;)
Also, paranoids™ do not rely on tar.gz ;-) -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Hi, [...]
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
real [tm] paranoids are doing a backup, comparing it, duplicating it to another medium and comparing it again. I should know that, cause I *am* a real [tm] paranoid ;)
Also, paranoids™ do not rely on tar.gz ;-)
exactly - I'm doing a 1:1 rsync backup without any compression on the target side.
Bye. Michael. -- Michael Hirmke
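For reference, a minimal sketch of that kind of 1:1 mirror, assuming /srv/data is the source and /backup/data sits on a second disk (paths and option choices are illustrative):

  rsync -aHAX --delete /srv/data/ /backup/data/
  # -a       permissions, ownership, timestamps, symlinks
  # -H       preserve hard links; -A ACLs; -X extended attributes
  # --delete makes the target a true mirror by removing files that no longer exist in the source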
On 2017-08-30 09:58, Michael Hirmke wrote:
Hi,
[...]
Paranoids compare the backup with the original as part of the backup procedure before calling it done ;-)
real [tm] paranoids are doing a backup, comparing it, duplicating it to another medium and comparing it again. I should know that, cause I *am* a real [tm] paranoid ;)
Also, paranoids™ do not rely on tar.gz ;-)
exactly - I'm doing a 1:1 rsync backup without any compression on the target side.
Same here. To clarify, the problem is not tar itself, but the compression. A single error, and the entire archive is lost, usually beyond repair.
I would love to use compression, but so far I don't like the compressors in Linux for backup. Only RAR has data error protection, but it is commercial and doesn't handle all Linux attributes/permissions. Other compressors are adding error correction features, but are not there yet.
Then, filesystems themselves could be compressed. This is a thing that MS-DOS/Windows has had for decades, but not Linux. Only now btrfs does, but I don't trust that fs yet... -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 30/08/17 03:58 AM, Michael Hirmke wrote:
exactly - I'm doing a 1:1 rsync backup without any compression on the target side.
Well, yes. Disk-to-disk-to-<something> is a good basic strategy, given the right <something>.
Regular readers will recall that I use LVM, which allows a snapshot of a live file system to be made, and that's probably more efficacious than rsync if both are local. Rsync wins out if you are, for example, copying to the cloud as a backup. Personally I don't. It's about bandwidth. I suppose I could pay for more, but with so many other ways of backing up available the pressure isn't there for me. I'm not saying it won't be the case for other people or organizations.
Again, regular readers will recall that I organize much of my workspace to back up onto CD/DVD. Yes, it requires organization and planning, but it works.
But sometimes you do need to 'package', and the fact that we use RPM files to package/bundle up a set of files and some metadata is an example of that. The guts of an RPM file is a gzip'd CPIO/SVR4 format file. More recent versions of RPM can also use bzip2, lzip, lzma, or xz compression. IIRC there was a shift by the UNIX leaders such as Bell's USG away from TAR to CPIO back in the 1970s.
Basically the decision tree comes down to this:
A) Do you want to package the files into a single file or not?
B) If "yes" to the above, do you want to compress the package?
I suppose you could consider compressing each file as it goes into the package, but that's neither so efficient nor as convenient.
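As an aside, the cpio payload mentioned above is easy to get at by hand, which also illustrates the 'RPM as parcel' idea. A sketch, with an illustrative package path:

  mkdir /tmp/unpack && cd /tmp/unpack
  rpm2cpio /path/to/some-package.rpm | cpio -idmv
  # rpm2cpio strips the RPM header and emits the payload as a plain cpio stream;
  # cpio then extracts it: -i extract, -d create leading directories, -m keep mtimes, -v list names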
On 2017-08-30 14:43, Anton Aylward wrote:
But sometimes you do need to 'package', and the fact that we use RPM files to package/bundle up a set of files and some metadata is an example of that.
The guts of a RPM file is gzip'd CPIO/SVR4 format file. More recent versions of RPM can also use bzip2, lzip, lzma, or xz compression.
But 1) this is not a backup, and 2) there is a checksum. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 30/08/17 09:15 AM, Carlos E. R. wrote:
On 2017-08-30 14:43, Anton Aylward wrote:
But sometimes you do need to 'package', and the fact that we use RPM files to package/bundle up a set of files and some metadata is an example of that.
The guts of a RPM file is gzip'd CPIO/SVR4 format file. More recent versions of RPM can also use bzip2, lzip, lzma, or xz compression.
But 1) this is not a backup, and 2) there is a checksum.
Checksums are good. It doesn't matter whether they are on the result of an rsync or are for a bundled package.
And why should you not use an RPM format for bundling up a collection of files that constitute a backup? Isn't that what, in effect, they are used for anyway? Don't you see the symmetry?
You can save a group of files with metadata about how to reinstall them, compress same, and then 'ship' it to any medium you want: DVD, cloud, FTP ... Yes, it takes a bit more than a simple TAR or CPIO, but you are supplying the metadata for re-installation.
On 2017-08-30 16:00, Anton Aylward wrote:
[...]
Checksums are good. It doesn't matter whether they are on the result of an rsync or are for a bundled package.
And why should you not use an RPM format for bundling up a collection of files that constitute a backup? Isn't that what, in effect, they are used for anyway? Don't you see the symmetry?
You don't understand. A backup is something you rely on for recovery in case of disaster. It has to be reliable, and has to last long. An rpm is not critical. If the download went bad, the checksum notices and you can download it again. If there was an error in creation, the users complain and it is created again. A backup can not be created again if bad. They are different use cases. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 30/08/17 02:23 PM, Carlos E. R. wrote:
[...]
You don't understand.
A backup is something you rely on for recovery in case of disaster. It has to be reliable, and has to last long.
An rpm is not critical. If the download went bad, the checksum notices and you can download it again. If there was an error in creation, the users complain and it is created again.
A backup can not be created again if bad.
They are different use cases.
I do understand. You are describing the conventional use of RPM as a method of distribution.
*I* am focusing on it as a method of parcelling up a series of files so that it can be un-parcelled at a later date. Which is what TAR and CPIO do. Oh, wait! RPM uses CPIO to do the parcelling; it just adds some metadata.
If TAR, with or without compression, is a valid way of making a backup, then CPIO is also a valid way. Some programs that do TAR, such as PAX, can also do CPIO (pax can read input archives and write output in cpio and tar formats; see the -x option). In fact PAX does a great job of encoding extra metadata and has many other options. I use it instead of CPIO or TAR!
If CPIO is a valid way, then packaging it with metadata saying where it should be unbundled, and adding the capability to prevent a clash with anything it might overwrite that has a later modification date when 'restoring', is also quite valid, sensible in fact.
Saying you can't use CPIO to do backups is like saying that you can't use TAR to do backups, and people *DO* use TAR to do backups. Not because it is a good idea, but because the UNIX greybeards did it to tape in antediluvian ages. That CPIO, a replacement for TAR, was subverted to RPM, well, ..
Isn't packaging a group of files to store on a CD or DVD rather than a tape drive a "backup"? If you were doing it before with TAR, and you're doing it now with CPIO, what's the issue? A metadata wrapper?
I'm not disputing your specific (and widely spread) use case. I'm just saying that it's a way of packaging files that can be put on external storage media and retrieved at a later date. Which to my mind constitutes a "backup".
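A small sketch of the pax usage described above, with an illustrative directory name, writing a cpio-format archive and reading it back:

  pax -w -x cpio -f /tmp/photos.cpio Photographs/   # -w write; -x selects the archive format (cpio here)
  pax -f /tmp/photos.cpio                           # with neither -r nor -w, pax lists the archive contents
  mkdir -p /tmp/restore && cd /tmp/restore
  pax -r -f /tmp/photos.cpio                        # -r read (extract), preserving the stored paths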
On 2017-08-30 22:27, Anton Aylward wrote:
[...]
I do understand. You are describing the conventional use of RPM as a method of distribution.
*I* am focusing on it as a method of parcelling up a series of files so that it can be un-parcelled at a later date.
Which is what TAR and CPIO do.
Oh, wait! RPM uses CPIO to do the parcelling, it just adds some metadata.
If TAR, with or without compression, is a valid way of making a backup, then CPIO is also a valid way. Some programs that do TAR, such as PAX, can also do CPIO (pax can read input archives and write output in cpio and tar formats; see the -x option). In fact PAX does a great job of encoding extra metadata and has many other options. I use it instead of CPIO or TAR!
If CPIO is a valid way then packaging it with metadata saying where it should be unbundled and adding capability to prevent clash with any thing that it might overwrite that has a later modification date when 'restoring' is also quite valid, sensible in fact.
Saying you can't use CPIO to do backups is like saying that you can't use TAR to do backups, and people *DO* use TAR to do backups. Not because it is a good idea, but because the UNIX greybeards did it to tape in antediluvian ages.
I never said you cannot use cpio for backups. In fact, I have used it. It is more reliable than tar.gz. I say not rpm for backups. I also say not tar.gz for backups.
That CPIO, a replacement for TAR, was subverted to RPM, well, ..
Isn't packaging a group of files to store, on a CD or DVD rather than a tape drive "backup"? If you were doing it before with TAR, and you're doing it now with CPIO, what's the issue? A metadata wrapper?
The issue is that with an error in the gz part, the whole archive is lost. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 30/08/17 10:58 PM, Carlos E. R. wrote:
I say not rpm for backups.
So:
CPIO for backups is OK. CPIO plus metadata on where to unpack it, and scripts to tidy up, and mechanisms to deal with possible overwrites when doing a restore, is not OK.
Can you please explain what it is about RPM that makes it NOT a 'backup'?
Can you please explain what the difference is, from the software-to-do-it point of view as opposed to the human-philosophical point of view, between a backup and an archive?
If, for example, on the one hand ... I were to take a walk through my ~/Photographs/ and package them up by project (as opposed to date) and CPIO the hierarchy, then add a file of 'metadata' about the project, then xz all that.
Or, on the other hand ... I were to take a walk through my ~/Photographs/ and package them up by project (as opposed to date) and then use 'rpmbuild' to package it all up and use 'xz' to compress the contents.
And in each case, after step and repeat for each project, put them all on a DVD and stick the DVD in my folder of same, and do this each month, automating the same.
Well, what's the difference?
It's the same tools: CPIO, a compressor such as xz, files, scripts. Right now, 'rpmbuild' is a binary. I'm sure that its antecedents were scripts, or that it could be replaced with a script. That has happened back and forth with many tools -- I recall working with Henry Spencer back in 1982/83 to get the size of his UNIX on a PDP-11/44 down by converting some binaries to scripts. The Bourne shell of those days was very small and compact! Nowhere near as good as Perl for bit-twiddling, but still, when you have limited memory, roll-in/roll-out and limited disk (I suspect a target for this was the luggable LSI-11), it's a viable strategy.
Whether you are putting the RPM in a 'repository' or on a DVD is rather arbitrary, isn't it?
I have 'backups' of many of the earlier releases of openSUSE, some not available now on repositories, on DVDs. With RPMs.
My reasoning is that if you package up the photo projects and stick them on a DVD and put them in long-term safe store, they are backups.
Oh, and by the way, for the OP: http://dpk.io/pax Good comic, eh?
On 2017-08-31 15:24, Anton Aylward wrote:
On 30/08/17 10:58 PM, Carlos E. R. wrote:
I say not rpm for backups.
So:
CPIO for backups is OK. CPIO plus metadata on where to unpack it and scripts to tidy up and mechanisms to deal with possible overwrites when doing a restore is not OK.
Can you please explain what it is about RPM that makes it NOT a 'backup'?
Everybody knows that it is a software distribution method. I don't have to explain it.
Can you please explain what the difference is, from the software-to-do-it point of view as opposed to the human-philosophical point of view, between a backup and an archive?
A backup can use an archive as a method of doing it. An rpm is not an archive. A tar is. A cpio is. Even if rpm contains an archive. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31/08/17 10:07 AM, Carlos E. R. wrote:
Can you please explain what the difference is, from the software-to-do-it point of view as opposed to the human-philosophical point of view, between a backup and an archive?
A backup can use an archive as a method of doing it. An rpm is not an archive. A tar is. A cpio is. Even if rpm contains an archive.
Let's see. If I simply use 'cdrecord' or 'k3b' to package files onto a DVD, so that in due course the DVD was mountable and I could extract those files if I had lost the originals, does that make it an archive or a backup? The fact that these are 'only' bundled on the DVD and not encapsulated, as the RPMs are on a distribution DVD, makes a difference?
If I were a commercial photographer, then what would I do? If I were a wedding photographer, I'd use the format that makes the DVD mountable, preferably on a Windows or Mac device, since that's what my customers had at home.
But I'm not. I use Darktable, which is only available for Linux (and the like), and I might want to _distribute_ my photographs as backgrounds, examples, textures, with the accompanying Darktable scripts, all to go in the Darktable-specific places. But they are exactly the same photographs as I'm 'backing up' onto DVD using k3b, using CPIO, whatever.
As far as I can see, RPM is just a way of making an archive that unpacks with specifics according to the enclosed metadata. And that might just as well be backing up for later restore.
It's my INTENT that matters.
On 2017-08-31 16:44, Anton Aylward wrote:
On 31/08/17 10:07 AM, Carlos E. R. wrote:
Can you please explain what the difference is, from the software-to-do-it point of view as opposed to the human-philosophical point of view, between a backup and an archive?
A backup can use an archive as a method of doing it. An rpm is not an archive. A tar is. A cpio is. Even if rpm contains an archive.
let's see. If I simply use 'cdrecord' or 'k3b' to package files onto a DVD, so that in due course the DVD was mountable and I could extract those files if I had lost the originals, does that make it an archive or backup?
It is a plain copy of files to the CD. It is a backup.
If I were a commercial photographer then what would i do? if I were a wedding photographer then I'd use the format that makes the DVD mountable, preferable on a Windows or on a MAC device, since that's what my customers had at home.
But I'm not. I use Darktable, which is only available for Linux (and the like), and I might want to _distribute_ my photographs as backgrounds, examples, textures, with the accompany Darktable scripts, all to go in the Darktable specific places.
But they are exactly the same photographs as I'm "backing up' onto DVD sing k3b, using CPIO, whatever?
As far as I can see RPM is just a way of making an archive that unpacks with specifics according to the enclosed metadata. And that might just as well be backing up for later restore.
It's my INTENT that matters.
Whatever. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On Thursday 31 August 2017 15:24:11 CEST, Anton Aylward wrote:
[...]
IMO (Dutch "archief" isn't completely the same as the American "archive") the thread is going in a language-based direction. -- Gertjan Lettink, a.k.a. Knurpht openSUSE Board Member openSUSE Forums Team
On 31/08/17 10:36 AM, Knurpht - Gertjan Lettink wrote:
IMO ( dutch "archief" isn't completely the same as the american "archive" ) the thread is going in a language based direction.
I've read a number of articles that claim there is a difference between a backup and an archive, using the English-language meanings. When stripped of all the frippery, the difference comes down to this:
A backup is taken regularly, has a limited lifetime and is optimized for a quick restore.
An archive is intended for a long lifetime.
Given modern technologies such as the use of the cloud, and the easy availability of cheap fast disks and DVDs for off-lining, this difference has, as far as I can tell, been reduced to insignificance. Once I commit something to a DVD it has the lifetime of the DVD. Once I push things to the cloud they have the lifetime of the contract I have with the cloud provider, or the lifetime of the cloud provider, whichever is the shorter. Disk-to-disk, disk-to-tape, whatever. No, cloud and DVD are more easily retrievable than connecting a new drive .. oh wait, maybe it's in your NAS or your local cloud or ...
Anyway, the classic difference between archive and backup has become meaningless.
On 2017-08-31 17:29, Anton Aylward wrote:
On 31/08/17 10:36 AM, Knurpht - Gertjan Lettink wrote:
IMO (Dutch "archief" isn't completely the same as the American "archive") the thread is going in a language-based direction.
I've read a number of articles that claim there is a difference between a backup and an archive, using the English-language meanings.
When stripped of all the frippery the difference comes down to this:
A backup is taken regularly, has a limited lifetime and is optimized for a quick restore.
An archive is intended for a long lifetime.
An archive is a structured file that contains files. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31/08/17 11:36 AM, Carlos E. R. wrote:
An archive is an structured file that contains files.
Carlos: I suggest that you look at the variety of implementations of file systems, databases and such, since that definition encompasses many of them.
I recall when I had to deal with an IBM system where there was a 'dataset' that was, as far as the OS was concerned, a 'file' in IBM's terms, which meant it was a span of sectors on the disk referenced by the disk allocation table. As far as the OS was concerned it was a data blob. But to the VM I was using it was a file system, and one of those was a database that contained messages (let's call it email for convenience's sake, even though it had nothing to do with SMTP), and those were treated, logically, by the application, as individual files, much the way that when I read email in Thunderbird via my IMAP interface, what's behind it could be a single MBOX-format file or the individual files of a Maildir format. I neither know nor care.
The AS/400 file system is so close to being a relational database as makes little difference when building database applications. You might think of the OS as a DBMS for file systems :-)
Then, of course, there's FUSE (which leads to LUKS). It's a file, in a file system, which might itself be a database, but who knows, that contains a file system. So by your definition it constitutes an archive. But you mount it, just like I mount my backup DVDs.
What do you want to bet I can back up my ~anton containing a file that is a FUSE file system onto a DVD, mount that DVD, then mount that file system, and LO!, therein is ... well, I made a disk-to-disk copy of my ~anton before creating that FUSE file, then after creating it and mounting it did another disk-to-disk to put all my old ~anton along with all the dot-files into it "as backup" before burning the earlier discussed DVD, so that when I mounted the DVD and mounted the FUSE I could then FUSE-mount my ~anton ... and it's getting a bit recursive, isn't it?
If you're confused by this, it's because you keep insisting that your above definition is meaningful. It's not. A filesystem *IS* a structured container of files. It can be anything. It could be a cloud server using Avian Carrier TCP that flies to a tower where monks consult ancient parchments and send the Avian Carrier replies back. The performance would suck, but it *IS* a file system.
On 2017-08-31 18:18, Anton Aylward wrote:
On 31/08/17 11:36 AM, Carlos E. R. wrote:
An archive is an structured file that contains files.
Carlos: I suggest that you look at the variety of implementation of file systems, databases and such, since that definition encompasses many of them.
No. In computer systems, an archive has that precise definition, and that is the one I have been referring to during this thread, no other. A zip file is an archive. An rpm file contains an archive; it is not an archive. Period. And no, I'm not confused. I'm aware of the methods you mention. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31/08/17 16:36, Carlos E. R. wrote:
On 2017-08-31 17:29, Anton Aylward wrote:
On 31/08/17 10:36 AM, Knurpht - Gertjan Lettink wrote:
IMO (Dutch "archief" isn't completely the same as the American "archive") the thread is going in a language-based direction.
I've read a number of articles that claim there is a difference between a backup and an archive, using the English-language meanings.
When stripped of all the frippery the difference comes down to this:
A backup is taken regularly, has a limited lifetime and is optimized for a quick restore.
An archive is intended for a long lifetime.
An archive is an structured file that contains files.
That's the computer jargon meaning, not the English-language meaning. So yes, it seems clear that we are moving towards an argument over "what does this word mean". Archives existed long before computers (well, the mechanical variety, at least). Archaeologists have discovered a fair few prehistoric archives :-) Cheers, Wol
On Thursday 31 August 2017 17:29:27 CEST, Anton Aylward wrote:
On 31/08/17 10:36 AM, Knurpht - Gertjan Lettink wrote:
IMO ( dutch "archief" isn't completely the same as the american "archive" ) the thread is going in a language based direction.
I've read an number of articles that claim there is a difference between a backup and an archive, using the english-language meanings.
When stripped of all the frippery the difference comes down to this:
A backup is taken regularly, has a limited lifetime and is optimized for a quick restore.
An archive is intended for a long lifetime.
That's what I was taught in the late eighties. We backed up (to allow a return to the previous day) and archived (accumulation for life). -- Gertjan Lettink, a.k.a. Knurpht openSUSE Board Member openSUSE Forums Team
On 31/08/17 03:30 PM, Knurpht - Gertjan Lettink wrote:
That's what I was taught in the late eighties. We backupped ( allow to return to previous day ) and archived ( cumulation for life ).
Indeed. I was taught the same in the late 1970s, but that was then, this is now.

Back then disks were expensive, and easily interchangeable ones more so. We didn't have CDs or DVDs or USB/semiconductor removable memory. Heck, the non-removable memory for the PDP-11 was expensive enough as it was! We had tape, and the fastest/highest density was 6250 bpi, and it was SLOW.

The 'Tower of Hanoi' algorithm tried to give a good compromise between backup and archive. The idea was to do incremental backups that became an archive. You tried to deal only with changes over a given period. That way the amount of time you were writing to those slow tape drives was minimised, but you always had enough to make a meaningful recovery to any 'checkpoint'.

That was then, this is now. Now I have a (well, slow compared to what I could purchase if I paid more) cable feed that is faster than the PDP-11 tape drives I was using at the beginning of the 1980s. The basic RK05 drive I used on the PDP-11 had a capacity of 2.5 megabytes. My home CPU has an L2 cache of 4096K bytes. Or is that bits? I also had an RL01, upgraded to RL02, that could handle 10-megabyte disk packs and had a 1.44 Mb/s transfer rate. My home PC has 4 gigabytes of memory. Eventually we upgraded to an RK07 that could handle nearly 30 megabytes. Then there were the RK05 'flying saucer' disk packs :-)

Talking about what we considered the norm at the beginning of the 1980s has little relevance now. I can probably back up all of my system, not just my /home, over my network connection to the cloud, faster than I could back up the equivalent of the loadable /usr on the RL02 to tape in 1982. I have a suspicion that even my SATA DVD reader, should I use it with a ShadowFS, would be faster than the RL02, but I'd have to check the numbers. The PDP-11/45 had a dual bus architecture, so autonomous disk transfer could be carried out in parallel with computation and hence appear faster.

The recollections are interesting, but the distinctions were forced on us by the limitations of the technology of the time. They no longer apply.
On Thu, Aug 31, 2017 at 9:24 AM, Anton Aylward <opensuse@antonaylward.com> wrote:
[...]
Can you please explain what it is about RPM that makes it NOT a 'backup'?
Let me introduce the term "Archival Backup".

An archival backup in my mind is one designed to still be readable a decade or more after it is made. Most DVDs don't have a 10-year or longer expected lifetime.

Kodak used to sell CDs made with gold as the medium that was sandwiched in the plastic. It was intended for Archival Backup purposes. The idea was that even if the plastic halves came unglued, the gold would not oxidate, so it was a far more reliable medium.

There are other archival backup solutions. I often have seen lawyers simply put data on a USB drive and put that in a safe. I'm not sure that is a great solution, but it certainly is one that is used frequently.

I personally take important large data sets, then put them in the equivalent of a large segmented tar archive. I put a copy of the segments on at least 2 drives. I also record the hash of each of the segments. Then, if I have to use that data in the future, I verify the hashes.

I have had occasions where I had hash disagrees on both copies, but fortunately just on a few of the segment files and I was able to piece together a full set without any hash disagrees. I haven't gone to triple redundancy in general. But arguably it would be smart.

Also, hash disagrees are far less common with drives made in the last 5 years than they were with drives made around 2005. I remember a lot of issues with hash disagrees in the 2005-2010 timeframe.

Greg
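A sketch of the segmented-archive-plus-hashes scheme described above, with illustrative names and a 4 GiB segment size:

  tar -cf - /srv/projects | split -b 4G - projects.tar.part.   # one tar stream, cut into fixed-size segments
  sha256sum projects.tar.part.* > projects.sha256              # one hash per segment, kept with both copies
  # years later, on either copy:
  sha256sum -c projects.sha256                                 # flags any segment whose hash disagrees
  cat projects.tar.part.* | tar -tvf -                         # reassemble the stream to list (or extract)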
On 2017-08-31 17:50, Greg Freemyer wrote:
Let me introduce the term "Archival Backup".
An archival backup in my mind is one designed to still be readable a decade or more after it is made. Most DVDs don't have a 10-year or longer expected lifetime.
Right.
Kodak used to sell CDs made with gold as the medium that was sandwiched in the plastic. It was intended for Archival Backup purposes. The idea was that even if the plastic halves came unglued, the gold would not oxidate, so it was a far more reliable medium.
I remember them, although I have never used them.
There are other archival backup solutions. I often have seen lawyers simply put data on a USB drive and put that in a safe. I'm not sure that is a great solution, but it certainly is one that is used frequently.
Safe from stealing ;-) Manageable. I'm unsure about long term, though.
I personally take important large data sets, then put them in the equivalent of a large segmented tar archive. I put a copy of the segments on at least 2 drives. I also record the hash of each of the segments. Then, if I have to use that data in the future, I verify the hashes.
I assume they are not compressed.
I have had occasions where I had hash disagrees on both copies, but fortunately just on a few of the segment files and I was able to piece together a full set without any hash disagrees.
I haven't gone to triple redundancy in general. But arguably it would be smart.
Also, hash disagrees are far less common with drives made in the last 5 years than they were with drives made around 2005. I remember a lot of issues with hash disagrees in the 2005-2010 timeframe.
You could use an archival method that stores extra data-recovery chunks, so that recovery of a damaged sector is possible. One such possibility is "par2", which currently is only packaged in some home repos, but I find it somewhat cumbersome to use. The commercial "rar" has included this feature for decades and is easy to use. The caveat, of course, is that it does not support the full Linux file metadata set: permissions, attributes, ACLs, etc. Perhaps one could include a text script in the archive that would recreate them. I saw some other archivers that claim to have it, but they are betas or young products. I still need to investigate them more. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
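For completeness, a sketch of what the par2 route looks like, assuming the par2cmdline tool and an uncompressed tar as input; the 10% redundancy figure is an arbitrary example:

  par2 create -r10 archive.tar     # writes archive.tar.par2 plus recovery volumes alongside the archive
  par2 verify archive.tar.par2     # later: check the archive against the recovery data
  par2 repair archive.tar.par2     # reconstruct damaged blocks, if enough recovery data survived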
On 31/08/2017 17:50, Greg Freemyer wrote:
There are other archival backup solutions. I often have seen lawyers simply put data on a USB drive and put that in a safe. I'm not sure that is a great solution, but it certainly is one that is used frequently.
Just for interest, an EEPROM spec taken from a PIC data sheet I just happened to have open says: "Data retention without refresh is conservatively estimated to be greater than 40 years." So I should imagine that the same would apply to USB drives.

Dave P
On 01/09/17 08:54, Dave Plater wrote:
On 31/08/2017 17:50, Greg Freemyer wrote:
There are other archival backup solutions. I often have seen lawyers simply put data on a USB drive and put that in a safe. I'm not sure that is a great solution, but it certainly is one that is used frequently.
Just for interest, an EEPROM spec taken from a PIC data sheet I just happened to have open says: "Data retention without refresh is conservatively estimated to be greater than 40 years." So I should imagine that the same would apply to USB drives.
No. You may well be right, but the cells in EEPROMs are physically much larger than those in modern flash chips. Charge leakage is now a major problem due to the shrinkage of the die.

Cheers, Wol
Hi Anton,
On 30/08/17 03:58 AM, Michael Hirmke wrote:
exactly - I'm doing a 1:1 rsync backup without any compression on the target side.
Well, yes. Disk-to-disk-to<something> is a good basic strategy, given the right <something>
Let it be another disk. Disks are so cheap that you can always afford a second one to take to a different building - just in case of a physical disaster. You should have a spare controller in the same location, though; otherwise you might also run into trouble when trying to recover your backup. Even better if you have a complete server there.
[...]
Bye.
Michael.
-- 
Michael Hirmke
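(For reference, a minimal sketch of that kind of uncompressed 1:1 rsync mirror; the source and target paths are only placeholders:)

  # mirror /home onto a second disk, preserving permissions, owners,
  # hard links, ACLs and extended attributes; --delete keeps it 1:1
  rsync -aHAX --delete /home/ /mnt/backupdisk/home/

Because nothing is compressed or repackaged, a single damaged file on the backup disk costs you that file only, not the whole archive.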
On 30/08/17 03:21 PM, Michael Hirmke wrote:
let it be another disk.
Sure, why not. Well, maybe it's a matter of size. My local stores don't seem to want to sell me anything smaller than a 350G drive, and cost per byte they are EXPENSIVE! It's actually cheaper to buy a 1T drive, and a 2T drive is only about 25% more than that. But my point with the RPM argument is that a lot of the time we aren't dealing with stuff that size. I've commented a number of times on how easy it is to arrange stuff for backup onto 5G DVDs:
du -sx ~anton ~anton//Documents/ ~anton/Photographs/ByYear/* ~anton/Mail/
3.2G    /home/anton
730M    /home/anton//Documents/
597K    /home/anton/Photographs/ByYear/2003
33M     /home/anton/Photographs/ByYear/2012
626M    /home/anton/Photographs/ByYear/2013
4.6G    /home/anton/Photographs/ByYear/2014
2.3G    /home/anton/Photographs/ByYear/2015
2.8G    /home/anton/Photographs/ByYear/2016
3.7G    /home/anton/Photographs/ByYear/2017
24M     /home/anton/Photographs/ByYear/2005
2.7G    /home/anton/Mail/

Music, media and movies are larger, yes, but I'm in the process of separating those out by genre and subgenre.

-- 
A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
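(As a sketch of that per-directory DVD workflow, assuming growisofs from dvd+rw-tools and a burner at /dev/sr0; the directory and volume label are examples only:)

  # check the chunk fits on a single-layer DVD (about 4.4 GiB usable)
  du -sh /home/anton/Photographs/ByYear/2015
  # burn it straight to disc with Rock Ridge and Joliet extensions
  growisofs -Z /dev/sr0 -R -J -V "photos-2015" /home/anton/Photographs/ByYear/2015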
On 30/08/17 23:08, Anton Aylward wrote:
I've commented a number of times on how easy it is to arrange stuff for backup onto 5G DVDs:
9G DVDs aren't that expensive, either, although it seems easier to buy a 9G drive than 9G media :-)

My quick investigation into tape backup, though, concluded that - for the home user at least - it was prohibitively expensive.

Cheers, Wol
On 31/08/17 07:41 AM, Wols Lists wrote:
On 30/08/17 23:08, Anton Aylward wrote:
I've commented a number of times on how easy it is to arrange stuff for backup onto 5G DVDs:
9G DVDs aren't that expensive, either, although it seems easier to buy a 9G drive than 9G media :-)
:-) Indeed! A quick on-line check of my regular downtown strip of back-to-back computer stores (east of Spadina on College, predominantly) doesn't turn them up. I might have to drill deeper. However, given the speed of DVD burners, and given that OK-quality DVDs (i.e. no worse than buying the HP brand) are less than C$8 for 50 and CDs are only slightly cheaper, and that they seem to last longer than many of the USB sticks I've bought (!) (though why my camera memory cards last so long is a matter for discussion another time on offtopic), I think the DVDs are adequate -- for now.
My quick investigation into tape backup, though, concluded that - for the home user at least - it was prohibitively expensive.
Back in the days when SMBs were using the old IBM PCs and ATs, a 30 Meg or 60 Meg tape cartridge was a wonderful backup -- for an SMB, and just barely affordable for an enthusiast or the kind of hobbyist producing shareware, those Peter Norton wannabes. Later, in the 386/SCO days when I was doing Progress database applications, I had one.

I'm aware that there are now micro-cartridges that can store 300G, but why? We've commented here that a basic openSUSE install can take less than 20G. It's when you start filling up /var and /home that it gets space hungry. Even so:

# du -sh ~anton
79G     /home/anton

OK, so there's a lot of my stuff that isn't under ~anton; thank you, 'owncloud' under /srv. Add the alternatives (kernel and /usr), space for the disk-to-disk copies, and all those online ISOs for doing VM deployment. A terabyte of disk space vanishes pretty quickly.

-- 
A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On Thu, Aug 31, 2017 at 7:41 AM, Wols Lists <antlists@youngman.org.uk> wrote:
On 30/08/17 23:08, Anton Aylward wrote:
I've commented a number of times on how easy it is to arrange stuff for backup onto 5G DVDs:
9G DVDs aren't that expensive, either, although it seems easier to buy a 9G drive than 9G media :-)
My quick investigation into tape backup, though, concluded that - for the home user at least - it was prohibitively expensive.
Are 9GB discs reliable now?

8 or 9 years ago I could burn the first layer of data on a 5GB or 9GB DVD and have great reliability. But 9GB media involve a second recording layer. As soon as I started trying to put 7 or 8 GB of data on a DVD, my reliability rates went in the toilet. I bought multiple different brands of drives and media. I never achieved even close to an 80% success rate at being able to read back the data I wrote.

At some point I gave up and have since ignored the ability to write to the second layer of DVD media. If I need over 5GB now, I use a Blu-ray drive and media.

Greg
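(A minimal sketch of that kind of read-back check, assuming the disc was burned from an image called backup.iso and the drive is /dev/sr0; names are illustrative:)

  # hash the image that was burned
  sha256sum backup.iso
  # read exactly the same number of 2048-byte blocks back off the disc,
  # so any padding the drive returns past the image end is ignored
  blocks=$(( $(stat -c %s backup.iso) / 2048 ))
  dd if=/dev/sr0 bs=2048 count=$blocks | sha256sum
  # the two hashes should match if the burn is fully readable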
On 08/31/2017 12:33 PM, Greg Freemyer wrote:
But 9GB media involve a second recording layer. As soon as I started trying to put 7 or 8 GB of data on a DVD, my reliability rates went in the toilet. I bought multiple different brands of drives and media.
Yes, if you aren't buying bargain basement media. The double layer disks from some vendors are pretty good. My failure rate with the Verbatims has been less than 1 in 40, one year after burning.

http://www.digitalfaq.com/reviews/dvd-media.htm
https://www.amazon.com/gp/product/B000GHWRIK

Of course some of the early double layer drives sucked as well, so you need quality in both places.

-- 
After all is said and done, more is said than done.
On 31/08/17 21:40, John Andersen wrote:
On 08/31/2017 12:33 PM, Greg Freemyer wrote:
But 9GB media involve a second recording layer. As soon as I started trying to put 7 or 8 GB of data on a DVD, my reliability rates went in the toilet. I bought multiple different brands of drives and media.
Yes, if you aren't buying bargain basement media. The double layer disks from some vendors are pretty good. My failure rate with the Verbatims has been less than 1 in 40, one year after burning.
http://www.digitalfaq.com/reviews/dvd-media.htm https://www.amazon.com/gp/product/B000GHWRIK
Of course some of the early double layer drives sucked as well, so you need quality in both places.
I just ordered a DVD drive back in the day and it came with DL as a matter of course, so I guess it's pretty decent.

And so far I've just used it to try to copy commercial DVDs, so my failure rate is 100%, but I think that's down to copy protection, not drive or disk quality.

Cheers, Wol
On 01/09/17 11:41, Wols Lists wrote:
On 31/08/17 21:40, John Andersen wrote:
On 08/31/2017 12:33 PM, Greg Freemyer wrote:
But 9GB media involve a second recording layer. As soon as I started trying to put 7 or 8 GB of data on a DVD, my reliability rates went in the toilet. I bought multiple different brands of drives and media.
Yes, if you aren't buying bargain basement media. The double layer disks from some vendors are pretty good. My failure rate with the Verbatims has been less than 1 in 40, one year after burning.
http://www.digitalfaq.com/reviews/dvd-media.htm https://www.amazon.com/gp/product/B000GHWRIK
Of course some of the early double layer drives sucked as well, so you need quality in both places.
I just ordered a DVD drive back in the day and it came with DL as a matter of course, so I guess it's pretty decent.
And so far, I've just used it to try to copy commercial DVDs, so my failure rate is 100% but I think that's down to copy protection, not drive or disk quality.
Cheers, Wol
I have managed to back up some of my commercial DVDs, even with my cheap-o £8.50 DVD drive. All I had to do was install all of the media codecs from Packman plus libdvdcss2, test that they play in VLC, and then rip using k3b.

Some DVDs had lines across them when things were moving, though. Perhaps an interlacing problem? I see it all the time in VLC but have never managed to solve it.
On 29/08/17 18:16, Paul Groves wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older files were lost; however, I have a tape from a couple of weeks ago with those older files already on it.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Which RAID? !!!

And have you read the RAID wiki about keeping everything safe? RAID DOES NOT EQUAL BACKUP !!!

The best RAID level is RAID-6, although it does come with some performance hit. And *make* *sure* you've got scrubbing and monitoring in place (active, not passive: you only need a glitch in your mail setup and you'll never receive the email telling you your RAID-6 is critical because you've got two dead drives ...).

Cheers, Wol
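(For reference, a minimal sketch of active scrubbing and monitoring with Linux md RAID; the array name and mail address are placeholders, and many distributions also ship a periodic check job that does the first step for you:)

  # start a scrub (consistency check) of /dev/md0
  echo check > /sys/block/md0/md/sync_action
  # watch its progress
  cat /proc/mdstat
  # run the mdadm monitor as a daemon and have it mail alerts
  mdadm --monitor --scan --daemonise --mail=admin@example.com
  # send a test alert so a broken mail setup is noticed now, not later
  mdadm --monitor --scan --oneshot --test --mail=admin@example.com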
On 30/08/17 16:23, Wols Lists wrote:
On 29/08/17 18:16, Paul Groves wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older files were lost; however, I have a tape from a couple of weeks ago with those older files already on it.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Which RAID? !!!
RAID 10 in this case (4 disks).
And have you read the RAID wiki about keeping everything safe? RAID DOES NOT EQUAL BACKUP !!!
No, but it stops the server from going offline for ages while you wait for a new drive to come in the post.
I am wasting my breath telling them they are not doing backups correctly. Perhaps they'll listen this time (pah!). I just make sure I back up my own work and get on with my own job at this stage.
The best RAID level is RAID-6, although it does come with some performance hit. And *make* *sure* you've got scrubbing and monitoring in place (active, not passive: you only need a glitch in your mail setup and you'll never receive the email telling you your RAID-6 is critical because you've got two dead drives ...).
Cheers, Wol
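(For the record, a 4-disk RAID 10 like that could be created roughly as below with mdadm; the device names are placeholders, and the openSUSE partitioner can do the same thing from YaST:)

  # build a 4-disk RAID 10 array
  mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[bcde]1
  # record the array so it assembles at boot (config path may vary by distro)
  mdadm --detail --scan >> /etc/mdadm.conf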
On 30/08/17 16:59, Paul Groves wrote:
On 30/08/17 16:23, Wols Lists wrote:
On 29/08/17 18:16, Paul Groves wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older files were lost; however, I have a tape from a couple of weeks ago with those older files already on it.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Which RAID? !!!
RAID 10 in this case (4 disks).
And have you read the RAID wiki about keeping everything safe? RAID DOES NOT EQUAL BACKUP !!!
No, but it stops the server from going offline for ages while you wait for a new drive to come in the post.
Can you get the new drive right now, and install it as a hot spare?

I get the impression you know all this, but if you lose two drives on RAID 10 you could be in trouble (33% chance) ... RAID-6 would survive.

Cheers, Wol
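(A minimal sketch of the hot-spare suggestion, assuming the array is /dev/md0 and the new disk shows up as /dev/sdf; names are placeholders:)

  # add the new disk; on a healthy array it becomes a hot spare
  mdadm /dev/md0 --add /dev/sdf1
  # confirm it is listed as a spare
  mdadm --detail /dev/md0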
On 30/08/17 17:49, Anthony Youngman wrote:
On 30/08/17 16:59, Paul Groves wrote:
On 30/08/17 16:23, Wols Lists wrote:
On 29/08/17 18:16, Paul Groves wrote:
OK, I never managed to get the data back from the tar file so I would assume it was corrupted during creation.
However all of the data has been recovered from several hours of detective work on my part going through everyone's computer and copying their data back to the new server.
A few older files were lost; however, I have a tape from a couple of weeks ago with those older files already on it.
I have spent the last hour saying 'I told you so' to the colleague that thought RAID was a waste of time and he has now supplied me with lots of beer to leave him alone. :D
And I set up the server with RAID and made a tape backup which I will be taking home and putting somewhere safe.
so it all ended up OK :)
Which RAID? !!!
RAID 10 in this case (4 disks).
And have you read the RAID wiki about keeping everything safe? RAID DOES NOT EQUAL BACKUP !!!
No, but it stops the server from going offline for ages while you wait for a new drive to come in the post.
Can you get the new drive right now, and install it as a hot spare?
I get the impression you know all this, but if you lose two drives on RAID 10 you could be in trouble (33% chance) ... RAID-6 would survive.
Cheers, Wol
Already sorted it all out. New drives arrived the other day. (I did post it). I made a tape backup and brought it home too.
participants (13)
- Anthony Youngman
- Anton Aylward
- Carl Hartung
- Carlos E. R.
- Dave Plater
- Felix Miata
- Greg Freemyer
- John Andersen
- Knurpht - Gertjan Lettink
- mh@mike.franken.de
- Paul Groves
- Richmond
- Wols Lists