On 02/18/2014 11:37 AM, Ted Byers wrote:
On 14-02-18 11:07 AM, Anton Aylward wrote:
A couple of days ago there was mention of "~/.thumbnails" and I mentioned that it was one folder that there was no need to back up as it would be recreated.
No doubt there are other folders like that, caches and indexes, and it set me thinking. What else do we not need to back up?
Whether backing up to a DVD or to a 'cloud' service (using rsync perhaps), reducing the volume is a positive.
Some of the volume can be cleaned up with tools like Bleachbit, removing known caches, backup files and more.
But what about index files? Thunderbird, for example, leaves a lot of index and other files lying about. Which of those can be cleaned up? Bleachbit says it can deal with the index files, but what else is there?
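For what it's worth, the Thunderbird index files seem to be the *.msf summaries under the profile, so something along these lines shows how many there are (the path is a guess on my part; profiles live in their own subdirectories):

    # count Thunderbird mail summary/index files; they are rebuilt on demand
    find ~/.thunderbird -name '*.msf' | wc -l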
There are also a lot of "~/.<something>" directories which might be, or might contain, candidates.
What do you think/know of?
I have no idea, so I can't answer that question.
But, I am as interested in the answer to your question as I am in a closely related question, which is, "What must be backed up?" One should, if one is thorough, end up with everything in the system being in the one list or the other, and that would tell us what is the easiest to set up in a backup script that is intended to automate the process. It is somewhat like a security policy question: 'whatever is not explicitly permitted is forbidden' vs 'whatever is not explicitly forbidden is permitted'. The two often have very different consequences, both for security and cost of implementation. So, if I am to automate backups, do I construct a list of what must be backed up, or do I backup everything except for what is on an exclusion list?
I'm taking a 'whatever is not explicitly forbidden is permitted' approach to backup, so I'm trying to determine what is not needed: what is 'dynamic' and will be reconstructed. The index files and thumbnails come into that category. As was pointed out, if you try deleting the thumbnails while Dolphin/Konqueror is active, they will be reconstructed as you go along. The other thing is 'scope'. Backing up the system and backing up a user are two different cases. A lot of the system can be recovered from the DVD or the repositories, and anything that is or could be a tmpfs shouldn't be backed up. That pretty much leaves /etc and some things under /var, assuming you haven't done customizations elsewhere such as under /usr/share.
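As a rough sketch of that exclusion-list approach (the exclude file contents and the destination are only illustrative):

    # ~/backup-excludes.txt -- things that will be rebuilt anyway
    .thumbnails/
    .cache/
    .local/share/Trash/

    # -a preserves permissions and times, --delete mirrors removals,
    # --exclude-from skips everything listed in the file above
    rsync -a --delete --exclude-from="$HOME/backup-excludes.txt" "$HOME/" /mnt/backup/anton/

The 'list what must be backed up' alternative is the same command with --files-from pointed at an include list instead.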
From my perspective, when I get to design my backup strategy, I want to know the least costly strategy that will a) enable me to restore a failed system to its state prior to the failure, and b) provide absolute protection for potentially very costly data (in some cases, data that is both expensive to collect and represents significant legal liabilities if lost, and which of course must be kept confidential).
"Least Costly" is why I'm asking this. Backing up to cloud eats my monthly allotment of bandwidth. Last month I got a warning about that. Rsync is very good but it does have to ask what is at the other end. That eventually adds up. Buying another 1T drive is cheaper, but I'm going to run out of SATA ports on my motherboard :-) I like the idea of backing up to DVD. Right now, after 'pruning' what I know about, ".thumbnails" and some caches, the non NFS mounted part of ~anton/ fits on a DVD. Just. That won't last for long. I've configures, as I've mentioned, many of the subdirectories -- ~/Documents, ~/Music, and more -- into 5G partitions, looking forward to doing partition by partition backup when they are full enough to justify it. But right now they aren't that full. No doubt there is software ... "backuponcd" might be a candidate. I wonder if its even worth buying CDs rather then DVDs?
But you also talk of cleaning up parts of the file system, which seems to be the antithesis of a backup, or perhaps complementary(?). That suggests that an automated script that backs up parts of the file system perhaps ought to have a section that cleans up the parts that are not backed up; it then becomes more a matter of system maintenance than just backup, which raises another question: what tool should be used, and what set of rules should govern its behaviour?
Take a look at 'bleachbit'. Also take a look at 'fslint', which can identify duplicates, dead symlinks and more. All of this can reduce the volume of what has to be backed up.
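Both have command-line modes that could go straight into a backup script. A preview-first sketch (the cleaner names come from the --list-cleaners output, so check the exact ones on your system):

    # see what BleachBit knows how to clean, then preview without deleting anything
    bleachbit --list-cleaners
    bleachbit --preview system.cache
    # dead (dangling) symlinks can also be found without fslint:
    find -L ~ -type l 2>/dev/null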
Is there a website that provides easy access to whatever collected wisdom has been accumulated on these questions?
Perhaps. My google-fu hasn't uncovered it. So far the best is of the "Bleedin' obvious, ain't it?" variety: http://www.rackspace.com/knowledge_center/article/best-practices-for-cloud-b... See "Choosing What to Backup".
--
What is character but the determination of incident? What is incident but the illustration of character?
  - Henry James