[opensuse] backups via rsync leaving old junk in the backup?
All,
I have a directory that I back up every night via rsync to a remote server a couple thousand miles away. (For about 2 years now.)
It's a dynamic directory with files created and deleted.
I just noticed that the main directory is 247GB, but the remote copy is 354GB.
I assume rsync can cause the extra files on the remote to be deleted?
I'm currently using: rsync -avh --delete-after --stats --links --partial-dir=<transfer_dir> --timeout=1800 <BACKUP_DIR> <remote_user>@<remote_server>/<remote_dir>
Any idea what I need to change?
FYI: My local dir is actually the destination of a rdiff-backup run, so it has tons of old revisions in it.
Thanks
Greg
--
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper - http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
On Wed, Feb 18, 2009 at 6:08 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
> I just noticed that the main directory is 247GB, but the remote copy is 354GB.
> I assume rsync can cause the extra files on the remote to be deleted?
> I'm currently using: rsync -avh --delete-after --stats --links --partial-dir=<transfer_dir> --timeout=1800 <BACKUP_DIR> <remote_user>@<remote_server>/<remote_dir>
Greg,
I believe rsync is not designed to remove anything in the destination directory.
Have you considered using rsnapshot? http://rsnapshot.org/
Works quite nicely for us.
Boris.
On Wed, Feb 18, 2009 at 6:15 PM, Boris Epstein <borepstein@gmail.com> wrote:
> I believe rsync is not designed to remove anything in the destination directory.
> Have you considered using rsnapshot?
Boris,
The --delete-after flag is supposed to delete files from the dest not present in the source. (Or so I believe.)
I don't remember really testing this flag's use before, so maybe I don't understand what it actually does?
Greg
On Wednesday, 2009-02-18 at 18:20 -0500, Greg Freemyer wrote:
> The --delete-after flag is supposed to delete files from the dest not present in the source. (Or so I believe.)
Me too.
> I don't remember really testing this flag's use before, so maybe I don't understand what it actually does?
I have, but on local dirs, and it works. Wait... no, I use plain "--del".
--
Cheers, Carlos E. R.
----- Original Message -----
From: "Greg Freemyer" <greg.freemyer@gmail.com>
Cc: "opensuse" <opensuse@opensuse.org>
Sent: Wednesday, February 18, 2009 6:20 PM
Subject: Re: [opensuse] backups via rsync leaving old junk in the backup?
> On Wed, Feb 18, 2009 at 6:15 PM, Boris Epstein <borepstein@gmail.com> wrote:
>> On Wed, Feb 18, 2009 at 6:08 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
>>> All, I just noticed that the main directory is 247GB, but the remote copy is 354GB.
>>> I assume rsync can cause the extra files on the remote to be deleted?
>>> I'm currently using: rsync -avh --delete-after --stats --links --partial-dir=<transfer_dir> --timeout=1800 <BACKUP_DIR> <remote_user>@<remote_server>/<remote_dir>
>> I believe rsync is not designed to remove anything in the destination directory.
> The --delete-after flag is supposed to delete files from the dest not present in the source. (Or so I believe.)
> I don't remember really testing this flag's use before, so maybe I don't understand what it actually does?
I use --del and --delete-excluded _all_ the time and I assure you they delete things from the target. If they didn't I'd have big problems. Deleting stuff is just as important as anything else when it comes to maintaining an accurate and consistent copy of a dataset.
I would look at:
1) Does the rsync job actually finish without errors or warnings, and without being killed while in progress?
2) Are there multiple rsyncs running from previous days? (Probably not, since the timeout should be preventing exactly this, however...)
3) Is the timeout causing it to die before ending? In other words, before ever reaching the point where it would have performed the --delete-after. Take out --delete-after and just use --del, or increase the timeout, or even temporarily remove the timeout.
4) Do the filesystems on each side report space usage the same way? As in, are they the same kind of filesystem? Are they on the same kind of hardware platform? Were they created using the same blocking factors?
5) Are there processes on the target machine possibly holding hidden file handles open? As in, when a process has a file handle, and you delete the file, the file isn't really deleted until any such handles are released. However, it looks deleted to casual inspection. No new process can access it and it isn't shown anywhere except lsof/lslk output and indirect things like, you guessed it, mystery du/df discrepancies.
I think #3 is most likely.
--
Brian K. White brian@aljex.com http://profile.to/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro BBx Linux SCO FreeBSD #callahans Satriani Filk!
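Points 1 and 3 in the list above come down to knowing how each nightly run ended. A minimal sketch of one way to record that, assuming the job is a shell script: the exit codes are the ones documented in the EXIT VALUES section of the rsync man page, while the function name, the messages, and the log path in the comment are hypothetical.

```shell
#!/bin/sh
# Map an rsync exit status to a short diagnosis.
# Codes are taken from the rsync man page (EXIT VALUES); the function
# name and the wording of the messages are made up for this sketch.
classify_rsync_exit() {
    case "$1" in
        0)  echo "ok: run completed" ;;
        20) echo "killed: received SIGUSR1 or SIGINT" ;;
        23) echo "partial: transfer errors" ;;
        24) echo "partial: source files vanished during the run" ;;
        30) echo "timeout: data send/receive timed out" ;;
        35) echo "timeout: waiting for daemon connection" ;;
        *)  echo "error: rsync exit status $1" ;;
    esac
}

# Hypothetical use in the nightly job (not executed here):
#   rsync -avh --delete-after ... ; status=$?
#   classify_rsync_exit "$status" >> /var/log/rsync-backup.log

# Example: what a --timeout expiry would be reported as.
classify_rsync_exit 30
```

If the log shows status 30 night after night, that would support the theory that the run dies before the --delete-after pass. For point 5, `lsof +L1` on the target lists open files whose link count has dropped below one, which is the usual way to spot deleted-but-still-open files.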
On Wed, Feb 18, 2009 at 06:08:09PM -0500, Greg Freemyer wrote:
> I just noticed that the main directory is 247GB, but the remote copy is 354GB.
> I assume rsync can cause the extra files on the remote to be deleted?
> I'm currently using: rsync -avh --delete-after --stats --links --partial-dir=<transfer_dir> --timeout=1800 <BACKUP_DIR> <remote_user>@<remote_server>/<remote_dir>
> FYI: My local dir is actually the destination of a rdiff-backup run, so it has tons of old revisions in it.
You may want to use -H; if you back up a system that contains hardlinked files that'll save you from duplication on the receiver side. Several RPM packages on openSUSE do use these, for space saving reasons, e.g. timezone package:
# rpm -qlv timezone | grep /usr/share/zoneinfo/Africa/Brazzaville
-rw-r--r-- 2 root root 157 Oct 16 17:58 /usr/share/zoneinfo/Africa/Brazzaville
# ls -l /usr/share/zoneinfo/Africa/Brazzaville
-rw-r--r-- 2 root root 157 2008-10-16 17:58 /usr/share/zoneinfo/Africa/Brazzaville
          ^^^^
The '2' actually denotes the number of links present on the filesystem for this file.
In some cases, -S can be useful -- if you have so-called sparse files in the filesystems; files with "holes" (with unused, and unallocated space).
For backup purposes, --numeric-ids is also useful, because it makes sure that rsync doesn't mess with the user ids (trying to translate them to the other system).
--links is part of -a, so it is redundant in your command line.
--delete-after is fine, it implies --delete, it just changes the point in time when it happens.
If you used excludes, then you would need to add --delete-excluded to delete everything on the receiver side which isn't on the sender side.
The fact that the local directory is an rdiff-backup target does not matter, if you use -H to back it up -- otherwise, stuff will be duplicated.
-hi is most useful to see exactly what happens, more than -v. -avvhiH for the full treatment :-)
Peter
--
Contact: admin@opensuse.org (a.k.a. ftpadmin@suse.com)
#opensuse-mirrors on freenode.net
Info: http://en.opensuse.org/Mirror_Infrastructure
SUSE LINUX Products GmbH
Research & Development
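Taken together, the flag suggestions above might translate into a command line like the one this sketch prints. It only prints the command and does not run rsync. The source path, destination, and partial directory are hypothetical stand-ins for the placeholders in the original command, and --dry-run is an extra precaution added here, not something suggested in the thread.

```shell
#!/bin/sh
# Hypothetical values for illustration; substitute the real ones.
SRC="/srv/rdiff-backup-target"
DEST="backupuser@backup.example.com:/srv/offsite-copy"
PARTIAL_DIR=".rsync-partial"

# Print (not run) the adjusted command line.
build_rsync_cmd() {
    # -a already implies --links, so --links is dropped.
    # -H keeps hard links, -S handles sparse files,
    # -i itemizes each change, --numeric-ids skips uid/gid name mapping.
    # --dry-run is an addition made in this sketch for a safe first look;
    # remove it to actually transfer and delete.
    printf '%s\n' "rsync -avvhiH -S --numeric-ids --delete-after --stats --partial-dir=$3 --timeout=1800 --dry-run $1 $2"
}

build_rsync_cmd "$SRC" "$DEST" "$PARTIAL_DIR"
```

With -i in place, a dry run should list the files that would be removed on the receiver as `*deleting` lines, which gives a way to check the delete behaviour before committing to it.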
On Wed, Feb 18, 2009 at 6:54 PM, Peter Poeml <poeml@suse.de> wrote:
> You may want to use -H; if you back up a system that contains hardlinked files that'll save you from duplication on the receiver side.
> --delete-after is fine, it implies --delete, it just changes the point in time when it happens.
Thanks, I'll experiment with some of that. My first effort is to try just --delete and see if the "after" is an issue. (i.e. my rsyncs are large and it may be that they are not "completing" for some reason.)
Greg
Peter Poeml wrote:
> On Wed, Feb 18, 2009 at 06:08:09PM -0500, Greg Freemyer wrote:
>> I just noticed that the main directory is 247GB, but the remote copy is 354GB.
>> I assume rsync can cause the extra files on the remote to be deleted?
> You may want to use -H; if you back up a system that contains hardlinked files that'll save you from duplication on the receiver side.
Just to second Peter's opinion. Investigate the hard links. I have had exactly this issue when using rdiff-backup and rsync together. My guess was that there is some difference between the way rsync and librsync behave.
Cheers, Dave
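One way to investigate the hard links is to list every file that has more than one name. The sketch below builds a throwaway tree so it can run anywhere; the file names are made up, and on the real system the same `find` expression would be pointed at the backup directory instead.

```shell
#!/bin/sh
# Build a scratch tree with one hard-linked pair and one ordinary file.
# All paths here are throwaway examples.
demo=$(mktemp -d)
echo "revision data" > "$demo/file_a"
ln "$demo/file_a" "$demo/file_b"      # second name for the same inode
echo "standalone" > "$demo/file_c"

# -links +1 matches files that have more than one hard link.
find "$demo" -type f -links +1 | sort

# Number of names that would become separate full copies on the
# receiver if hard links are not preserved during the transfer.
linked_count=$(( $(find "$demo" -type f -links +1 | wc -l) ))
echo "hard-linked names: $linked_count"

rm -rf "$demo"
```

If the equivalent search over the real source directory turns up many large hard-linked files, that would be consistent with the duplication described above when -H is not used.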
On Wednesday February 18 2009, Greg Freemyer wrote:
> I just noticed that the main directory is 247GB, but the remote copy is 354GB.
If the allocation block size is different on the source and destination file systems, then internal fragmentation could account for the difference. What do you get when you divide the size discrepancy by the number of files?
Randall Schulz
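The division asked about above is quick to do in the shell. This is a sketch only: the two sizes are the ones quoted in the thread, but reading "GB" as 2^30 bytes and the file count of 1,000,000 are assumptions made for illustration, since the thread never states the real file count.

```shell
#!/bin/sh
# Average extra space per file, in bytes, for a given size gap (in GB,
# read here as 2^30 bytes) and a given number of files.
per_file_overhead() {
    gap_gb=$1
    file_count=$2
    echo $(( gap_gb * 1024 * 1024 * 1024 / file_count ))
}

gap=$(( 354 - 247 ))                 # 107 GB difference from the thread
per_file_overhead "$gap" 1000000     # hypothetical file count; prints 114890
```

Under that assumed file count the gap works out to roughly 112 KiB per file, far more than a block-size difference of a few KiB could explain. The real count can be obtained with something like `find <BACKUP_DIR> -type f | wc -l`; only if the per-file figure lands near the block size does internal fragmentation look like a plausible cause.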
participants (7)
- Boris Epstein
- Brian K. White
- Carlos E. R.
- Dave Howorth
- Greg Freemyer
- Peter Poeml
- Randall R Schulz