[Bug 1145194] New: network/rsync: stall during duplicity backup over NFS
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194

Bug ID: 1145194
Summary: network/rsync: stall during duplicity backup over NFS
Classification: openSUSE
Product: openSUSE.org
Version: unspecified
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: 3rd party software
Assignee: tchvatal@suse.com
Reporter: jon@moozaad.co.uk
QA Contact: bnc-team-screening@forge.provo.novell.com
Found By: ---
Blocker: ---

Created attachment 813627 --> http://bugzilla.opensuse.org/attachment.cgi?id=813627&action=edit
command line and strace, bt

When backing up a snapshot with duplicity, it gets about 16MB in and then hangs. I've tried resetting the backups with fresh instances, but the issue is definitely with rsync. The NFS drive it connects to works fine throughout. It stopped working around April/March this year, but I don't think it's related to the patch from then.

Looking at the trace, it seems to be stuck on a chown... ? Maybe it's related to the % in the file's name? Though it appears there are plenty of others that transferred fine.

-rw------- 1 nobody users 18227 Aug 11 12:38 %gconf-tree-nso.xml
-rw------- 1 nobody users 7043 Aug 11 12:38 %gconf-tree-oc.xml
-rw------- 1 nobody users 98592 Aug 11 12:38 %gconf-tree-or.xml
-rw------- 1 nobody users 87970 Aug 11 12:38 %gconf-tree-pa.xml
-rw------- 1 nobody users 65723 Aug 11 12:38 .%gconf-tree-pl.xml.f2lGL8

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c1

--- Comment #1 from Tomáš Chvátal <tchvatal@suse.com> ---
I just tried to create some files with % in their names and transfer them, and it looked just fine. The trace also does not look like something sinister is at play...

What version of openSUSE/rsync are you using that exhibits this problem?
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c2

--- Comment #2 from Jon Brightwell <jon@moozaad.co.uk> ---
Version: 3.1.3-5.1 on Tumbleweed. The target is NFS on a deb box.

It's weird that it fails on that file when there are plenty of similarly named files with the same permissions already done. Maybe a race condition or something on the chown?

It runs daily; here's today's trace.

#0 0x00007f59748a0377 in __GI___select (nfds=nfds@entry=5, readfds=readfds@entry=0x7ffd2c9d61b0, writefds=writefds@entry=0x7ffd2c9d60b0, exceptfds=exceptfds@entry=0x7ffd2c9d6130, timeout=timeout@entry=0x7ffd2c9d6080) at ../sysdeps/unix/sysv/linux/select.c:41
#1 0x00005580c4cb6fb7 in perform_io (needed=205, flags=flags@entry=4) at io.c:751
#2 0x00005580c4cb8572 in send_msg (code=MSG_ERROR_XFER, buf=0x7ffd2c9d6bc0 "rsync: chown \"/mnt/naspuppy-home/moozaad/mudkip-backups/backintime/mudkip.farm/root/1/new_snapshot/backup/etc/gconf/gconf.xml.schemas/.%gconf-tree-or.xml.QpfvzU\" failed: Operation not permitted (1)\n", len=198, convert=<optimized out>) at io.c:967
#3 0x00005580c4cbce5c in rwrite (code=FERROR_XFER, buf=0x7ffd2c9d6bc0 "rsync: chown \"/mnt/naspuppy-home/moozaad/mudkip-backups/backintime/mudkip.farm/root/1/new_snapshot/backup/etc/gconf/gconf.xml.schemas/.%gconf-tree-or.xml.QpfvzU\" failed: Operation not permitted (1)\n", len=198, is_utf8=<optimized out>) at log.c:278
#4 0x00005580c4cbd5c4 in rsyserr (code=FERROR_XFER, errcode=1, format=<optimized out>) at log.c:465
#5 0x00005580c4cd63c0 in set_file_attrs (fname=fname@entry=0x7ffd2c9d95a0 "etc/gconf/gconf.xml.schemas/.%gconf-tree-or.xml.QpfvzU", file=file@entry=0x7f59743bb170, sxp=<optimized out>, sxp@entry=0x0, fnamecmp=fnamecmp@entry=0x7ffd2c9db5a0 "etc/gconf/gconf.xml.schemas/%gconf-tree-or.xml", flags=flags@entry=4) at rsync.c:525
#6 0x00005580c4cd66f6 in finish_transfer (fname=0x7ffd2c9db5a0 "etc/gconf/gconf.xml.schemas/%gconf-tree-or.xml", fnametmp=0x7ffd2c9d95a0 "etc/gconf/gconf.xml.schemas/.%gconf-tree-or.xml.QpfvzU", fnamecmp=0x7ffd2c9db5a0 "etc/gconf/gconf.xml.schemas/%gconf-tree-or.xml", partialptr=0x0, file=0x7f59743bb170, ok_to_set_time=1, overwriting_basis=1) at rsync.c:674
#7 0x00005580c4cd1623 in recv_files (f_in=f_in@entry=0, f_out=f_out@entry=4, local_name=local_name@entry=0x0) at receiver.c:859
#8 0x00005580c4cd2131 in do_recv (f_in=f_in@entry=0, f_out=4, f_out@entry=1, local_name=local_name@entry=0x0) at main.c:912
#9 0x00005580c4cd2d9e in do_server_recv (argv=<optimized out>, argc=<optimized out>, f_out=1, f_in=0) at main.c:1081
#10 start_server (f_in=f_in@entry=0, f_out=f_out@entry=1, argc=<optimized out>, argv=<optimized out>) at main.c:1115
#11 0x00005580c4cd2f07 in child_main (argc=<optimized out>, argv=<optimized out>) at main.c:1088
#12 0x00005580c4ca27bc in local_child (child_main=<optimized out>, f_out=<synthetic pointer>, f_in=<synthetic pointer>, argv=0x7ffd2c9dc940, argc=2) at pipe.c:109
#13 do_cmd (f_out_p=<synthetic pointer>, f_in_p=<synthetic pointer>, remote_argc=<optimized out>, remote_argv=<optimized out>, user=0x0, machine=<optimized out>, cmd=<optimized out>) at main.c:543
#14 start_client (argv=<optimized out>, argc=<optimized out>) at main.c:1432
#15 main (argc=<optimized out>, argv=<optimized out>) at main.c:1670
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c3

--- Comment #3 from Tomáš Chvátal <tchvatal@suse.com> ---
Interesting, it worked here; the only difference is that I shoveled it from TW->TW and didn't bother with NFS. I would have to fire up an NFS mount to test that as well...

Also, it seems there have been no changes to the chown operation since 2011: https://git.samba.org/rsync.git/?p=rsync.git&a=search&h=HEAD&st=commit&s=cho...
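The trace in comment #2 shows rsync reporting "chown ... failed: Operation not permitted (1)" for a temporary file on the NFS mount. One way to check this outside rsync is to attempt the same operation on a scratch file in the destination directory. The following is a minimal sketch and not part of the original report: the helper name `can_chown` is hypothetical, and the uid/gid passed in are assumptions that should match the owner of the source files.

```python
import os
import tempfile


def can_chown(directory, uid, gid):
    """Return True if a scratch file in `directory` can be chown'ed to uid:gid.

    Hypothetical helper for illustration only: it mimics the kind of chown
    rsync attempts on its temporary file, not rsync's actual implementation.
    """
    fd, path = tempfile.mkstemp(prefix=".chown-probe.", dir=directory)
    os.close(fd)
    try:
        os.chown(path, uid, gid)
        return True
    except PermissionError:
        # Raised for EPERM (and EACCES); EPERM is the
        # "Operation not permitted (1)" seen in the trace.
        return False
    finally:
        # Always clean up the scratch file.
        os.remove(path)
```

Calling `can_chown()` with the NFS mount point and the source file owner's ids, run as the same user that runs the backup, should show whether the mount refuses ownership changes independently of the file name.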
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c6

--- Comment #6 from Jon Brightwell <jon@moozaad.co.uk> ---
I stopped using this approach as it consistently failed. If there are other bugs like this that need a reproducible case, then I could probably set it up again if needed.
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c7

--- Comment #7 from Pedro Monreal Gonzalez <pmonrealgonzalez@suse.com> ---
Looking at the backtraces, it complains about "rsync: chown \"${whatever_file}\" failed: Operation not permitted (1)\n". I think this is expected: the NFS user is nobody (not sure which one it is on deb systems, it might be the same), and you may not have permission to change the user or group ownership of files on the remote server.

I've seen CLI options to not change the ownership on the target system, like: --no-perms --no-group --no-owner. Another idea would be to not use 'rsync -av', since -a implies the -o and -g options, which try to set the user and group ownership on the remote server (as well as -p, which sets the permission bits); using -rltDv instead of -av should work. If keeping the ownership on the server is needed, you can also use rsync over ssh.

Also, WayneD has added two new related options, but these are not available in the latest version as of yet, see: https://github.com/WayneD/rsync/issues/135.
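The option combinations suggested in this comment can be written down as a small command builder. This is a sketch only: the function name `rsync_command` and the example paths are hypothetical, and it relies on the rsync man page's description of -a as shorthand for -rlptgoD and of the --no-OPTION form (for example --no-owner, --no-group, --no-perms) for switching off an implied option.

```python
import shlex


def rsync_command(src, dest, keep_owner=False, keep_group=False, keep_perms=True):
    """Build an rsync argument list that skips selected attribute updates.

    -a is shorthand for -rlptgoD; the --no-* options turn off the implied
    -o (owner), -g (group) and -p (permission bits) respectively.
    """
    cmd = ["rsync", "-av"]
    if not keep_owner:
        cmd.append("--no-owner")
    if not keep_group:
        cmd.append("--no-group")
    if not keep_perms:
        cmd.append("--no-perms")
    cmd += [src, dest]
    return cmd


# Example paths are placeholders, not the ones from the report.
print(shlex.join(rsync_command("/srv/snapshot/", "/mnt/nfs/backup/")))
```

The returned list can be passed to `subprocess.run()`. With the owner and group options dropped, rsync should no longer attempt the chown that fails in the trace; whether that also avoids the stall in this setup is untested here.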
http://bugzilla.opensuse.org/show_bug.cgi?id=1145194#c8

--- Comment #8 from Pedro Monreal Gonzalez <pmonrealgonzalez@suse.com> ---
@Jon, could you try to see if any of these options work in your setup?