[opensuse-buildservice] osc co/up hangs and eventually corrupts working copy
This weekend I tried to run the following command several times:

osc co --unexpand-link home:duwe:crosstools

The cross-* packages, which are linked, were downloaded without issues, but as soon as the master-binutils package is supposed to be downloaded, osc hangs. lsof shows that 2MB of binutils-tar.bz2 are downloaded quickly, but then osc hangs for many hours (or even days, I haven't tried) without progress or error messages.

In parallel, I rolled my own via the web interface, which works really nicely. Retrieving packages from remote saves me the slow upload from my side.

Once I ran the same osc command for my project, I experienced the same hangs. Most of the time 2MB of a file were downloaded, then osc hung. If interrupted with Ctrl-C, osc may eventually be able to continue with 'osc up'. I tried it many times. osc downloads a block of 1048576 or 2097152 bytes, then it stops. This morning I was finally able to finish a fresh checkout after a few attempts.

But the thing is: once more packages get added via the web interface, an 'osc up' inside the local working copy will hang as well. If interrupted, it will corrupt the working copy by adding a new incomplete directory. Several 'osc up' runs will stop with 'directory XY does already exist'. Removing that XY directory seldom helps.

Is there a way to fix the local working copy from such breakage? Doing a fresh checkout is apparently no option, as it starts to hang again.

Perhaps the servers should return some sort of EBUSY right away when they can't service the 'checkout/update' request. This would prevent the local corruption and it would probably reduce the load on the servers, because repeated checkout attempts may put even more pressure on them.

I'm using osc 0.125.5 on openSuSE 11.2-x86_64.

Olaf

-- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-buildservice+help@opensuse.org
On 26/04/10 16:16, Olaf Hering wrote:
This weekend I tried to run the following command several times:
osc co --unexpand-link home:duwe:crosstools
The cross-* packages, which are linked, were downloaded without issues, but as soon as the master-binutils package is supposed to be downloaded, osc hangs. lsof shows that 2MB of binutils-tar.bz2 are downloaded quickly, but then osc hangs for many hours (or even days, I haven't tried) without progress or error messages.
...
I'm using osc 0.125.5 on openSuSE 11.2-x86_64.
Not a solution, but AFAIK osc 0.126 implemented progress bars for osc co. It's available in OBS openSUSE:Tools. It might give you some more information (along with the debug option flags; I forget the syntax) as to what is going on. Or maybe upgrading will fix the issue entirely. Regards, Tejas
Hi Olaf! On Monday, 26 April 2010 17:16:11, Olaf Hering wrote:
This weekend I tried to run the following command several times:
osc co --unexpand-link home:duwe:crosstools
The cross-* packages, which are linked, were downloaded without issues, but as soon as the master-binutils package is supposed to be downloaded, osc hangs. lsof shows that 2MB of binutils-tar.bz2 are downloaded quickly, but then osc hangs for many hours (or even days, I haven't tried) without progress or error messages.
In parallel, I rolled my own via the web interface, which works really nicely. Retrieving packages from remote saves me the slow upload from my side.
Once I ran the same osc command for my project, I experienced the same hangs. Most of the time 2MB of a file were downloaded, then osc hung. If interrupted with Ctrl-C, osc may eventually be able to continue with 'osc up'. I tried it many times. osc downloads a block of 1048576 or 2097152 bytes, then it stops. This morning I was finally able to finish a fresh checkout after a few attempts.
Can you please retry with "osc -d ..." and look at where these packages are coming from? It might be an api problem or a mirror problem, because osc tries first to use download.opensuse.org redirector.
But the thing is: once more packages get added via the web interface, an 'osc up' inside the local working copy will hang as well. If interrupted, it will corrupt the working copy by adding a new incomplete directory. Several 'osc up' runs will stop with 'directory XY does already exist'. Removing that XY directory seldom helps.
Hm, do you use some proxy or anything else that makes your setup special? I am not aware that anyone else has such problems atm.
Is there a way to fix the local working copy from such breakage? Doing a fresh checkout is apparently no option, as it starts to hang again.
Perhaps the servers should return some sort of EBUSY right away when they can't service the 'checkout/update' request. This would prevent the local corruption and it would probably reduce the load on the servers, because repeated checkout attempts may put even more pressure on them.
right, local corruption should not happen in any case. But the API and source server usually have a relatively low load. We have split them off from the loaded binary/scheduling server to guarantee short response times. So, this can not be a load problem IMHO, but maybe we have a bug somewhere. I just wonder what is triggering this and how we can reproduce it. But it can be of course also just a broken router on your side ...
I'm using osc 0.125.5 on openSuSE 11.2-x86_64.
Should be fine bye adrian -- Adrian Schroeter SUSE Linux Products GmbH email: adrian@suse.de
On Tue, Apr 27, Adrian Schröter wrote:
Once I ran the same osc command for my project, I experienced the same hangs. Most of the time 2MB of a file were downloaded, then osc hung. If interrupted with Ctrl-C, osc may eventually be able to continue with 'osc up'. I tried it many times. osc downloads a block of 1048576 or 2097152 bytes, then it stops. This morning I was finally able to finish a fresh checkout after a few attempts.
Can you please retry with "osc -d ..." and look at where these packages are coming from?
-d shows all files coming from https://api.opensuse.org/source/home...
It might be an api problem or a mirror problem, because osc tries first to use download.opensuse.org redirector.
download.opensuse.org is not listed in the local named log, only api.opensuse.org is.
But the thing is: once more packages get added via the web interface, an 'osc up' inside the local working copy will hang as well. If interrupted, it will corrupt the working copy by adding a new incomplete directory. Several 'osc up' runs will stop with 'directory XY does already exist'. Removing that XY directory seldom helps.
Hm, do you use some proxy or anything else that makes your setup special? I am not aware that anyone else has such problems atm.
I looked at it briefly. It's the way things are downloaded. It probably goes like this:

get package_list
foreach pkg in package_list
    mkdir pkg
    get filelist > pkg/.osc/_files
    foreach file in filelist
        get file > pkg/file
    update .osc/_packages

To make Ctrl-C during checkout or update robust, it could go like this:

foreach pkg in package_list
    get filelist > ${tmp}/_files
    update .osc/_packages
    mkdir pkg
    cp ${tmp}/_files pkg/.osc/_files
    foreach file in filelist
        get file > pkg/file

If I update the .osc/_packages file manually, osc knows about an incomplete package and tries to update it. The osc metadata looks very simple and can be edited with a text editor, so it's possible to fix things up manually.
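The reordering proposed above can be sketched in Python as follows. This is only an illustration under stated assumptions, not osc's actual code: `checkout_package`, `missing_files`, and the `fetch_filelist` / `fetch_file` callables are hypothetical names, and the metadata is stored as plain text with one name per line purely for simplicity (osc's real metadata files have their own format). The point is the ordering: the package is registered in `.osc/_packages` and its file list is stored before any file is downloaded, so an interrupted run leaves a state that a later run can detect and complete.

```python
import shutil
import tempfile
from pathlib import Path


def checkout_package(wc_root, pkg, fetch_filelist, fetch_file):
    """Check out one package, registering it before any file is written.

    fetch_filelist(pkg) -> list of file names   (hypothetical callable)
    fetch_file(pkg, name) -> bytes              (hypothetical callable)
    """
    wc_root = Path(wc_root)
    meta_dir = wc_root / ".osc"
    meta_dir.mkdir(parents=True, exist_ok=True)
    packages_file = meta_dir / "_packages"

    with tempfile.TemporaryDirectory() as tmp:
        # 1. Fetch the file list into a temporary location first.
        filelist = fetch_filelist(pkg)
        tmp_files = Path(tmp) / "_files"
        tmp_files.write_text("\n".join(filelist) + "\n")

        # 2. Record the package in the project metadata.
        known = packages_file.read_text().split() if packages_file.exists() else []
        if pkg not in known:
            known.append(pkg)
            packages_file.write_text("\n".join(known) + "\n")

        # 3. Create the package directory and install its file list.
        pkg_meta = wc_root / pkg / ".osc"
        pkg_meta.mkdir(parents=True, exist_ok=True)
        shutil.copy(tmp_files, pkg_meta / "_files")

    # 4. Download the files; an interrupt here leaves a known, resumable state.
    for name in filelist:
        (wc_root / pkg / name).write_bytes(fetch_file(pkg, name))


def missing_files(wc_root, pkg):
    """Return the files listed in pkg/.osc/_files that are not on disk yet."""
    pkg_dir = Path(wc_root) / pkg
    listed = (pkg_dir / ".osc" / "_files").read_text().split()
    return [name for name in listed if not (pkg_dir / name).exists()]
```

After an interrupted run, `missing_files(wc_root, pkg)` reports what is still to be fetched, and calling `checkout_package` again completes the package instead of failing because the directory already exists.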
Is there a way to fix the local working copy from such breakage? Doing a fresh checkout is apparently no option, as it starts to hang again.
Perhaps the servers should return some sort of EBUSY right away when they can't service the 'checkout/update' request. This would prevent the local corruption and it would probably reduce the load on the servers, because repeated checkout attempts may put even more pressure on them.
right, local corruption should not happen in any case.
It's the way files and metadata are downloaded. If they are downloaded and stored in the correct order, osc can be more robust and handle such situations.
So, this can not be a load problem IMHO, but maybe we have a bug somewhere. I just wonder what is triggering this and how we can reproduce it.
I can trigger it most of the time. According to the 'time osc -v -v -d -H up --unexpand-link' output, it can hang in any of the GET requests:
PRJ/PKG/?rev=latest
PRJ/PKG/_meta
PRJ/PKG/filename
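Since the hang can occur in any GET request without an error message, one way a client could at least surface the stall is a socket timeout combined with writing to a temporary file. The following is a minimal sketch under assumptions, not how osc downloads files: `copy_stream` and `download` are hypothetical helpers, the 1 MiB chunk size and 60 second timeout are arbitrary, and authentication against the API is not handled. The `timeout` argument of `urllib.request.urlopen` applies to blocking socket operations, so a stalled read raises an exception instead of blocking indefinitely.

```python
import os
import urllib.request


def copy_stream(stream, dest, chunk_size=1024 * 1024):
    """Copy a file-like object to dest via a temporary '.part' file."""
    part = dest + ".part"
    total = 0
    with open(part, "wb") as out:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    # Only reached when the whole stream was read, so dest is never partial.
    os.replace(part, dest)
    return total


def download(url, dest, timeout=60):
    """Fetch url into dest; a stalled connection raises after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return copy_stream(response, dest)
```

With this shape, an interrupted or timed-out transfer leaves only a `.part` file behind, which is easy to tell apart from a complete download.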
But it can be of course also just a broken router on your side ...
Maybe, but my workload has not shown any issues so far with the network connection. Olaf
On 2010-04-27 18:53:07 +0200, Olaf Hering wrote:
On Tue, Apr 27, Adrian Schröter wrote:
Once I ran the same osc command for my project, I experienced the same hangs. Most of the time 2MB of a file were downloaded, then osc hung. If interrupted with Ctrl-C, osc may eventually be able to continue with 'osc up'. I tried it many times. osc downloads a block of 1048576 or 2097152 bytes, then it stops. This morning I was finally able to finish a fresh checkout after a few attempts.
Can you please retry with "osc -d ..." and look at where these packages are coming from?
-d shows all files coming from https://api.opensuse.org/source/home...
That's correct.
It might be an api problem or a mirror problem, because osc tries first to use download.opensuse.org redirector.
download.opensuse.org is not listed in the local named log, only api.opensuse.org is.
That's correct, too.
But the thing is: once more packages get added via the web interface, an 'osc up' inside the local working copy will hang as well. If interrupted, it will corrupt the working copy by adding a new incomplete directory. Several 'osc up' runs will stop with 'directory XY does already exist'. Removing that XY directory seldom helps.
Hm, do you use some proxy or anything else that makes your setup special? I am not aware that anyone else has such problems atm.
<SNIP>
right, local corruption should not happen in any case.
It's the way files and metadata are downloaded. If they are downloaded and stored in the correct order, osc can be more robust and handle such situations.
Yes, osc should handle this more gracefully. I'm currently working on a small working copy restructuring which will make the whole update easier and more robust. At the moment I'm a bit busy, but it's WIP... :)
So, this can not be a load problem IMHO, but maybe we have a bug somewhere. I just wonder what is triggering this and how we can reproduce it.
I can trigger it most of the time. According to the 'time osc -v -v -d -H up --unexpand-link' output, it can hang in any of the GET requests:
PRJ/PKG/?rev=latest
PRJ/PKG/_meta
PRJ/PKG/filename
Unfortunately I can't reproduce it locally:/ Marcus
participants (4)
- Adrian Schröter
- Marcus Hüwe
- Olaf Hering
- Tejas Guruswamy