[opensuse-buildservice] Cannot upload big source files to private OBS instance

Hi, I am trying to build qt5.12 in our own OBS instance (running OBS version 2.9.5). When I try to upload the complete qt-everywhere-src-5.12.3.tar.xz tarball (~485MB) from the libqt5-qtdoc package to the server, the upload hangs forever, regardless of whether I use osc commit or the file upload in the web UI. The file never reaches the /srv/obs/source directory. Some time ago I successfully uploaded files only a little smaller (~442MB). There is enough disk space on the server, and free shows about 2.5 GB of free memory (not counting buffers). There are no errors in the logs, but also no entry for this file. The network shows high traffic to the OBS server for some tens of seconds, then low traffic, so it seems that the file, or at least a big part of it, was transferred. Any hints on how to debug this? Is there a place where the file is stored temporarily during the API PUT operation? Do I need to increase some limits? Thanks Karsten -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org

Hey Karsten! On 25.04.19 23:13, Karsten Keil wrote:
Any hints howto debug this ?
Logs? :-) Depending on your setup, there should be an apache log (access.log) and a rails log (production.log) in /srv/www/obs/api/log/
Is here a place where the file is saved temporary during the API put operation?
Yes, /srv/www/obs/api/tmp
Do I need to increase some limits ?
Maybe, depends on what is happening and your server configuration. By default, on the appliance, there are no limits. Do you have XForward configured? Henne -- Henne Vogelsang http://www.opensuse.org Everybody has a plan, until they get hit. - Mike Tyson
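To check whether anything is actually being written to the temporary directory named above while an upload runs, a small polling script can help. This is a minimal sketch, not an OBS tool: the function names `snapshot_sizes`, `changed_files` and `watch` are made up for illustration, and the polling interval and number of rounds are arbitrary assumptions.

```python
import os
import time


def snapshot_sizes(directory):
    """Return {path: size_in_bytes} for every regular file below directory."""
    sizes = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                sizes[path] = os.path.getsize(path)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return sizes


def changed_files(before, after):
    """Return {path: bytes_added} for files that are new (non-empty) or grew."""
    return {
        path: size - before.get(path, 0)
        for path, size in after.items()
        if size > before.get(path, 0)
    }


def watch(directory, interval=5.0, rounds=12):
    """Print files that appear or grow in directory between polls."""
    previous = snapshot_sizes(directory)
    for _ in range(rounds):
        time.sleep(interval)
        current = snapshot_sizes(directory)
        for path, delta in sorted(changed_files(previous, current).items()):
            print(f"{path}: +{delta} bytes")
        previous = current
```

Calling `watch("/srv/www/obs/api/tmp")` on the server while the upload is in progress would print any file that grows there. No output over a whole run suggests the data is being buffered somewhere else, or not arriving at all.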

Hey Henne, Last week I was ill, so here is some belated news about the issue. The upload did finish, but only after a very long time (17 hours). This is much too long: other files (e.g. a 120 MB source file) finished in less than a minute. The OBS server had 2.5 GB of RAM free at that time and lots of space in the filesystem.
On 25.04.19 23:13, Karsten Keil wrote:
Any hints howto debug this ?
Logs? :-) Depending on your setup, there should be an apache log (access.log) and a rails log (production.log) in /srv/www/obs/api/log/
Before the upload finished there was no sign of the file in the logs. Now there is, but the log entry seems to have been written only after the upload finished, and the entries look weirdly reordered in the ssl_request log:
[27/Apr/2019:01:11:01 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/update_building HTTP/1.1" 182 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:11:21 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/events?range=24&arch=x86_64 HTTP/1.1" 9535 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:11:21 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/events?range=24&arch=x86_64 HTTP/1.1" 9535 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[26/Apr/2019:08:36:26 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "PUT /source/FTGS:common_2019.001/libqt5-qtdoc/qt-everywhere-src-5.12.3.tar.xz?rev=repository HTTP/1.1" 112 "-" "osc/0.161.1"
[26/Apr/2019:08:36:26 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "PUT /source/FTGS:common_2019.001/libqt5-qtdoc/qt-everywhere-src-5.12.3.tar.xz?rev=repository HTTP/1.1" 112 "-" "osc/0.161.1"
[27/Apr/2019:01:12:41 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/update_building HTTP/1.1" 182 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:12:41 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/update_building HTTP/1.1" 182 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:12:41 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/events?range=24&arch=x86_64 HTTP/1.1" 9535 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:12:41 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "GET /monitor/events?range=24&arch=x86_64 HTTP/1.1" 9535 "https://buildservice.ftgs.net/monitor" "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0"
[27/Apr/2019:01:12:00 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "PUT /source/FTGS:common_2019.001/libqt5-qtdoc/qt-everywhere-src-5.12.3.tar.xz?rev=repository HTTP/1.1" 92 "-" "osc/0.161.1"
[27/Apr/2019:01:12:00 +0200] 192.168.3.47 TLSv1.2 ECDHE-RSA-AES256-GCM-SHA384 "PUT /source/FTGS:common_2019.001/libqt5-qtdoc/qt-everywhere-src-5.12.3.tar.xz?rev=repository HTTP/1.1" 92 "-" "osc/0.161.1"
In my last attempt I started a new osc commit of the file on Friday (26.4.) morning, so the [26/Apr/2019:08:36:26 +0200] entry would correspond to that. This time I did not kill the upload after a few hours, and it indeed seems to have finished during the night after ~17 hours. The production log has a corresponding entry for [27/Apr/2019:01:12:41 +0200].
I started osc with HTTP debugging and it stalled in the PUT request. A tcpdump shows large data packets for about one minute, then only a few (6-8) small packets (1k) flowing between the client and the server. Very strange. I do not have many ideas about the reason; I would expect an upload time of no more than 3-4 minutes if a 150 MB file finishes in less than 1 minute. Or is there some weird cgroup/systemd magic hitting us?
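The time span between the two PUT timestamps in the log excerpt, and the upload time one would expect instead, can be checked with a few lines of Python. This is a minimal sketch: the helper name `parse_log_time` is made up for illustration, it assumes the default C/English locale for the month abbreviation, and the extrapolation assumes upload time scales linearly with file size.

```python
from datetime import datetime

# Timestamp layout used in the ssl_request log excerpt.
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_log_time(stamp):
    """Parse a '[26/Apr/2019:08:36:26 +0200]' style timestamp."""
    return datetime.strptime(stamp.strip("[]"), LOG_TIME_FORMAT)


first_put = parse_log_time("[26/Apr/2019:08:36:26 +0200]")
last_put = parse_log_time("[27/Apr/2019:01:12:00 +0200]")
elapsed = last_put - first_put
print("between the two PUT entries:", elapsed)  # 16:35:34

# Linear extrapolation from "150 MB in under 1 minute" to the 485 MB tarball.
rate_mb_per_s = 150 / 60
expected_seconds = 485 / rate_mb_per_s
print("expected upload time:", expected_seconds / 60, "minutes")
print("slowdown factor:", elapsed.total_seconds() / expected_seconds)
```

The gap of roughly 16.5 hours matches the "~17 hours" estimate, while the extrapolation lands in the expected range of 3-4 minutes, so the stalled upload was slower by a factor of a few hundred.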
Is here a place where the file is saved temporary during the API put operation?
Yes, /srv/www/obs/api/tmp
There was no sign of the file in this place while the upload was stalled. I looked with ls -atrl, and even lsof did not find any open file that looked related.
Do I need to increase some limits ?
Maybe, depends on what is happening and your server configuration. By default, on the appliance, there are no limits. Do you have XForward configured?
use_xforward: true in options.yml
Karsten

Hi, On 06.05.19 at 14:36, Keil, Karsten wrote:
Hey Henne,
Last week I was ill, so here is some belated news about the issue. The upload did finish, but only after a very long time (17 hours). This is much too long: other files (e.g. a 120 MB source file) finished in less than a minute. The OBS server had 2.5 GB of RAM free at that time and lots of space in the filesystem.
I did some more tests: a 128MB file took less than 10 seconds to commit with osc, a 150MB file 28 minutes. The full wireshark trace shows data flow in the first 10 seconds, then only small TCP packets until a few seconds before the end, then some more packets, and the upload finished successfully. After this I decided to pull the latest apache updates onto the server, and the issue was gone. A 500MB file now also needs only 10 seconds to commit. (Note: I had rebooted the server several times before without any change in behavior.) So my problem is solved, but I still do not understand why it happened on a system that had run for half a year without any issue. Karsten
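Putting the reported timings side by side as effective throughput makes the cliff between 128 MB and 150 MB easier to see. This is only a rough sketch using the figures quoted in this thread; the "less than 10 seconds" and "~17 hours" values are approximations, so the rates are order-of-magnitude estimates, and the name `throughput_mb_per_s` is made up for illustration.

```python
# (label, size in MB, elapsed seconds); figures as reported in the thread.
MEASUREMENTS = [
    ("128 MB before the apache update", 128, 10),
    ("150 MB before the apache update", 150, 28 * 60),
    ("485 MB before the apache update", 485, 17 * 3600),
    ("500 MB after the apache update", 500, 10),
]


def throughput_mb_per_s(size_mb, seconds):
    """Effective rate of an upload, averaged over its whole duration."""
    return size_mb / seconds


for label, size_mb, seconds in MEASUREMENTS:
    rate = throughput_mb_per_s(size_mb, seconds)
    print(f"{label:<34} {seconds:>6} s  {rate:10.4f} MB/s")
```

The two slow uploads come out at well under 0.1 MB/s averaged over the run, against more than 10 MB/s for the fast ones, which fits the trace: a short burst of data followed by a long stall rather than a uniformly slow transfer.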

Hi Karsten, On 07.05.19 at 21:09, Karsten Keil wrote:
Hi,
I did some more tests: a 128MB file took less than 10 seconds to commit with osc, a 150MB file 28 minutes. The full wireshark trace shows data flow in the first 10 seconds, then only small TCP packets until a few seconds before the end, then some more packets, and the upload finished successfully.
Maybe related: After updating my OBS server's base system to SLES12-SP3 (at the same time doing a minor OBS update... not the smartest move), I had enormous performance problems with workers uploading big build results (KIWI images), on the order of "20 minutes to send back a 400MB image". I tried lots of things (updating the workers, config settings, ...). In the end, downgrading the kernel on the OBS server VM to the SLES12-SP2 kernel solved the issue. I locked the kernel and never tried updating it again ;-) -- Stefan Seyfried "For a successful technology, reality must take precedence over public relations, for nature cannot be fooled." -- Richard Feynman
participants (4)
- Henne Vogelsang
- Karsten Keil
- Keil, Karsten
- Stefan Seyfried