Re: [opensuse-buildservice] /monitor page shows wrong info about workers
On Monday, 19 March 2012, 17:19:40, Ed Bartosh wrote:
Hi,
I added two new worker hosts and stopped the obs-worker service on the old host several days ago. However, the OBS web UI /monitor page still shows my old host and does not show the two new ones. Builds are running just fine, which means that the OBS server knows about the new workers and uses them.
Any idea how to fix this? Or at least where to look in the code?
obsapidelayed job is not running/working.
I am using OBS 2.1.16 on openSUSE 11.4.
Regards, Ed
-- Adrian Schroeter SUSE Linux Products GmbH email: adrian@suse.de -- To unsubscribe, e-mail: opensuse-buildservice+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-buildservice+owner@opensuse.org
On 3/19/2012 at 08:58 PM, Adrian Schröter <adrian@suse.de> wrote:
obsapidelayed job is not running/working.
Or if you're running memcached, restart it.

I found the same problem when I upgraded my local OBS instance from 2.0.5 to 2.3. The /monitor page would show that builds were running for more than 32+ hrs, whereas, on the backend, no real jobs were running - /srv/obs/jobs/<arch>/ was empty.

I later found out that in /srv/www/obs/api/app/controllers/status_controller.rb, "workerstatus" is written to cache without any expiry timeout.

To solve this, in my local instance I changed the following code:

$ diff -u /tmp/status_controller.rb /srv/www/obs/api/app/controllers/status_controller.rb
--- /tmp/status_controller.rb 2012-03-19 21:18:13.000000000 +0530
+++ /srv/www/obs/api/app/controllers/status_controller.rb 2012-01-18 07:13:33.000000000 +0530
@@ -126,7 +126,7 @@
 data=REXML::Document.new(ret)
 mytime = Time.now.to_i
- Rails.cache.write('workerstatus', ret)
+ Rails.cache.write('workerstatus', ret, :expires_in => 20.seconds)
 data.root.each_element('blocked') do |e|
 line = StatusHistory.new
 line.time = mytime

Things have been better since this patch.

Hope this helps,
Srinidhi.

PS: there is one more Rails.cache.write in helpers/status_helper.rb without any :expires_in field. I haven't changed it because I haven't seen any problems so far with querying jobhistory.
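To illustrate why a cache entry written without an expiry can keep serving stale worker status, here is a minimal sketch in plain Ruby. It is not OBS or Rails code: `TinyCache` and its `write`/`read` methods are hypothetical names made up for this example, and the `now:` parameter exists only so the clock can be controlled without sleeping.

```ruby
# Hypothetical in-memory cache, for illustration only (not OBS or Rails code).
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @entries = {}
  end

  # expires_in is a lifetime in seconds; nil means the entry never expires.
  def write(key, value, expires_in: nil, now: Time.now)
    expires_at = expires_in ? now + expires_in : nil
    @entries[key] = Entry.new(value, expires_at)
    true
  end

  # Returns the cached value, or nil if the key is missing or has expired.
  def read(key, now: Time.now)
    entry = @entries[key]
    return nil if entry.nil?

    if entry.expires_at && now >= entry.expires_at
      @entries.delete(key)
      return nil
    end
    entry.value
  end
end

cache = TinyCache.new
t0 = Time.at(0)

# One entry is written without an expiry, one with a 20-second lifetime.
cache.write('without_expiry', '<workerstatus/>', now: t0)
cache.write('with_expiry', '<workerstatus/>', expires_in: 20, now: t0)

one_day_later = t0 + 24 * 60 * 60
stale = cache.read('without_expiry', now: one_day_later) # old value is still served
fresh = cache.read('with_expiry', now: one_day_later)    # nil, so the caller must refetch
```

In this sketch the entry without an expiry is returned unchanged a day later, which matches the symptom described above: nothing replaces the old value unless some other process overwrites it or the cache is restarted.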
On Monday, 19 March 2012, 10:01:32, Srinidhi B wrote:
I later found out that in /srv/www/obs/api/app/controllers/status_controller.rb, "workerstatus" is written to cache without any expiry timeout.
To solve this, in my local instance I changed the following code:
I think your problem is also the delayed job not running. It feeds the cache, so it is not necessary for the webui to do it, which can create quite some load if many people do it in parallel.

bye
adrian
On 20.03.2012 09:22, Adrian Schröter wrote:
I think your problem is as well the not running delayed job. It feeds the cache, so it is not necessary for the webui to do it. Which can create quite some load if many people do it in parallel.
That's not exactly the reasoning. The delayed job makes the statistics graphs work: without the minute tick from outside, you can't have a plot over time. And as this statistics tick puts the worker status in the cache anyway, there is no need to have the webui bother the backend on its own.

I committed a similar patch now, though, so that people will get build jobs at least once in a while if they do not care for the plots.

Greetings, Stephan
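The interplay described here, a periodic tick that refreshes the cached worker status plus a fallback that only queries the backend when the entry has expired, can be sketched as follows. This is a plain-Ruby illustration under stated assumptions, not the committed OBS patch: `StatusCache`, `refresh`, `fetch`, and `query_backend` are hypothetical names, and the 180-second lifetime is taken from the 3-minute expiry mentioned elsewhere in this thread.

```ruby
# Hypothetical sketch, not OBS code: all names and the structure are assumptions.
CACHE_TTL = 180 # seconds; assumed from the 3-minute expiry mentioned in the thread

class StatusCache
  def initialize
    @store = {}
  end

  # Called by the periodic job: always overwrites the entry with a fresh value.
  def refresh(key, value, now)
    @store[key] = [value, now + CACHE_TTL]
  end

  # Called by the web UI: returns the cached value while it is fresh and only
  # runs the block (a backend query) when the entry is missing or expired.
  def fetch(key, now)
    value, expires_at = @store[key]
    return value if expires_at && now < expires_at

    refresh(key, yield, now)
    @store[key].first
  end
end

# Stand-in for a backend request; counts how often it is called.
backend_calls = 0
query_backend = lambda do
  backend_calls += 1
  "<workerstatus call='#{backend_calls}'/>"
end

cache = StatusCache.new
t0 = Time.at(0)

first  = cache.fetch('workerstatus', t0) { query_backend.call }       # miss: asks the backend
second = cache.fetch('workerstatus', t0 + 60) { query_backend.call }  # hit: served from cache
third  = cache.fetch('workerstatus', t0 + 200) { query_backend.call } # expired: asks the backend again
```

With a periodic job calling `refresh` every minute, `fetch` would never find an expired entry, so page views would not reach the backend at all. Without the job, the expiry bounds how old the displayed status can get.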
On 3/20/2012 at 02:58 PM, Stephan Kulow <coolo@suse.de> wrote:
I committed a similar patch now, though, so that people will get build jobs at least once in a while if they do not care for the plots.
Greetings, Stephan
Thank you! 3 minutes sounds like a reasonable time to expire the cache entry. I chose 20 seconds for a more "live" report when I run "osc api /build/_workerstatus" directly instead of the webui :)

Srinidhi.
Hi,
I think your problem is as well the not running delayed job. It feeds the cache, so it is not necessary for the webui to do it. Which can create quite some load if many people do it in parallel.
If the monitor can possibly show incorrect info without the delayed job, why isn't it running in a default OBS installation then?

Regards, Ed
On Tuesday, 20 March 2012, 12:35:36, Ed Bartosh wrote:
If the monitor can possibly show incorrect info without the delayed job, why isn't it running in a default OBS installation then?
It should run in our appliances by default.

-- Adrian Schroeter SUSE Linux Products GmbH email: adrian@suse.de
obsapidelayed job is not running/working.
Or if you're running memcached, restart it.
I later found out that in /srv/www/obs/api/app/controllers/status_controller.rb, "workerstatus" is written to cache without any expiry timeout.
Restarting memcached solved the issue. Thank you!

Just out of curiosity: I checked OBS git master and your change is not there. Did you send it to the OBS developers?

Regards, Ed
Hi,
obsapidelayed job is not running/working.
Thanks. Where should it be running? On the worker or the server machine?

Regards, Ed
participants (4)
- Adrian Schröter
- Ed Bartosh
- Srinidhi B
- Stephan Kulow