Dear OBS Devs,

I was wondering if there is a way to collect metrics from OBS. Of particular interest to me are build queue length, number of in-progress package builds and regular CPU/memory usage metrics.

In my use case, I want to use the metrics to let Kubernetes determine automatically how many workers to deploy. This almost works with the limited metrics support that I already have from Kubernetes itself, but there are some pitfalls and edge cases where Kubernetes removes workers before they are fully done building a package.

My current situation is this: I have a StatefulSet for the workers and a HorizontalPodAutoscaler which scales up the StatefulSet once CPU usage (as determined by Kubernetes itself) exceeds the threshold, i.e. once workers are no longer idle. Due to long periods of IO-bound operations where CPU usage may fall below the threshold, the autoscaler needs to delay scale-down operations by at least 30 minutes. This is quite hacky and only really works as long as the build time of the packages does not significantly exceed 30 minutes.

The downsides are that scaling only happens once CPU usage increases, which is usually quite late in the package building process, and that workers may be killed too early if CPU usage at the end of the build process is too low for too long.

If I had the build queue length and the number of in-progress package builds as scrapeable metrics, I could use them to let Kubernetes determine the right number of workers more reliably.

Best regards,
Moritz Röhrich

--
Quobyte GmbH, Berlin, AG Charlottenburg HRB 149012 B, Felix Hupfeld, Bjoern Kolbeck
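The scaling idea in the message above can be sketched as a small calculation. This is only an illustration under assumptions that are not in the original message: one worker handles one build at a time, and the function name `desired_workers` and the `min_workers`/`max_workers` bounds are hypothetical.

```python
def desired_workers(queue_length, in_progress, min_workers, max_workers):
    """Return a worker count derived from queue and in-progress metrics.

    Assumption (not from the thread): each worker runs one build at a time,
    so the demand is simply queued builds plus builds already running.
    """
    demand = queue_length + in_progress
    # Clamp the demand to the configured bounds.
    return max(min_workers, min(max_workers, demand))


if __name__ == "__main__":
    # Five queued builds, two running, bounds 1..10.
    print(desired_workers(5, 2, min_workers=1, max_workers=10))
```

Because the count never drops below the number of in-progress builds (unless `max_workers` is lower), a scale-down would at least not shrink the pool below current demand. Whether a specific busy pod is spared still depends on how the StatefulSet picks pods to remove, which this sketch does not address.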
Hey Moritz,

On 25.06.21 09:39, Moritz Röhrich wrote:
> I was wondering if there is a way to collect metrics from OBS. Of particular interest to me are build queue length, number of in-progress package builds

We emit various metrics in various ways.
Most simply, the OBS API exposes various metrics about workers. See the API documentation for the /worker/status route:

https://build.opensuse.org/apidocs/index#117

Then there are performance metrics for the Ruby on Rails app (what we call the OBS frontend). You can configure those with the `influxdb_*` settings in /srv/www/obs/api/config/options.yml. We are making use of:

- https://github.com/influxdata/influxdb-rails to report
- https://github.com/influxdata/influxdb as time series DB

You can see examples of those metrics visualized with Grafana at:

https://obs-measure.opensuse.org/d/1hfbqvOMz/overview?orgId=1&refresh=5m

The Ruby on Rails app also pushes out further metrics onto an AMQP message bus. You can configure this with the `amqp_*` settings in the same options.yml file. At openSUSE we are making use of:

- https://rabbit.opensuse.org/ as AMQP bus
- https://github.com/influxdata/telegraf reading from this bus
- https://github.com/influxdata/influxdb as time series DB

You can see examples of those metrics visualized with Grafana at:

https://obs-measure.opensuse.org/d/CD0k0sFMz/builds?orgId=1&var-Architecture...
> and regular CPU/memory usage metrics.

We would consider this the job of a general system reporter. At openSUSE we use Icinga or Telegraf for this, in principle the same way as above.

So all in all this is already there; you just have to configure and use it in k8s. Let us know how it goes :-)

Henne

--
Henne Vogelsang
http://www.opensuse.org
Everybody has a plan, until they get hit.
- Mike Tyson
Hey,

On 25.06.21 14:36, Henne Vogelsang wrote:
> The Ruby on Rails app also pushes out further metrics onto an AMQP message bus.

Sorry, I forgot to mention that this is documented in the OBS Admin book:

https://openbuildservice.org/help/manuals/obs-admin-guide/obs.cha.administra...

Henne
https://obs-grafana.opensuse.org/goto/Au8ignz7z

might also be worth a look.

With kind regards,
Lars

--
Lars Vogdt <Lars.Vogdt@suse.com> - BuildOPS Engineering Team Lead -
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer (HRB 36809, AG Nürnberg)
Hi Henne, hi Lars,

thanks for the detailed reply. This looks promising, although it is in a different format than what I was hoping for, so I will probably need to do some processing before this works.

The API in particular should give me the right metrics, but it would result in quite a long pipeline from the API to the autoscaler:

OBS API --> XML-to-Prometheus exporter --> Prometheus --> prometheus-to-k8s-api-adapter --> HPA

On the other hand, that would allow me to avoid screwing around with a message bus.

Let's see. If I ever get this autoscaler to work satisfactorily, I'll let you know.

Best regards,
Moritz
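The "XML to Prometheus exporter" step of the pipeline above could look roughly like the following sketch. It is not an existing OBS tool. The element names (`idle`, `building`, `waiting`, `blocked`) and the `jobs` attribute are assumptions about the /worker/status response and should be checked against the API documentation; the sample document and the `obs_*` metric names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response. The element and attribute names are an
# assumption about /worker/status, not copied from the OBS API docs.
SAMPLE = """<workerstatus clients="3">
  <idle workerid="w1" hostarch="x86_64"/>
  <building workerid="w2" hostarch="x86_64" package="a" arch="x86_64"/>
  <building workerid="w3" hostarch="x86_64" package="b" arch="x86_64"/>
  <waiting arch="x86_64" jobs="5"/>
  <waiting arch="i586" jobs="2"/>
  <blocked arch="x86_64" jobs="1"/>
</workerstatus>"""


def sum_jobs(root, tag):
    """Sum the `jobs` attribute over all direct children named `tag`."""
    return sum(int(elem.get("jobs", "0")) for elem in root.findall(tag))


def worker_status_to_metrics(xml_text):
    """Turn a worker status document into a dict of gauge values."""
    root = ET.fromstring(xml_text)
    return {
        "obs_workers_idle": len(root.findall("idle")),
        "obs_workers_building": len(root.findall("building")),
        "obs_jobs_waiting": sum_jobs(root, "waiting"),
        "obs_jobs_blocked": sum_jobs(root, "blocked"),
    }


def render_prometheus(metrics):
    """Render the metrics in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_prometheus(worker_status_to_metrics(SAMPLE)), end="")
```

A real exporter would fetch the document from the OBS API with authentication and serve the rendered text over HTTP for Prometheus to scrape. Both parts are left out here to keep the parsing and rendering step self-contained.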