[opensuse-buildservice] OBS on Kubernetes
Dear OBS Maintainers,

I am running a deployment of OBS on a Kubernetes cluster. In particular, I am running a horizontal pod autoscaler that creates more worker pods as needed and removes them when they have been idle for long enough. This works reasonably well, but leaves a lot of 'dead' workers lingering on the build status page, since Kubernetes assigns new unique pod names every time a pod is created. Is there a way to configure OBS to assume a worker is not coming back if it has not been reachable for some time?

Best regards,
Moritz
On Friday, 31 January 2020, 14:40:58 CET, Moritz Röhrich wrote:
Dear OBS Maintainers,
I am running a deployment of OBS on a Kubernetes cluster. In particular, I am running a horizontal pod autoscaler that creates more worker pods as needed and removes them when they have been idle for long enough. This works reasonably well, but leaves a lot of 'dead' workers lingering on the build status page, since Kubernetes assigns new unique pod names every time a pod is created. Is there a way to configure OBS to assume a worker is not coming back if it has not been reachable for some time?
Not at the moment, since that would break the constraints concept a bit...

However, we are about to work on Kubernetes support as well. Sumit wrote a very first draft of how to run Kubernetes workers here:

https://github.com/openSUSE/obs-docu/commit/8e579528071c49df232aad46cdcfe259...

--
Adrian Schroeter <adrian@suse.de> Build Infrastructure Project Manager
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany (HRB 247165, AG München), Managing Director: Felix Imendörffer
Hi Adrian,

thanks for that. I was using a Deployment too, but I have since switched to defining the workers as a StatefulSet. This avoids the lingering-workers issue for the most part, because newly created worker pods no longer get a completely new name. To OBS, my pods now behave more like physical machines that shut off from time to time.

I still have a feature wish, however. Currently I am scaling based on CPU utilization, which is suboptimal for several reasons:

1) Pod scaling only happens once the first pod starts compute-intensive operations.
2) During less compute-intensive build stages, pods tend to be killed off.

My workaround has been to set the target CPU utilization to 5% to avoid problem 2, but that has the side effect that even if I build only one package, Kubernetes will create as many pods as the autoscaler allows, since the CPU utilization during compilation is >5%. The only way I see to avoid that is to use external metrics [1] for the autoscaler, but I have no idea how to accomplish that with OBS. It would be really nice to query the number of queued build jobs from Kubernetes to scale the number of worker pods.

I have attached my setup, if you are interested.

Best regards,
Moritz

[1] https://github.com/kubernetes/community/blob/master/contributors/design-prop...

On Fri, Jan 31, 2020 at 2:49 PM Adrian Schröter <adrian@suse.de> wrote:
Not at the moment, since that would break the constraints concept a bit...
However, we are about to work on Kubernetes support as well. Sumit wrote a very first draft of how to run Kubernetes workers here:
https://github.com/openSUSE/obs-docu/commit/8e579528071c49df232aad46cdcfe259...
--
Adrian Schroeter <adrian@suse.de> Build Infrastructure Project Manager
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany (HRB 247165, AG München), Managing Director: Felix Imendörffer
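
The scaling behaviour Moritz describes can be illustrated with a small sketch. It assumes the replica calculation documented for the Kubernetes horizontal pod autoscaler, ceil(currentReplicas * currentValue / targetValue), and ignores the autoscaler's tolerance and stabilization settings. The queue-based variant is hypothetical: the function names, the jobs_per_worker parameter, and the availability of a queued-build-jobs count as a metric are assumptions made for illustration, not something OBS is confirmed to provide in this thread.

```python
import math


def clamp(value, low, high):
    """Keep value within the autoscaler's min/max replica bounds."""
    return max(low, min(high, value))


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Utilization-style scaling: ceil(current * value / target), clamped.

    Simplified: tolerance and stabilization windows are ignored.
    """
    raw = math.ceil(current_replicas * current_value / target_value)
    return clamp(raw, min_replicas, max_replicas)


def replicas_for_queue(queued_jobs, jobs_per_worker=1,
                       min_replicas=1, max_replicas=10):
    """Hypothetical queue-based scaling: one worker per jobs_per_worker jobs.

    Assumes the number of queued build jobs is available as a metric,
    which is the open question in this thread.
    """
    raw = math.ceil(queued_jobs / jobs_per_worker)
    return clamp(raw, min_replicas, max_replicas)


if __name__ == "__main__":
    # One worker compiling at 80% CPU with a 5% target: the ratio is 16,
    # so the autoscaler goes straight to the maximum of 10 replicas.
    print(desired_replicas(1, 80, 5))   # 10

    # The same single build, scaled on queue length instead: one worker.
    print(replicas_for_queue(1))        # 1
```

With a 5% CPU target, any real compilation load pushes the ratio far above 1, which is why a single package build is enough to reach the replica limit. A metric that counts queued jobs would scale in proportion to the pending work instead.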
participants (2)
- Adrian Schröter
- Moritz Röhrich