[opensuse-buildservice] OBS on Kubernetes
Dear OBS Maintainers,

I am running a deployment of OBS on a Kubernetes cluster. In particular, I am running a horizontal pod autoscaler that creates more worker pods as needed and removes them when they have been idle for long enough. This works reasonably well, but it leaves a lot of 'dead' workers lingering in the build status page, since Kubernetes assigns a new unique pod name every time a pod is created. Is there a way to configure OBS to assume a worker is not coming back if it has not been reachable for some time?

Best regards,
Moritz
On Freitag, 31. Januar 2020, 14:40:58 CET Moritz Röhrich wrote:
Dear OBS Maintainers,
I am running a deployment of OBS on a Kubernetes cluster. In particular, I am running a horizontal pod autoscaler that creates more worker pods as needed and removes them when they have been idle for long enough. This works reasonably well, but it leaves a lot of 'dead' workers lingering in the build status page, since Kubernetes assigns a new unique pod name every time a pod is created. Is there a way to configure OBS to assume a worker is not coming back if it has not been reachable for some time?
Not at the moment, since that would break the constraints concept a bit...
However, we are about to work on Kubernetes support as well. Sumit wrote a
very first draft on how to run Kubernetes workers here:
https://github.com/openSUSE/obs-docu/commit/8e579528071c49df232aad46cdcfe259...
--
Adrian Schroeter
Hi Adrian,
thanks for that. I was using a Deployment too, but I have since switched
to defining the workers as a StatefulSet. This avoids the lingering-workers
issue for the most part, because new worker pods don't get a completely
new name when they are created. To OBS, my pods now behave more like
physical machines that shut off from time to time.
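For reference, the relevant part of such a worker StatefulSet might look roughly like this (a sketch only; the names, labels, and image are placeholders, not taken from my actual setup):

```yaml
# Sketch: a StatefulSet gives pods stable names (obs-worker-0,
# obs-worker-1, ...), so a restarted worker reappears under the same
# name in the OBS build status page instead of as a new 'dead' entry.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: obs-worker
spec:
  serviceName: obs-worker        # headless service giving stable network identity
  replicas: 2
  selector:
    matchLabels:
      app: obs-worker
  template:
    metadata:
      labels:
        app: obs-worker
    spec:
      containers:
        - name: worker
          image: registry.example.org/obs-worker:latest  # placeholder image
```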
I still have a feature wish, however. Currently I am scaling based on
CPU utilization, which is suboptimal for two reasons:
1) pod scaling only happens once the first pod starts compute-intensive operations
2) during less compute-intensive build stages, pods tend to be killed off
My workaround has been to set the target CPU utilization to 5% to avoid
problem 2, but that has the side effect that even if I build only one
package, Kubernetes will create as many pods as the autoscaler allows,
since the CPU utilization during compilation is >5%.
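For illustration, the 5% workaround corresponds to an HPA roughly like the following (a sketch only; the API version depends on your cluster version, and the resource names are placeholders):

```yaml
# Sketch: scaling the worker StatefulSet on average CPU utilization.
# A 5% target keeps pods alive through less compute-intensive build
# stages, but also scales up to maxReplicas even for a single build,
# since compilation easily exceeds 5% CPU.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: obs-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: obs-worker
  minReplicas: 1
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 5
```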
The only way I see to avoid that is to use external metrics [1] for the
autoscaler, but I don't know how to accomplish that with OBS. It would be
really nice to be able to query the number of queued build jobs from
Kubernetes to scale the number of worker pods.
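One way to feed such a metric to the autoscaler might be a small exporter that polls the OBS backend and publishes the queue depth. This is a sketch only: the endpoint path /build/_workerstatus and the XML layout with <waiting arch="..." jobs="..."/> elements are assumptions about the backend API and should be verified against your OBS instance.

```python
# Sketch: poll the OBS worker status API and compute the number of
# queued build jobs per architecture. Endpoint path and XML layout
# are assumptions -- verify them against your OBS version.
import urllib.request
import xml.etree.ElementTree as ET


def parse_waiting(xml_text) -> dict:
    """Map architecture -> number of queued build jobs from worker status XML."""
    root = ET.fromstring(xml_text)
    return {
        w.get("arch"): int(w.get("jobs", "0"))
        for w in root.findall("waiting")
    }


def fetch_waiting(api_url: str) -> dict:
    """Fetch the worker status document and parse the queue depth."""
    with urllib.request.urlopen(f"{api_url}/build/_workerstatus") as resp:
        return parse_waiting(resp.read())

# Example: expose sum(fetch_waiting(...).values()) through a
# custom-metrics adapter so the HPA scales worker pods on queue
# depth instead of CPU utilization.
```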
I have attached my setup, if you are interested.
Best regards,
Moritz
[1] https://github.com/kubernetes/community/blob/master/contributors/design-prop...
On Fri, Jan 31, 2020 at 2:49 PM Adrian Schröter wrote:
--
Adrian Schroeter
Build Infrastructure Project Manager
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
(HRB 247165, AG München), Geschäftsführer: Felix Imendörffer
participants (2)
- Adrian Schröter
- Moritz Röhrich