Hi,

Today we had an outage of the Yast Jenkins node because of "No space left on device". This is a virtual machine maintained by a team member (including the scripts around it). It's not the first time this node has died for some reason.

https://ci.opensuse.org/computer/yast-jenkins/executors/0/causeOfDeath

Thread has died
java.io.IOException: No space left on device

As this is core infrastructure of the Yast team and everyone depends on it (we got used to it, and OBS/IBS also expects SRs to be done this way), I would like to change the current rules as follows:

1. This is core infrastructure, so it should run on a Yast-team-owned server (we have hardware ready in the server room).

2. Any (core) infrastructure, including scripts, should be owned by the Yast team.

3. Everything needs to be on GitHub, following the same (but stricter) rules for merging changes (continuous deployment from GitHub?). For instance, I'd recommend requiring two different LGTMs instead of one, as it is now.

4. Everything needs to be documented well enough that "everyone" in the team is able to recover from such an error, or even start a new node if something goes really bad.

5. Everyone in the team needs to have root access (e.g., via ssh keys), but should not change anything they do not understand (hence the documentation).

6. We need monitoring. Maybe the internal IT guys can do that for us, and they might have access to that system for urgent cases as well.

All this doesn't need to be changed "right now". One little step at a time. Then another.

Thanks in advance
Lukas

-- 
Lukas Ocilka, Systems Management (Yast) Team Leader
Cloud & Systems Management Department, SUSE Linux
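
P.S. For the monitoring point (6), a check along these lines could be a starting point. This is only a minimal sketch, assuming Python 3 is available on the node. The function names and the 90% threshold are made up for illustration and are not part of our current setup.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of used space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a total size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total


def check_disk(path="/", warn_at=90.0):
    """Return (ok, message); ok is False once usage reaches warn_at percent.

    The 90% default is an assumed threshold, not a value we agreed on.
    """
    percent = disk_usage_percent(path)
    ok = percent < warn_at
    state = "OK" if ok else "WARNING"
    message = f"{state}: {path} is {percent:.1f}% full (threshold {warn_at:.0f}%)"
    return ok, message
```

Calling `check_disk("/")` from a cron job and sending the message by mail whenever `ok` is False would at least warn us before the node dies again. A proper monitoring system would still be the better long-term answer.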