(In reply to Michal Koutný from comment #3)
> (In reply to Fabian Vogt from comment #2)
> > All of them use practically identical systemd units and FWICT it's random
> > which of the containers fails in which way...
>
> To deal with the randomness, I'd suggest enabling debug logging of systemd
> and capturing the journal logs when the first issue occurs (a few periods
> back, where I understand period ~ a single container lifetime).
> Could you collect such data? (Or possibly just share the journal data that
> you have for the current instance, without debug level.)

I can try, but it's not easy to tell when it breaks, as we only find out that
it's broken when a test starts (a couple of times a day). So there's always a
window of a few hours. We could try to set up a "podman exec ... su -P" loop
or something. Should we focus on that, or on testing with cgroups v2?

> > I tried that (without the typo, "systemd.unified_cgroup_hierarchy=1").
>
> Sorry about that, it hits me all the time. `man systemd` is correct.
>
> > The kernel parameter is used, but it looks like cgroupv1 is still used by
> > systemd, at least for devices. Is that expected?
>
> No, that's suspicious. The device controller functionality is replaced with
> BPF programs in unified mode. (Isn't it still the typo? ':-))

I hope my last comment shows the correct name in /proc/cmdline...

> What does `grep cgroup /proc/mounts` say on such a system?
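For reference, a minimal sketch of such a probing loop. Everything here is an assumption for illustration: the container names, the probe command, and the interval are placeholders, not taken from this bug; the real loop would use the mentioned `podman exec ... su -P` invocation.

```shell
#!/bin/sh
# Hypothetical watchdog sketch to narrow the multi-hour detection window:
# probe each container periodically and log a timestamp as soon as a probe
# fails, so the failure can be correlated with the (debug-level) journal.
# PROBE, INTERVAL, and the container names are placeholders.
PROBE=${PROBE:-"podman exec"}   # real probe might be: podman exec <name> su -P -c true
INTERVAL=${INTERVAL:-60}

watch_once() {
    for name in "$@"; do
        if $PROBE "$name" true >/dev/null 2>&1; then
            printf '%s %s ok\n' "$(date -Is)" "$name"
        else
            printf '%s %s FAILED\n' "$(date -Is)" "$name"
        fi
    done
}

# One pass; for continuous monitoring something like:
#   while watch_once containerA containerB; do sleep "$INTERVAL"; done
watch_once containerA containerB
```

The timestamps would then bound "when it broke" to one probe interval instead of the current window of a few hours.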
Both hierarchies are mounted:

openqaworker1:~ # findmnt -R /sys
TARGET                              SOURCE     FSTYPE     OPTIONS
/sys                                sysfs      sysfs      rw,nosuid,nodev,noexec,relatime
├─/sys/kernel/security              securityfs securityfs rw,nosuid,nodev,noexec,relatime
├─/sys/fs/cgroup                    tmpfs      tmpfs      ro,nosuid,nodev,noexec,size=4096k,nr_inodes=1024,mode=755,inode64
│ ├─/sys/fs/cgroup/unified          cgroup2    cgroup2    rw,nosuid,nodev,noexec,relatime,nsdelegate
│ ├─/sys/fs/cgroup/systemd          cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,xattr,name=systemd
│ ├─/sys/fs/cgroup/cpuset           cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,cpuset
│ ├─/sys/fs/cgroup/cpu,cpuacct      cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,cpu,cpuacct
│ ├─/sys/fs/cgroup/freezer          cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,freezer
│ ├─/sys/fs/cgroup/blkio            cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,blkio
│ ├─/sys/fs/cgroup/memory           cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,memory
│ ├─/sys/fs/cgroup/pids             cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,pids
│ ├─/sys/fs/cgroup/net_cls,net_prio cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,net_cls,net_prio
│ ├─/sys/fs/cgroup/perf_event       cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,perf_event
│ ├─/sys/fs/cgroup/hugetlb          cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,hugetlb
│ ├─/sys/fs/cgroup/misc             cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,misc
│ ├─/sys/fs/cgroup/rdma             cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,rdma
│ └─/sys/fs/cgroup/devices          cgroup     cgroup     rw,nosuid,nodev,noexec,relatime,devices
├─/sys/fs/pstore                    pstore     pstore     rw,nosuid,nodev,noexec,relatime
├─/sys/fs/bpf                       none       bpf        rw,nosuid,nodev,noexec,relatime,mode=700
├─/sys/kernel/tracing               tracefs    tracefs    rw,nosuid,nodev,noexec,relatime
├─/sys/kernel/debug                 debugfs    debugfs    rw,nosuid,nodev,noexec,relatime
│ └─/sys/kernel/debug/tracing       tracefs    tracefs    rw,nosuid,nodev,noexec,relatime
├─/sys/fs/fuse/connections          fusectl    fusectl    rw,nosuid,nodev,noexec,relatime
└─/sys/kernel/config                configfs   configfs   rw,nosuid,nodev,noexec,relatime

> > Also, this is not the default, so IMO even if it works with unified it
> > should still be fixed with cgv1 or the default changed.
>
> Understood.
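As a quick cross-check of what the findmnt output shows (a sketch, not part of the report): the filesystem type of /sys/fs/cgroup distinguishes a unified (v2-only) host, where it is cgroup2fs, from a hybrid or legacy one like the above, where it is a tmpfs holding per-controller cgroup v1 mounts.

```shell
#!/bin/sh
# Sketch: classify the host's cgroup setup from the mount type of
# /sys/fs/cgroup. cgroup2fs => unified (cgroup v2 only); tmpfs with
# per-controller cgroup v1 mounts underneath => hybrid or legacy,
# which matches the findmnt output above.
fstype=$(stat -fc %T /sys/fs/cgroup)
case "$fstype" in
    cgroup2fs) mode="unified (cgroup v2 only)" ;;
    tmpfs)     mode="hybrid or legacy (cgroup v1 controllers mounted)" ;;
    *)         mode="unknown ($fstype)" ;;
esac
echo "cgroup mode: $mode"
```

On a host where systemd.unified_cgroup_hierarchy=1 actually took effect, this should report "unified", with no v1 devices controller mounted.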