Fabian Vogt changed bug 1202821
What  | Removed                   | Added
Flags | needinfo?(fvogt@suse.com) |

Comment # 15 on bug 1202821 from
(In reply to Michal Koutný from comment #13)
> v2 case:
> 
> (I looked at the affected worker machine)
> 
> strace of `su -P`:
> > 19830 ioctl(3, TIOCSPTLCK, [0] <unfinished ...>
> > 19830 <... ioctl resumed>)              = 0
> > 19830 ioctl(3, TCGETS <unfinished ...>
> > 19830 <... ioctl resumed>, {B38400 opost isig icanon echo ...}) = 0
> > 19830 ioctl(3, TIOCGPTN <unfinished ...>
> > 19830 <... ioctl resumed>, [4])         = 0
> > 19830 stat("/dev/pts/4",  <unfinished ...>
> > 19830 <... stat resumed>{st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x4), ...}) = 0
> > 19830 openat(AT_FDCWD, "/dev/pts/4", O_RDWR|O_NOCTTY <unfinished ...>
> > 19830 <... openat resumed>)             = -1 EPERM (Operation not permitted)
> 
> this command is run in the context of container .scope unit:
> > /machine.slice/libpod-bb7f7bed785fed4e77244d316cea6ee21ba9e6a26bb609f02c894430f72beb3eb.scope
> 
> That scope specifies, among other things:
> > ...c894430f72beb3eb.scope.d/50-DeviceAllow.conf
> > [Scope]
> > DeviceAllow=
> > DeviceAllow=/dev/char/10:200 rwm
> > DeviceAllow=/dev/char/5:2 rwm
> > DeviceAllow=/dev/char/5:0 rwm
> > DeviceAllow=/dev/char/1:9 rwm
> > DeviceAllow=/dev/char/1:8 rwm
> > DeviceAllow=/dev/char/1:7 rwm
> > DeviceAllow=/dev/char/1:5 rwm
> > DeviceAllow=/dev/char/1:3 rwm
> 
> and
> 
> > ...c894430f72beb3eb.scope.d/50-DevicePolicy.conf
> > # /run/systemd/transient/libpod-bb7f7bed785fed4e77244d316cea6ee21ba9e6a26b609f0>
> > # This is a drop-in unit file extension, created via "systemctl set-property"
> > # or an equivalent operation. Do not edit.
> > [Scope]
> > DevicePolicy=strict
> 
> IOW, the unit is configured (by podman [1]) in such a way that it allows only
> the listed devices; 136:4 (/dev/pts/4) is not among them.
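
For illustration, the check described above can be sketched in Python. This is only a rough model, not how systemd or the kernel evaluates the rules: the helper names (`parse_device_allow`, `is_allowed`) are made up for this sketch, the access flags (`rwm`) are ignored, and only the `/dev/char/MAJOR:MINOR` form seen in the drop-in is handled. The device number comes from the `makedev(0x88, 0x4)` in the strace output.

```python
import re

# DeviceAllow= entries copied from the 50-DeviceAllow.conf drop-in quoted above.
DROP_IN = """\
[Scope]
DeviceAllow=
DeviceAllow=/dev/char/10:200 rwm
DeviceAllow=/dev/char/5:2 rwm
DeviceAllow=/dev/char/5:0 rwm
DeviceAllow=/dev/char/1:9 rwm
DeviceAllow=/dev/char/1:8 rwm
DeviceAllow=/dev/char/1:7 rwm
DeviceAllow=/dev/char/1:5 rwm
DeviceAllow=/dev/char/1:3 rwm
"""


def parse_device_allow(text):
    """Collect (kind, major, minor) tuples from DeviceAllow= lines.

    Hypothetical helper for this sketch; access flags are ignored.
    """
    allowed = set()
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("DeviceAllow="):
            continue
        value = line[len("DeviceAllow="):].strip()
        if not value:
            # Assumption: an empty assignment is treated as resetting the list.
            allowed.clear()
            continue
        m = re.match(r"/dev/(char|block)/(\d+):(\d+)\s+[rwm]+$", value)
        if m:
            kind, major, minor = m.groups()
            allowed.add((kind, int(major), int(minor)))
    return allowed


def is_allowed(allowed, kind, major, minor):
    """With DevicePolicy=strict, only explicitly listed devices pass."""
    return (kind, major, minor) in allowed


ALLOWED = parse_device_allow(DROP_IN)

# st_rdev=makedev(0x88, 0x4) from the strace output, i.e. char device 136:4.
PTS = ("char", 0x88, 0x4)

if __name__ == "__main__":
    print("136:4 allowed:", is_allowed(ALLOWED, *PTS))
    print("1:3 allowed:", is_allowed(ALLOWED, "char", 1, 3))
```

Under this simplified model, /dev/pts/4 (136:4) is rejected while /dev/null (1:3) is accepted, which matches the EPERM seen once the rules are actually applied.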

I assume this libpod scope is created by podman's systemd cgroup controller?

> The bug here is rather inverse, the BPF rules are not properly applied until
> `systemctl daemon-reload` is invoked.

The question is whether the bug is that the scope is too restrictive or that
podman's own default is too lenient. I don't know where the default set of
allowed device nodes is currently specified.

> (I guess it might be related to the fact that .scope creation is run
> "concurrently" with ExecStart= of the service.)

The issue is reproducible even when using "podman start" manually instead of
"systemctl start container-openqaworker1_container_101.service".

> [1] The comment about `systemctl set-property` is slightly misleading as it
> means the properties were defined via DBus API.
> 
> v1 case:
> 
> I believe, it's similar (wrt device access, not non-existent cgroup). The
> device controller strict rules aren't applied until something causes systemd
> to re-realize cgroup settings (like daemon-reload) and then `su -P` fails.
> 
> ---
> 
> So, you (containers/openqa) may want to check why libpod scopes have strict
> device policy and me (systemd, +cc systemd-maintainers) may want to check
> why device rules are not properly applied.

Yep, I'll try to have a look.

(In reply to Michal Koutný from comment #14)
> When the system is in the state that allows `podman exec -i $cont su -P`,
> could you please collect `systemd-analyze dump`? (I'm interested in the
> section of the respective libpod-*.scope, machine.slice and -.slice.)

Attachment incoming. Container 101 is working; the others are broken.

FTR, you can easily get back into the working state with
"systemctl restart container-openqaworker1_container_101.service".

