[kubic-bugs] [Bug 1095131] New: kubelet service (1.10.2) fails to start: failed to get device for dir "/var/lib/kubelet"
http://bugzilla.suse.com/show_bug.cgi?id=1095131

Bug ID: 1095131
Summary: kubelet service (1.10.2) fails to start: failed to get device for dir "/var/lib/kubelet"
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kubic
Assignee: kubic-bugs@opensuse.org
Reporter: mmeister@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36
Build Identifier:

May 30 04:56:24 admin hyperkube[2766]: F0530 04:56:24.779150 2766 kubelet.go:1354] Failed to start ContainerManager failed to get rootfs info: failed to get device for dir "/var/lib/kubelet": could not find device with major: 0, minor: 46 in cached partitions map

kubelet won't start with kubernetes 1.10.2

it might be a regression of https://github.com/google/cadvisor/pull/1668 ?

k8s is not yet updated officially, i have used:
https://build.opensuse.org/package/show/home:m_meister:branches:devel:CaaSP:...

tested on the Stack-hardware image, which comes with the additional 9p kernel modules and k8s preinstalled:
https://download.opensuse.org/repositories/devel:/CaaSP:/images/images/openS...

some previous discussion can also be found in https://bugzilla.suse.com/show_bug.cgi?id=1084766

Reproducible: Always

--
You are receiving this mail because:
You are the assignee for the bug.
http://bugzilla.suse.com/show_bug.cgi?id=1095131#c1

Maximilian Meister <mmeister@suse.com> changed:

What   |Removed |Added
----------------------------------------------------------------------------
Status |NEW     |CONFIRMED
CC     |        |rbrown@suse.com
Flags  |        |needinfo?(rbrown@suse.com)

--- Comment #1 from Maximilian Meister <mmeister@suse.com> ---
i've run the conformance tests for 1.10.4 [0] (SLE-based environment with updated cri-o and crio-tools) and they were green, which makes me wonder why this only happens on kubic.

i skimmed through the k8s changelogs but couldn't find any meaningful entry about something having fixed this issue

@richard any idea what could be the difference here? or should we test it again on kubic with 1.10.4? last test was done with 1.10.3 IIRC

also feel free to adapt the priority of the bug

[0] http://jenkins.caasp.suse.net/job/caasp-manual-sandbox/job/master/60/
http://bugzilla.suse.com/show_bug.cgi?id=1095131#c2

Thorsten Kukuk <kukuk@suse.com> changed:

What   |Removed                    |Added
----------------------------------------------------------------------------
Flags  |needinfo?(rbrown@suse.com) |

--- Comment #2 from Thorsten Kukuk <kukuk@suse.com> ---
(In reply to Maximilian Meister from comment #1)
> @richard any idea what could be the difference here? or should we test it
> again on kubic with 1.10.4? last test was done with 1.10.3 IIRC

SLE12 SP3 (CaaSP until v3) has /var/lib/kubelet as subvolume
Kubic (CaaSP from v4) has /var as subvolume and /var/lib/kubelet is a directory inside this subvolume.

I bet that this is what confuses kubernetes.
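
To see which of the two layouts a given host has, one can walk up from /var/lib/kubelet to the nearest mount point. The following is a minimal Python sketch of that check, not anything taken from kubelet or cadvisor; the helper names `nearest_mount_point` and `is_own_mount` are made up for illustration. Note that `os.path.ismount` compares device IDs with the parent directory, so on btrfs it may also report a subvolume boundary that is not mounted separately.

```python
import os


def nearest_mount_point(path: str) -> str:
    """Walk up from path until os.path.ismount() reports a boundary.

    Hypothetical helper for illustration only. The loop always ends,
    because "/" is reported as a mount point.
    """
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        current = os.path.dirname(current)
    return current


def is_own_mount(path: str) -> bool:
    """True if path itself is a mount point, not a plain directory inside one."""
    return nearest_mount_point(path) == os.path.realpath(path)


if __name__ == "__main__":
    target = "/var/lib/kubelet"
    print(target, "->", nearest_mount_point(target))
    print("own mount point:", is_own_mount(target))
```

Going by the layouts described in comment #2, one would expect the nearest mount point to be /var/lib/kubelet itself on SLE12 SP3 and /var on Kubic; this has not been verified here.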
http://bugzilla.suse.com/show_bug.cgi?id=1095131#c3

Richard Brown <rbrown@suse.com> changed:

What     |Removed   |Added
----------------------------------------------------------------------------
Priority |P5 - None |P2 - High
CC       |          |aherzig@suse.com
Severity |Normal    |Major

--- Comment #3 from Richard Brown <rbrown@suse.com> ---
(In reply to Thorsten Kukuk from comment #2)
> (In reply to Maximilian Meister from comment #1)
> > @richard any idea what could be the difference here? or should we test it
> > again on kubic with 1.10.4? last test was done with 1.10.3 IIRC
>
> SLE12 SP3 (CaaSP until v3) has /var/lib/kubelet as subvolume
> Kubic (CaaSP from v4) has /var as subvolume and /var/lib/kubelet is a
> directory inside this subvolume.
>
> I bet that this is what confuses kubernetes.

Indeed - my guesstimate is that https://github.com/google/cadvisor/pull/1668 only works if /var/lib/kubelet is its own subvolume.

It's only a guesstimate because I really don't understand how Go's 'stat' works, so I'm a little blind as to how that fix worked in the past.

But one thing we can say for sure is that it doesn't work on Kubic, and the difference in the subvolume layout is the biggest change that is likely to trigger any difference in logic for volume/partition ID detection.

That change isn't just present in Kubic - we can expect similar behaviour in any SLE 15 based CaaSP also (e.g. CaaSP v4).

So I'd recommend running any conformance tests for 1.10.x against both SLE 12/CaaSP v3 and SLE 15/Kubic/CaaSP v4 - assuming both are being targeted for k8s 1.10 releases.

Bumping up the severity and priority on the grounds that Kubic/CaaSP v4 without kubernetes is as useful as a submarine with a sunroof or an inflatable dartboard ;)

I'd recommend the bug be considered equally important for CaaSP v4 until it's proven that it doesn't exist there.
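
The error in the original report says that a device with major 0, minor 46 could not be found in a cached partitions map. As a rough illustration of that kind of lookup (this is a Python sketch, not cadvisor's actual Go code), the snippet below stats a directory, splits `st_dev` into major and minor numbers, and looks the pair up in a map built from `/proc/self/mountinfo`-style text. The function names and the `SAMPLE` mount table are invented for the example; only the mountinfo field order (third field `major:minor`, fifth field mount point) follows proc(5).

```python
import os


def parse_mountinfo(text):
    """Build {(major, minor): mount_point} from mountinfo-formatted text."""
    partitions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        major, minor = fields[2].split(":")
        # Keep the first mount point seen for each device number.
        partitions.setdefault((int(major), int(minor)), fields[4])
    return partitions


def device_of(path):
    """Return the (major, minor) pair that stat() reports for path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def find_mount(partitions, major, minor):
    """Look up a device; fail the way the log message in this bug does."""
    try:
        return partitions[(major, minor)]
    except KeyError:
        raise LookupError(
            f"could not find device with major: {major}, minor: {minor}"
        ) from None


# Invented sample data: two mounts that share one device number.
SAMPLE = """\
25 1 0:42 / / rw,relatime shared:1 - btrfs /dev/vda2 rw
30 25 0:42 /@/var /var rw,relatime shared:2 - btrfs /dev/vda2 rw
"""

if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        table = parse_mountinfo(f.read())
    print(find_mount(table, *device_of("/var/lib/kubelet")))
```

If the number that stat() returns for the directory never appears in the mount table, the lookup fails regardless of whether the directory exists, which matches the shape of the failure reported here. Whether the subvolume layout is what causes that mismatch on Kubic remains the guess made in this comment.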
http://bugzilla.suse.com/show_bug.cgi?id=1095131#c4

Maximilian Meister <mmeister@suse.com> changed:

What     |Removed                 |Added
----------------------------------------------------------------------------
Status   |CONFIRMED               |IN_PROGRESS
Assignee |kubic-bugs@opensuse.org |mmeister@suse.com

--- Comment #4 from Maximilian Meister <mmeister@suse.com> ---
i've added a patch as part of [0] to fix this bug, and after a local test, k8s was running fine and the error message no longer appeared.

i only ran into a failing openldap as a followup, but this was more or less expected

[0] https://build.opensuse.org/request/show/617020