[Bug 1171770] New: Worker nodes can't access kube-api
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770

Bug ID: 1171770
Summary: Worker nodes can't access kube-api
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: 64bit
OS: Linux
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kubic
Assignee: kubic-bugs@opensuse.org
Reporter: contact@ffreitas.io
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Release: 20200514

Deployment method:
'''
kubicctl init
kubicctl node add worker01.local
'''

After this deployment, none of my pods deployed on worker nodes can access the kube-api. For example, with kured I get:
'''
time="2020-05-14T22:18:03Z" level=info msg="Kubernetes Reboot Daemon: 1.3.0"
time="2020-05-14T22:18:03Z" level=info msg="Node ID: worker01"
time="2020-05-14T22:18:03Z" level=info msg="Lock Annotation: kube-system/kured:weave.works/kured-node-lock"
time="2020-05-14T22:18:03Z" level=info msg="Reboot Sentinel: /var/run/reboot-required every 1h0m0s"
time="2020-05-14T22:18:03Z" level=info msg="Blocking Pod Selectors: []"
time="2020-05-14T22:18:03Z" level=info msg="Reboot on: SunMonTueWedThuFriSat between 00:00 and 23:59 UTC"
time="2020-05-14T22:18:33Z" level=fatal msg="Error testing lock: Get https://10.96.0.1:443/apis/apps/v1/namespaces/kube-system/daemonsets/kured: dial tcp 10.96.0.1:443: i/o timeout"
'''

I found a similar issue on reddit: https://www.reddit.com/r/kubernetes/comments/gjhxcj/fresh_kubeadm_install_po...
-- You are receiving this mail because: You are on the CC list for the bug.
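For reference, a minimal sketch of reproducing the symptom from a worker node without kured (the curlimages/curl image and the node name are examples, not part of the report; any pod pinned to a worker will do):
```
# Run a one-shot pod pinned to a worker node and try the apiserver ClusterIP.
kubectl run api-check --rm -it --restart=Never --image=curlimages/curl \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"worker01"}}' \
  --command -- curl -k -m 10 https://10.96.0.1:443/version
# On an affected worker the connection times out like the kured log above;
# on the control-plane node it answers (even a 401/403 proves reachability).
```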
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c1 Francisco Freitas <contact@ffreitas.io> changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|Major |Critical --- Comment #1 from Francisco Freitas <contact@ffreitas.io> --- After a day of testing I am seeing no way of getting a working Kubic installation from the latest ISO. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c2 Quentin Onno <contact@qonno.fr> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |contact@qonno.fr --- Comment #2 from Quentin Onno <contact@qonno.fr> --- Hi, Same issue here Regards, -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c4 --- Comment #4 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Thorsten Kukuk from comment #3)
kured 1.3.0 is not the latest build; the latest build contains kured 1.4.0.
I'm not sure where the problem is; you can try using flannel instead of weave for the pod network. But flannel is not really maintained anymore and reports a lot of iptables errors. DNS isn't fully working either.
Destroyed my cluster. Did a transactional-update. Same issue with the latest kured version:
'''
time="2020-05-17T09:53:27Z" level=info msg="Kubernetes Reboot Daemon: 1.4.0"
time="2020-05-17T09:53:27Z" level=info msg="Node ID: worker01"
time="2020-05-17T09:53:27Z" level=info msg="Lock Annotation: kube-system/kured:weave.works/kured-node-lock"
time="2020-05-17T09:53:27Z" level=info msg="Reboot Sentinel: /var/run/reboot-required every 1h0m0s"
time="2020-05-17T09:53:27Z" level=info msg="Blocking Pod Selectors: []"
time="2020-05-17T09:53:27Z" level=info msg="Reboot on: SunMonTueWedThuFriSat between 00:00 and 23:59 UTC"
'''
It does not come from kured. I got the same issue with multiple services (for example haproxy-ingress). For the CNI I tested:
- cilium (built a yaml from the github repository and put it in /usr/shared/k8s-yaml/cilium)
- weavenet (default init)
- flannel (kubicctl init --pod-network flannel)
Same issue with all of them.
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c6 --- Comment #6 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Thorsten Kukuk from comment #5)
With flannel the error is much rarer for me than with weave. But it looks like the best way to find out whether the cluster is affected is: run a busybox container and use nslookup to resolve a host. On an affected cluster you will run into a timeout (temporary failure in name resolution); otherwise you should get a response immediately.
A second kubernetes cluster is running fine for me without the issues.
By the way, kured is also broken, since the last systemd update is incompatible ...
Again, not a kured issue for me as it affects other services. What is the configuration on your unaffected cluster? Is it a fresh install? -- You are receiving this mail because: You are on the CC list for the bug.
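A sketch of the busybox DNS check Thorsten describes above (the pod name and the name being resolved are just examples):
```
# Run a throwaway busybox pod and resolve a name through the cluster DNS.
kubectl run dns-check --rm -it --restart=Never --image=busybox -- \
  nslookup kubernetes.default.svc.cluster.local
# Affected cluster: nslookup times out ("temporary failure in name resolution").
# Healthy cluster: the ClusterIP (e.g. 10.96.0.1) comes back immediately.
```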
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c8 --- Comment #8 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Thorsten Kukuk from comment #7)
(In reply to Francisco Freitas from comment #6)
Again, not a kured issue for me as it affects other services.
kured cannot reboot the system anymore since systemd moved binaries, so you are affected by this.
The issue I want to point to here is the timeout to 10.96.0.1, which is the kube-api service. Kured is just an example I took. I've seen the issue you're talking about, but it's still not the one I'm hoping to solve.
What is the configuration on your unaffected cluster ?
Is it a fresh install ?
It's a multi-master setup, but not a fresh install, only ever updated. So not really comparable.
I can't rollback here. I must start a new environment. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c9 Richard Brown <rbrown@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |IN_PROGRESS Assignee|kubic-bugs@opensuse.org |rbrown@suse.com --- Comment #9 from Richard Brown <rbrown@suse.com> ---
Hi all - I've been looking at this all day, here is the current status:

I can confirm it happens with both kubicctl and kubeadm clusters made from the current snapshots. We know this doesn't occur on kubicctl clusters with multi-masters, which suggests haproxy somehow works around the issue.

Recent snapshots have had the following changes which I suspect could be related (listed by invasiveness, in my opinion):
- busybox package reworking
- kernel update from 5.6.11 to 5.6.12
- minor runc patch
- kured

There is also the possibility that the cause is something else; I'm at a loss, to be honest, and am trying to debug this just by going on what few clues we have here - any more data points and examples from people would be greatly appreciated.

Given the bug report shows the problem occurs with earlier kured versions and with services other than kured, I think it's safe to rule out that update.

Given the busybox package fundamentally changed every image we use for kubernetes, that was my first suspicion, so today I've built all of the images based on the busybox-free Tumbleweed base image (which is much larger, but obviously more likely to have everything each k8s component requires). You can get these images from registry.opensuse.org/home/rbrownsuse/branches/devel/kubic/containers/container/kubic

However, as anyone who wishes to help can see, a cluster created with "kubeadm init --image-repository registry.opensuse.org/home/rbrownsuse/branches/devel/kubic/containers/container/kubic" still demonstrates this bug with a vengeance. I've even tried using the heavyweight base containers with the weave image, to no difference. So I'm pretty convinced our images/busybox are not at fault.

This now leads me to wonder if the kernel or runc updates are to blame, which I will look at tomorrow, unless someone beats me to it first.

Sorry that this doesn't look like it will be a quick fix. Anyone got any other info that might help?
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c10 --- Comment #10 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #9)
Hi all - I've been looking at this all day, here is the current status:
I can confirm it happens with both kubicctl and kubeadm clusters made from the current snapshots.
We know this doesn't occur on kubicctl clusters with multi-masters, which suggests haproxy somehow works around the issue.
Might want to verify this. Tried a multi-master deployment two releases back. I had no issue with the master nodes but I still got the issue on the worker nodes. (Will test it on the latest release again tonight).
This now leads me to wonder if the kernel or runc updates are to blame, which I will look at tomorrow, unless someone beats me to it first.
Sorry that this doesn't look like it will be a quick fix. Anyone got any other info that might help?
Couldn't it be tested by downgrading the kernel? -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c11 --- Comment #11 from Richard Brown <rbrown@suse.com> --- (In reply to Francisco Freitas from comment #10)
(In reply to Richard Brown from comment #9)
Hi all - I've been looking at this all day, here is the current status:
I can confirm it happens with both kubicctl and kubeadm clusters made from the current snapshots.
We know this doesn't occur on kubicctl clusters with multi-masters, which suggests haproxy somehow works around the issue.
Might want to verify this. Tried a multi-master deployment two releases back. I had no issue with the master nodes but I still got the issue on the worker nodes. (Will test it on the latest release again tonight).
This now leads me to wonder if the kernel or runc updates are to blame, which I will look at tomorrow, unless someone beats me to it first.
Sorry that this doesn't look like it will be a quick fix. Anyone got any other info that might help?
Couldn't it be tested by downgrading the kernel ?
Sure but a) from where? and b) I've worked enough today, I think I'd like a bit of a break before picking this up tomorrow ;) -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c12 --- Comment #12 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #11)
(In reply to Francisco Freitas from comment #10)
(In reply to Richard Brown from comment #9)
Hi all - I've been looking at this all day, here is the current status:
I can confirm it happens with both kubicctl and kubeadm clusters made from the current snapshots.
We know this doesn't occur on kubicctl clusters with multi-masters, which suggests haproxy somehow works around the issue.
Might want to verify this. Tried a multi-master deployment two releases back. I had no issue with the master nodes but I still got the issue on the worker nodes. (Will test it on the latest release again tonight).
This now leads me to wonder if the kernel or runc updates are to blame, which I will look at tomorrow, unless someone beats me to it first.
Sorry that this doesn't look like it will be a quick fix. Anyone got any other info that might help?
Couldn't it be tested by downgrading the kernel ?
Sure but a) from where?
It was just a genuine question. I remember using the tumbleweed-cli to access the history repositories. I do not know if it can be done with kubic.
and b) I've worked enough today, I think I'd like a bit of a break before picking this up tomorrow ;)
I was not hoping for you to work on this single issue again today :p -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c13 --- Comment #13 from Francisco Freitas <contact@ffreitas.io> ---
The error is also present on a multi-master cluster. I deployed a cluster from the release 20200516 using the following commands:
```
kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local
kubicctl node add --type master master02
kubicctl node add --type master master03
kubicctl node add worker01
```
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c14 --- Comment #14 from Richard Brown <rbrown@suse.com> --- (In reply to Francisco Freitas from comment #13)
The error is also present on multi-master cluster. I deployed a cluster from the release 20200516 using the following commands :
``` kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local kubicctl node add --type master master02 kubicctl node add --type master master03 kubicctl node add worker01 ```
So..
I've used kubeadm init --image-repository to use only upstream containers - problem still occurs
I've used rebuilt kubernetes 1.18.2 containers - problem still occurs
I've deployed it on kubernetes 1.17.5 - problem still occurs
I've used only upstream weave, cilium and other CNI providers - problem still occurs
I've used https://download.opensuse.org/history/ to move my nodes to every version of kubic we've had in May - problem still occurs

I'm officially flummoxed - does anyone have any idea when this last worked for sure? Because I'm running out of things to rule out.
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c15 --- Comment #15 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #14)
(In reply to Francisco Freitas from comment #13)
The error is also present on multi-master cluster. I deployed a cluster from the release 20200516 using the following commands :
``` kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local kubicctl node add --type master master02 kubicctl node add --type master master03 kubicctl node add worker01 ```
So.. I've used kubeadm init --image-repository to use only upstream containers - problem still occurs I've used rebuilt kubernetes 1.18.2 containers - problem still occurs i've deployed it on kubernetes 1.17.5 - problem still occurs I've used only upstream weave, cilium and other CNI providers - problem still occurs I've used https://download.opensuse.org/history/ to move my nodes to every version of kubic we've had in May - problem still occurs
I'm officially flummoxed - does anyone have any idea when this last worked for sure? because I'm running out of things to rule out
The last time I successfully installed a Kubic cluster was on April 7th, with the following configuration:
- upstream cilium for the CNI
- single master
- release 20200405 updated from a 20200108 iso
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c16 --- Comment #16 from Richard Brown <rbrown@suse.com> --- (In reply to Francisco Freitas from comment #15)
(In reply to Richard Brown from comment #14)
(In reply to Francisco Freitas from comment #13)
The error is also present on multi-master cluster. I deployed a cluster from the release 20200516 using the following commands :
``` kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local kubicctl node add --type master master02 kubicctl node add --type master master03 kubicctl node add worker01 ```
So.. I've used kubeadm init --image-repository to use only upstream containers - problem still occurs I've used rebuilt kubernetes 1.18.2 containers - problem still occurs i've deployed it on kubernetes 1.17.5 - problem still occurs I've used only upstream weave, cilium and other CNI providers - problem still occurs I've used https://download.opensuse.org/history/ to move my nodes to every version of kubic we've had in May - problem still occurs
I'm officially flummoxed - does anyone have any idea when this last worked for sure? because I'm running out of things to rule out
Last time I successfully installed a Kubic cluster was on april 7th with the following configuration : - upstream cilium for the CNI - single master - release 20200405 updated from a 20200108 iso
Do you (or anyone else) have an ISO that old somewhere I can download, to see if I can narrow this down further? -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c17 --- Comment #17 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #16)
(In reply to Francisco Freitas from comment #15)
(In reply to Richard Brown from comment #14)
(In reply to Francisco Freitas from comment #13)
The error is also present on multi-master cluster. I deployed a cluster from the release 20200516 using the following commands :
``` kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local kubicctl node add --type master master02 kubicctl node add --type master master03 kubicctl node add worker01 ```
So.. I've used kubeadm init --image-repository to use only upstream containers - problem still occurs I've used rebuilt kubernetes 1.18.2 containers - problem still occurs i've deployed it on kubernetes 1.17.5 - problem still occurs I've used only upstream weave, cilium and other CNI providers - problem still occurs I've used https://download.opensuse.org/history/ to move my nodes to every version of kubic we've had in May - problem still occurs
I'm officially flummoxed - does anyone have any idea when this last worked for sure? because I'm running out of things to rule out
Last time I successfully installed a Kubic cluster was on april 7th with the following configuration : - upstream cilium for the CNI - single master - release 20200405 updated from a 20200108 iso
Do you (or anyone else) have aN iso that old somewhere I can download it to see if I can narrow this down further?
I only have the 20200108 ISO. Will it do the trick for you? -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c18 --- Comment #18 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #16)
(In reply to Francisco Freitas from comment #15)
(In reply to Richard Brown from comment #14)
(In reply to Francisco Freitas from comment #13)
The error is also present on multi-master cluster. I deployed a cluster from the release 20200516 using the following commands :
``` kubicctl init --haproxy loadbalancer --multi-master loadbalancer.cluster.local kubicctl node add --type master master02 kubicctl node add --type master master03 kubicctl node add worker01 ```
So.. I've used kubeadm init --image-repository to use only upstream containers - problem still occurs I've used rebuilt kubernetes 1.18.2 containers - problem still occurs i've deployed it on kubernetes 1.17.5 - problem still occurs I've used only upstream weave, cilium and other CNI providers - problem still occurs I've used https://download.opensuse.org/history/ to move my nodes to every version of kubic we've had in May - problem still occurs
I'm officially flummoxed - does anyone have any idea when this last worked for sure? because I'm running out of things to rule out
Last time I successfully installed a Kubic cluster was on april 7th with the following configuration : - upstream cilium for the CNI - single master - release 20200405 updated from a 20200108 iso
Do you (or anyone else) have aN iso that old somewhere I can download it to see if I can narrow this down further?
In case you need it, I've managed to upload it here: https://send.firefox.com/download/a4f0c1b25d2d81a9/#72p-GCmfwurTQhAPLPwEsw -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c19 --- Comment #19 from Richard Brown <rbrown@suse.com> ---
No need for an ISO, found what I believe to be the trigger for the problem.

WORKAROUND:
Delete /etc/sysctl.d/70-yast.conf
If the cluster is already bootstrapped, reboot all nodes. Cluster communications work properly afterwards.

NEXT STEP:
Figure out why the heck /etc/sysctl.d/70-yast.conf's blocking of IP forwarding is taking effect when /usr/lib/sysctl.d/90-yast.conf should be overriding it :)
-- You are receiving this mail because: You are on the CC list for the bug.
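A sketch of applying the workaround above (run as root on every node; paths exactly as named in the comment):
```
rm /etc/sysctl.d/70-yast.conf
reboot   # only needed if the cluster was already bootstrapped
```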
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c20 --- Comment #20 from Richard Brown <rbrown@suse.com> --- (In reply to Richard Brown from comment #19)
Figure out why the heck /etc/sysctl.d/70-yast.conf's blocking of IP forwarding is taking an effect when /usr/lib/sysctl.d/90-yast.conf should be overriding it :)
Correction.. /usr/lib/sysctl.d/90-kubeadm.conf is what should be overriding 70-yast.conf -- You are receiving this mail because: You are on the CC list for the bug.
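For context, systemd-sysctl merges /etc/sysctl.d and /usr/lib/sysctl.d and applies the files in lexical order of their names, so 90-kubeadm.conf should indeed win over 70-yast.conf. A quick way to inspect both files and the values actually in effect (a generic sketch, not from the report):
```
ls /etc/sysctl.d/ /usr/lib/sysctl.d/
grep -H forward /etc/sysctl.d/*.conf /usr/lib/sysctl.d/*.conf
sysctl net.ipv4.ip_forward net.ipv6.conf.all.forwarding
```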
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c21 --- Comment #21 from Francisco Freitas <contact@ffreitas.io> --- (In reply to Richard Brown from comment #19)
No need for an ISO, found what I believe to be the trigger for the problem.
WORKAROUND:
Delete /etc/sysctl.d/70-yast.conf
If cluster is already bootstrapped, reboot all nodes. Cluster communications work properly afterwards.
NEXT STEP:
Figure out why the heck /etc/sysctl.d/70-yast.conf's blocking of IP forwarding is taking an effect when /usr/lib/sysctl.d/90-yast.conf should be overriding it :)
Nice! I will try it out tonight. Thanks for the workaround. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c22 Rafael Fernández López <rfernandezlopez@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rfernandezlopez@suse.com --- Comment #22 from Rafael Fernández López <rfernandezlopez@suse.com> ---
I have the impression that `net.ipv6.conf.all.forwarding = 0` being set by `/etc/sysctl.d/70-yast.conf` has an impact here. I overrode this setting by creating a `/etc/sysctl.d/91-kubeadm.conf` file with these contents:
```
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
```
After rebooting the node, everything works fine. As Richard mentioned, removing `/etc/sysctl.d/70-yast.conf` altogether and rebooting also does the trick.
This makes me think that the override in `/usr/lib/sysctl.d/90-kubeadm.conf` is not enough; it currently has:
```
# The file is provided as part of the kubernetes-kubeadm package
net.ipv4.ip_forward = 1
```
From what I see, it should include `net.ipv6.conf.all.forwarding = 1` as well. I cannot explain any better why this is happening right now, though.
-- You are receiving this mail because: You are on the CC list for the bug.
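A sketch of applying the override Rafael describes in comment 22 (the filename follows his example; reboot afterwards, as he recommends):
```
cat > /etc/sysctl.d/91-kubeadm.conf <<'EOF'
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
EOF
reboot
```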
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c23 --- Comment #23 from Rafael Fernández López <rfernandezlopez@suse.com> --- As a note, rebooting is not strictly necessary; `sysctl -w -a --system` should work as well. But I have run into odd behaviors with `sysctl` in the past (especially when mixed with values in `/etc/sysctl.conf`). This is why I recommend rebooting directly, to ensure that everything is still fine after a reboot. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c24 --- Comment #24 from Francisco Freitas <contact@ffreitas.io> --- Tested the workaround on a multi-master cluster with cilium. Did not have the issue again on any of my services. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c25 --- Comment #25 from Richard Brown <rbrown@suse.com> --- (In reply to Rafael Fernández López from comment #22)
I have the impression that `net.ipv6.conf.all.forwarding = 0` being set by `/etc/sysctl.d/70-yast.conf` has an impact here.
I did override this setting by creating a `/etc/sysctl.d/91-kubeadm.conf` file with contents:
``` net.ipv4.ip_forward = 1 net.ipv6.conf.all.forwarding = 1 ```
After rebooting the node, everything works fine. As Richard mentioned, removing `/etc/sysctl.d/70-yast.conf` altogether and rebooting also makes the trick.
This makes me think that the override in `/usr/lib/sysctl.d/90-kubeadm.conf` is not enough, it currently has:
``` # The file is provided as part of the kubernetes-kubeadm package net.ipv4.ip_forward = 1 ```
From what I see, it should include `net.ipv6.conf.all.forwarding = 1` as well. I cannot explain why this is happening in a better way right now though.
I tried this before making my post, and it didn't work for me... but I trust your observation, so I'm putting it in a patch for kubernetes1.18 and kubernetes1.17 and testing those packages :) thanks! -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c26 --- Comment #26 from Richard Brown <rbrown@suse.com> --- (In reply to Richard Brown from comment #25)
(In reply to Rafael Fernández López from comment #22)
I have the impression that `net.ipv6.conf.all.forwarding = 0` being set by `/etc/sysctl.d/70-yast.conf` has an impact here.
I did override this setting by creating a `/etc/sysctl.d/91-kubeadm.conf` file with contents:
``` net.ipv4.ip_forward = 1 net.ipv6.conf.all.forwarding = 1 ```
After rebooting the node, everything works fine. As Richard mentioned, removing `/etc/sysctl.d/70-yast.conf` altogether and rebooting also makes the trick.
This makes me think that the override in `/usr/lib/sysctl.d/90-kubeadm.conf` is not enough, it currently has:
``` # The file is provided as part of the kubernetes-kubeadm package net.ipv4.ip_forward = 1 ```
From what I see, it should include `net.ipv6.conf.all.forwarding = 1` as well. I cannot explain why this is happening in a better way right now though.
I tried this before making my post, and it didn't work for me..but I trust your observation also so I'm putting it in a patch for kubernetes1.18 and kubernetes1.17 and testing those packages :)
thanks!
Put the change in the package, and confirmed - it does not work to add `net.ipv6.conf.all.forwarding = 1`. However, I can confirm that if I copy 90-kubeadm.conf to /etc/sysctl.d, then it works. This means something is incorrectly parsing/not parsing /usr/lib/sysctl.d. Now we just need to figure out what. -- You are receiving this mail because: You are on the CC list for the bug.
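The interim fix Richard describes above, as a sketch (run as root on each node, then reboot so the copied settings are applied cleanly):
```
cp /usr/lib/sysctl.d/90-kubeadm.conf /etc/sysctl.d/
reboot
```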
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c27 --- Comment #27 from Richard Brown <rbrown@suse.com> ---
The problem, as we understand it right now, is that the sysctl.d settings from /usr/lib/sysctl.d/90-kubeadm.conf are not getting correctly applied/honoured by the kernel. /proc seems to suggest they are applied, but obviously they don't behave that way, and IP forwarding (which is really needed for us) is not working. Various methods of toggling/reapplying settings can get things to work (e.g. the WORKAROUND, `echo 0 > /proc/sys/net/ipv4/ip_forward && echo 1 > /proc/sys/net/ipv4/ip_forward`, `sysctl -a --system` and such).

Quite what is going on in the kernel is still a mystery, and other folk are looking at it as I'll be going on vacation. Meanwhile though, I have patches going to both kubernetes1.17 and 1.18 which will run `sysctl -a --system` before starting kubelet, just to be sure everything is running as configured :) Once this patch is out there, the workaround of deleting other influencing sysctl.d config files will be redundant.
-- You are receiving this mail because: You are on the CC list for the bug.
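The packaged patch itself is not shown in the thread; as a rough illustration of the approach (re-applying sysctl settings right before kubelet starts), a systemd drop-in along these lines would do it - the path and exact command here are assumptions, not the actual patch:
```
# /etc/systemd/system/kubelet.service.d/90-sysctl.conf  (hypothetical drop-in)
[Service]
ExecStartPre=/usr/sbin/sysctl --system
```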
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c28 Aleksa Sarai <asarai@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |asarai@suse.com --- Comment #28 from Aleksa Sarai <asarai@suse.com> ---
For further context, the reason you have to toggle ip_forward to 0 (and back to 1) is that setting ip_forward to its current value is a no-op. It appears that something is seriously broken inside the forwarding code, causing it to not forward packets properly (thus requiring a reset).

I've been looking at the kernel side of this issue for quite a few days now, and though I still haven't found the cause, I have discovered that the issue is not that packet forwarding is completely disabled -- the issue is that packet forwarding *from the host to the container* is broken. This can be fairly easily checked by running "dig @1.1.1.1 asdf.com" on a broken cluster -- if you packet capture the host you'll see the DNS packets leave the network and a reply is sent back to the host. However, the packets never get forwarded to the container.

There actually is a coredns bug report which has comments that reference a similar issue[1], but I'm not convinced that it is actually related (not to mention the solution was "don't run coredns on your master node", which makes absolutely no sense).

(As an aside, the reason my investigation of this has taken so long is because bpftrace has been fighting me every step of the way. The inability to do function_graph-style tracing, combined with endless silly restrictions on type conversions and the lack of BTF support in Tumbleweed kernels, has been driving me up the wall.)

[1]: https://github.com/coredns/coredns/issues/2284#issuecomment-605596767
-- You are receiving this mail because: You are on the CC list for the bug.
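A sketch of the packet-capture check Aleksa describes (interface name, resolver and pod name are placeholders; the pod's image needs dig or nslookup):
```
# On the worker host: watch DNS traffic to the external resolver.
tcpdump -ni eth0 'host 1.1.1.1 and udp port 53'
# From a pod scheduled on that node:
kubectl exec -it <pod-with-dig> -- dig @1.1.1.1 asdf.com
# Broken node: query and reply both show up on eth0, but the reply never
# reaches the pod and dig times out - forwarding from host to container fails.
```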
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c29 Martin Weiss <martin.weiss@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |martin.weiss@suse.com --- Comment #29 from Martin Weiss <martin.weiss@suse.com> ---
FYI - just ran into the same issue on SLES 15 SP1 with kernel 4.12.14-197.40-default, and we realized that while the "all" forwarding setting is 1, JUST the eth0 and lo interfaces have 0!

While all other ipv4 forwarding flags were 1, we saw these two at 0:
/proc/sys/net/ipv4/conf/eth0/forwarding 0
/proc/sys/net/ipv4/conf/lo/forwarding 0

In this case `sysctl -w net.ipv4.ip_forward=0; sysctl -w net.ipv4.ip_forward=1` also changed the interfaces to 1. Also a network restart changes the interfaces from 0 to 1.

BUT - a `sysctl --system` (with only net.ipv4.ip_forward=1 in the conf) did NOT change the interfaces from 0 to 1!!
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 Martin Weiss <martin.weiss@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Blocks| |1172284 -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c30 --- Comment #30 from Rafael Fernández López <rfernandezlopez@suse.com> ---
Case 1
======
echo 1 | sudo tee /proc/sys/net/ipv4/conf/all/forwarding
1
echo 0 | sudo tee /proc/sys/net/ipv4/conf/wlp1s0/forwarding
0
cat /proc/sys/net/ipv4/conf/wlp1s0/forwarding
0
echo 1 | sudo tee /proc/sys/net/ipv4/conf/all/forwarding
1
cat /proc/sys/net/ipv4/conf/wlp1s0/forwarding
0

Case 2
======
echo 1 | sudo tee /proc/sys/net/ipv4/conf/all/forwarding
1
echo 0 | sudo tee /proc/sys/net/ipv4/conf/wlp1s0/forwarding
0
cat /proc/sys/net/ipv4/conf/wlp1s0/forwarding
0
echo 0 | sudo tee /proc/sys/net/ipv4/conf/all/forwarding   ## FORCE CYCLING
0                                                          ##
echo 1 | sudo tee /proc/sys/net/ipv4/conf/all/forwarding   ##
1                                                          ##
cat /proc/sys/net/ipv4/conf/wlp1s0/forwarding
1 ## IT'S 1
-- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c31 --- Comment #31 from Rafael Fernández López <rfernandezlopez@suse.com> --- What I wrote on https://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c30 could be the expected kernel behavior: instead of setting all interfaces to 1, it is a no-op if conf/all/forwarding was already 1 (not cycled). If we confirm that on Kubic the per-interface forwarding flags are set to 0 as well, then we have to find out what is setting them to 0 and when. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c32 --- Comment #32 from Aleksa Sarai <asarai@suse.com> --- (In reply to Martin Weiss from comment #29)
FYI - just had to realize the same issue on SLES 15 SP1 with kernel 4.12.14-197.40-default and we have realized that all forward = 1 JUST the eth0 and the lo interfaces have 0 !
While all other ipv4 forwarding were 1 we saw these two on 0:
/proc/sys/net/ipv4/conf/eth0/forwarding 0 /proc/sys/net/ipv4/conf/lo/forwarding 0
Dammit. Yeah I had noticed this last week (when I was figuring out how forwarding configuration worked), but I misunderstood what I was looking at -- my assumption was that forwarding meant forwarding in *both* directions. But I think it only refers to forwarding *incoming* packets (so forwarding being disabled on the host still allows forwarded packets from the container to go to the internet).
BUT - a sysctl --system (with only net.ipv4.ip_forward=1 in the conf) did NOT change the interfaces from 0 to 1!!
Yeah, this behaviour is expected (if misguided IMHO). The kernel treats setting this sysctl to its current value as a no-op. I guess we'll need to explicitly do % echo 1 | tee /proc/sys/net/ipv[46]/conf/*/forwarding somewhere... -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c33 --- Comment #33 from Martin Weiss <martin.weiss@suse.com> ---
Quick update on the testing I have done.

1. installed server and enabled IP forwarding

2. verified that the following gives 1 for all:
for FILE in $(sudo find /proc -iname forwarding); do echo $FILE; sudo cat $FILE; done

3. run skuba to deploy first master:
cd
skuba cluster init caasp --control-plane caasp-api.suse
cd caasp
sed -i 's#podSubnet:.*#podSubnet: 10.100.0.0/16#g' kubeadm-init.conf
sed -i 's#serviceSubnet:.*#serviceSubnet: 10.200.0.0/16#g' kubeadm-init.conf
SERVER=caasp-master-01; skuba node bootstrap $SERVER --sudo --target $SERVER.suse --user caaspadm -v5 2>&1|tee $SERVER.log

4. verify right after skuba deployment is finished:
for FILE in $(sudo find /proc -iname forwarding); do echo $FILE; sudo cat $FILE; done
-> all 1 - but not yet any cilium or lxc0502ff45e5f9.. network

5. wait a bit and then test again:
for FILE in $(sudo find /proc -iname forwarding); do echo $FILE; sudo cat $FILE; done
/proc/sys/net/ipv4/conf/all/forwarding 1
/proc/sys/net/ipv4/conf/cilium_health/forwarding 1
/proc/sys/net/ipv4/conf/cilium_host/forwarding 1
/proc/sys/net/ipv4/conf/cilium_net/forwarding 1
/proc/sys/net/ipv4/conf/cilium_vxlan/forwarding 1
/proc/sys/net/ipv4/conf/default/forwarding 1
/proc/sys/net/ipv4/conf/eth0/forwarding 0
/proc/sys/net/ipv4/conf/lo/forwarding 0
/proc/sys/net/ipv4/conf/lxc0502ff45e5f9/forwarding 1
/proc/sys/net/ipv4/conf/lxc338fd9c772fe/forwarding 1
/proc/sys/net/ipv4/conf/lxc44ad1805e489/forwarding 1
/proc/sys/net/ipv4/conf/lxc45fc97ce17fc/forwarding 1
/proc/sys/net/ipv4/conf/lxc5c407b7c307a/forwarding 1
/proc/sys/net/ipv4/conf/lxc66ae87817a6a/forwarding 1
/proc/sys/net/ipv4/conf/lxc6f16187be039/forwarding 1
/proc/sys/net/ipv4/conf/lxc927bae83b605/forwarding 1
/proc/sys/net/ipv4/conf/lxc9a7c0128099e/forwarding 1
/proc/sys/net/ipv4/conf/lxcc5c6777e5605/forwarding 1
/proc/sys/net/ipv6/conf/all/forwarding 1
/proc/sys/net/ipv6/conf/cilium_health/forwarding 1
/proc/sys/net/ipv6/conf/cilium_host/forwarding 1
/proc/sys/net/ipv6/conf/cilium_net/forwarding 1
/proc/sys/net/ipv6/conf/cilium_vxlan/forwarding 1
/proc/sys/net/ipv6/conf/default/forwarding 1
/proc/sys/net/ipv6/conf/eth0/forwarding 1
/proc/sys/net/ipv6/conf/lo/forwarding 1
/proc/sys/net/ipv6/conf/lxc0502ff45e5f9/forwarding 1
/proc/sys/net/ipv6/conf/lxc338fd9c772fe/forwarding 1
/proc/sys/net/ipv6/conf/lxc44ad1805e489/forwarding 1
/proc/sys/net/ipv6/conf/lxc45fc97ce17fc/forwarding 1
/proc/sys/net/ipv6/conf/lxc5c407b7c307a/forwarding 1
/proc/sys/net/ipv6/conf/lxc66ae87817a6a/forwarding 1
/proc/sys/net/ipv6/conf/lxc6f16187be039/forwarding 1
/proc/sys/net/ipv6/conf/lxc927bae83b605/forwarding 1
/proc/sys/net/ipv6/conf/lxc9a7c0128099e/forwarding 1
/proc/sys/net/ipv6/conf/lxcc5c6777e5605/forwarding 1

--> so at least in my case the process of the cluster deployment / cilium / crio etc. startup seems to change eth0 and lo to "0"!
-- You are receiving this mail because: You are on the CC list for the bug.
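One way to catch the exact moment the flags flip during bootstrap (a sketch using the paths from Martin's listing; run on the node while skuba/cilium come up):
```
# Poll the relevant flags once per second and log their values over time.
while true; do
  date
  grep -H . /proc/sys/net/ipv4/conf/all/forwarding \
            /proc/sys/net/ipv4/conf/eth0/forwarding \
            /proc/sys/net/ipv4/conf/lo/forwarding
  sleep 1
done
```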
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c34 --- Comment #34 from Martin Weiss <martin.weiss@suse.com> --- FYI - one more update from my testing, today! Before running skuba: /proc/sys/net/ipv4/conf/eth0/forwarding 1 /proc/sys/net/ipv4/conf/lo/forwarding 1 After running skuba SERVER=caasp-master-01; skuba node bootstrap $SERVER --sudo --target $SERVER.suse --user caaspadm -v5 2>&1|tee $SERVER.log After waiting a while (until cilium pod is started): /proc/sys/net/ipv4/conf/eth0/forwarding 1 /proc/sys/net/ipv4/conf/lo/forwarding 0 --> I can reproduce this! In "dmesg" I can see this: [ 3399.078642] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this. [ 3399.082625] Bridge firewalling registered [ 3401.275015] ip_tables: (C) 2000-2006 Netfilter Core Team [ 3401.294485] nf_conntrack version 0.5.0 (16384 buckets, 65536 max) [ 3457.871676] systemd-logind[1038]: Session 4 logged out. Waiting for processes to exit. [ 3457.872918] systemd-logind[1038]: Removed session 4. [ 3471.942169] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP) [ 3471.942482] IPVS: Connection hash table configured (size=4096, memory=64Kbytes) [ 3471.946626] IPVS: ipvs loaded. [ 3471.957053] IPVS: [rr] scheduler registered. [ 3471.971445] IPVS: [wrr] scheduler registered. [ 3471.975453] IPVS: [sh] scheduler registered. [ 3506.795122] systemd-udevd[3355]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. [ 3506.795417] systemd-udevd[3355]: Could not generate persistent MAC address for cilium_net: No such file or directory [ 3506.795802] systemd-udevd[3354]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. [ 3506.795877] systemd-udevd[3354]: Could not generate persistent MAC address for cilium_host: No such file or directory [ 3506.959424] systemd-udevd[3408]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. [ 3506.959477] systemd-udevd[3408]: Could not generate persistent MAC address for cilium_vxlan: No such file or directory [ 3507.160426] NET: Registered protocol family 38 [ 3507.644490] ip6_tables: (C) 2000-2006 Netfilter Core Team [ 3509.733778] systemd-udevd[4335]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. [ 3509.733853] systemd-udevd[4335]: Could not generate persistent MAC address for cilium_health: No such file or directory [ 3509.734029] systemd-udevd[4334]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable. 
[ 3509.734092] systemd-udevd[4334]: Could not generate persistent MAC address for cilium: No such file or directory [ 3510.068095] eth0: renamed from tmpf7777 [ 3510.124475] eth0: renamed from tmpeda4d [ 3510.166106] eth0: renamed from tmpa871d [ 3510.234471] eth0: renamed from tmpb7d55 [ 3510.345385] eth0: renamed from tmp64312 [ 3510.395190] eth0: renamed from tmp08aba [ 3510.526060] eth0: renamed from tmp2a58d [ 3510.770431] eth0: renamed from tmp98ddd [ 3510.798858] eth0: renamed from tmpdb501 [ 3510.862749] eth0: renamed from tmpb2d90 [ 3510.931801] lxc306256f2dd85: Caught tx_queue_len zero misconfig [ 3510.932639] lxc0881c895a035: Caught tx_queue_len zero misconfig [ 3510.942956] cilium_health: Caught tx_queue_len zero misconfig [ 3511.001898] lxcb73fb2c63181: Caught tx_queue_len zero misconfig [ 3512.559132] lxcf6ec013fd3a7: Caught tx_queue_len zero misconfig [ 3512.673046] lxcf68a5fd8bb9c: Caught tx_queue_len zero misconfig [ 3512.765450] lxc715ae4d9099a: Caught tx_queue_len zero misconfig [ 3513.019281] lxcc06f8ca4995d: Caught tx_queue_len zero misconfig [ 3513.318023] lxc8afc51443d8f: Caught tx_queue_len zero misconfig [ 3513.373336] lxc13b8308cd8b8: Caught tx_queue_len zero misconfig [ 3513.639782] lxccce1476273fa: Caught tx_queue_len zero misconfig [ 3516.457765] audit: type=1305 audit(1591093421.441:1070): audit_pid=0 old=918 auid=4294967295 ses=4294967295 res=1 [ 3516.503810] audit: type=1305 audit(1591093421.489:1071): audit_enabled=1 old=1 auid=4294967295 ses=4294967295 res=1 [ 3531.766783] systemd-journald[3599]: Received SIGTERM from PID 1 (systemd). [ 3532.168956] systemd-udevd: 50 output lines suppressed due to ratelimiting [ 3534.288116] Netfilter messages via NETLINK v0.30. [ 3534.303469] ctnetlink v0.93: registering with nfnetlink. --> probably cilium does "something" with the interfaces causing the setting for "forwarding" to change? It seems that a "reboot" or a "systemctl restart network" BEFORE doing the skuba bootstrap "prevents this problem from happening"! Could it be that systemd or wicked or something in the network stack "still remembers that forwarding was 0 at the point in time the network was started" and that the cilium process of generating the network environment for the overlay network somehow causes this "old in memory setting" to come back? -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1171770 http://bugzilla.opensuse.org/show_bug.cgi?id=1171770#c36 Richard Brown <rbrown@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|IN_PROGRESS |RESOLVED Resolution|--- |FIXED --- Comment #36 from Richard Brown <rbrown@suse.com> --- Resolved in openSUSE -- You are receiving this mail because: You are on the CC list for the bug.