Salt error on "kubicctl node add", while salt ping succeeds
Hi all,

I just replaced one of the nodes in my Kubic cluster, which was deployed last year (around this time, IIRC). I accepted the salt key, so it is properly shown on the salt master. Nevertheless, trying to add the node via kubicctl fails, apparently because some grains are not available:
# salt -G kubicd:kubic-worker-node grains.get kubic-worker-node
No minions matched the target. No command was sent, no jid was assigned.
ERROR: No return received
Running the salt ping command (salt '*' test.ping) works and all three nodes are reporting back.

Did something change in the setup since I deployed my cluster? Do I need to set some grains manually?

Kind Regards, Johannes
--
Johannes Kastl
Linux Consultant & Trainer
Tel.: +49 (0) 151 2372 5802
Mail: kastl@b1-systems.de
B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg
http://www.b1-systems.de
GF: Ralph Dehner
Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
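A quick way to see whether the replaced minion actually carries the grain that the failing command targets is to query it directly by minion ID instead of by grain (the minion ID below is only a placeholder):

# salt 'new-node.example.com' grains.item kubicd
# salt -G 'kubicd:kubic-worker-node' test.ping

If the first command returns an empty value and the second only matches the old nodes, the new minion is simply missing the kubicd grain.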
Kubic is at the moment in a non-working state.
Hi Marc,

On 23.05.21 at 21:03 Marc Balmer wrote:
Kubic is at the moment in a non-working state.
[TOFU removed]

Any more details? Bug numbers?

Kind Regards, Johannes
On 23.05.2021 at 21:05, Johannes Kastl <kastl@b1-systems.de> wrote:
Any more details? Bug numbers?
https://bugzilla.opensuse.org/show_bug.cgi?id=1186125
Hi Marc,

On 23.05.21 at 21:09 Marc Balmer wrote:
[kubic currently broken]
Thanks for the link, but that would just be something inside Kubernetes, which in my case already exists and is running fine (as far as I can see). The kubicctl command never reaches a stage where it tries to do Kubernetes things; it already fails at the salt level. So I do not think this is related. But of course, I might be wrong...

Kind Regards, Johannes
On Sun, May 23, Marc Balmer wrote:
Kubic is at the moment in a non-working state.
Hm, I wonder why it works out of the box for me on my Kubic clusters? If you mean your problem with Rancher, I doubt that this is a Kubic problem.

Thorsten
--
Thorsten Kukuk, Distinguished Engineer, Senior Architect SLES & MicroOS
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
Managing Director: Felix Imendoerffer (HRB 36809, AG Nürnberg)
On 23.05.2021 at 21:57, Thorsten Kukuk <kukuk@suse.de> wrote:
Hm, I wonder why it works out of the box for me on my Kubic clusters? If you mean your problem with Rancher, I doubt that this is a Kubic problem.
Rancher not working is only a "second stage" problem.

The Kubic nodes need "sysctl -a --system" to be applied manually to get access to the outside world (e.g. DNS resolution). When that is done, the pods can resolve addresses, but have no route to them (like the rancher cluster-agent pod not being able to reach the Rancher system). So there is, I guess, an iptables issue of some sort.

I am a bit clueless atm how to debug this and help with the issue....
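For what it's worth, the forwarding state that the pods depend on can be checked directly on a node, for example:

# sysctl net.ipv4.ip_forward net.ipv4.conf.all.forwarding net.ipv6.conf.all.forwarding

If these come back as 0 after a reboot, something in the sysctl configuration (e.g. a drop-in under /etc/sysctl.d/) is overriding the values the cluster needs.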
On Sun, May 23, Marc Balmer wrote:
Rancher not working is only a "second stage" problem. The Kubic nodes need "sysctl -a --system" to be applied manually to get access to the outside world (e.g. DNS resolution). When that is done, the pods can resolve addresses, but have no route to them (like the rancher cluster-agent pod not being able to reach the Rancher system). So there is, I guess, an iptables issue of some sort.
As written in boo#1186125, delete /etc/sysctl.d/70-yast.conf

Thorsten
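A minimal sketch of applying that fix on an affected node, assuming the YaST drop-in is the only file overriding the forwarding settings:

# rm /etc/sysctl.d/70-yast.conf
# sysctl --system
# sysctl net.ipv4.ip_forward

Once the file is gone, sysctl --system re-applies the remaining drop-ins, so the forwarding values needed by Kubernetes are no longer overridden (a reboot achieves the same).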
On Monday, May 24, 2021 12:56:36 PM CEST Thorsten Kukuk wrote:
As written in boo#1186125, delete /etc/sysctl.d/70-yast.conf
Thorsten
Does this fix the forwarding issues? I came across this problem a few days ago, scratching my head over the cause of various services failing (CoreDNS and access to an external PostgreSQL server), which I fixed manually by enabling forwarding on the ethernet card responsible for network access.

Daniel Sonck
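The manual workaround described above amounts to something like the following (the interface name is only a placeholder):

# sysctl -w net.ipv4.conf.eth0.forwarding=1
# sysctl -w net.ipv4.ip_forward=1

Settings written with sysctl -w only last until the next reboot (or until sysctl --system is run with the old configuration still in place), so removing the offending file is the cleaner fix.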
On Mon, May 24, Daniel Sonck wrote:
Does this fix the forwarding issues? I came across this problem a few days ago, scratching my head over the cause of various services failing (CoreDNS and access to an external PostgreSQL server), which I fixed manually by enabling forwarding on the ethernet card responsible for network access.
Yes, it should fix it.

Thorsten
Yes, it should fix it.
Just for the record: yes, it did fix it. The cluster is happy again, and so am I.
On Tuesday, 25 May 2021 09:34:33 CEST Thorsten Kukuk wrote:
Yes, it should fix it.
Thorsten
I can also confirm. Removed my fix and 70-yast.conf, let the cluster reboot on its own, and it's working nicely.

I greatly appreciate this project as it JustWorks™ and is so much easier to maintain than my previous try with Flatcar Linux. Great work.

Daniel Sonck
Hi all,

On 23.05.21 at 21:01 Johannes Kastl wrote:
# salt -G kubicd:kubic-worker-node grains.get kubic-worker-node
No minions matched the target. No command was sent, no jid was assigned.
ERROR: No return received
Does anyone have any ideas on the actual salt error I encountered? Could anyone send me the actual contents of a grains file created on a newer cluster?

I had no time to dig deeper into this yet, but I feared it might involve spinning up a new cluster just to get the file...

Kind Regards, Johannes
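In case it helps, a sketch of what the worker grain might look like, inferred only from the grain target in the failing command above (the exact contents of the grains file on a freshly deployed cluster may differ). /etc/salt/grains on a worker would then contain:

kubicd: kubic-worker-node

Alternatively, the grain can be set from the master (the minion ID is again a placeholder) and kubicctl re-run afterwards:

# salt 'new-node.example.com' grains.setval kubicd kubic-worker-node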
participants (4)
- Daniel Sonck
- Johannes Kastl
- Marc Balmer
- Thorsten Kukuk