Aleksa Sarai changed bug 1171770
What | Removed | Added
CC   |         | asarai@suse.com

Comment # 28 on bug 1171770 from
For further context, the reason you have to set ip_forward to 0 first is that
writing ip_forward's current value back to it is a no-op. Something appears to
be seriously broken inside the forwarding code, causing it to not forward
packets properly (thus requiring a reset).
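
As a rough illustration of the reset described above, here is a minimal Python
sketch. It assumes the standard procfs path /proc/sys/net/ipv4/ip_forward and
that the caller is root; the helper name reset_ip_forward is hypothetical and
not taken from any existing tool. It writes 0 and then restores the previous
value, since writing the current value alone would be a no-op.

```python
from pathlib import Path

# Standard procfs location of the IPv4 forwarding sysctl (assumed here).
IP_FORWARD = Path("/proc/sys/net/ipv4/ip_forward")


def reset_ip_forward(path: Path = IP_FORWARD) -> str:
    """Toggle the sysctl off and back to its previous value.

    Hypothetical helper: writing the current value is a no-op, so the
    value is first set to 0 and then restored. Returns the final value.
    """
    previous = path.read_text().strip()
    if previous == "0":
        # Forwarding is already off, so there is nothing to reset.
        return previous
    path.write_text("0\n")             # force an actual state change
    path.write_text(previous + "\n")   # restore the original setting
    return path.read_text().strip()
```

Calling reset_ip_forward() as root on an affected host would perform the
reset; the path parameter exists only so the function can be tried against an
ordinary file.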

I've been looking at the kernel side of this issue for quite a few days now.
Though I still haven't found the cause, I have discovered that the issue is
not that packet forwarding is completely disabled -- the issue is that packet
forwarding *from the host to the container* is broken. This can be checked
fairly easily by running "dig @1.1.1.1 asdf.com" from a container on a broken
cluster -- if you capture packets on the host you'll see the DNS packets leave
the network and a reply arrive back at the host. However, the reply packets
never get forwarded to the container. There actually is a coredns bug report
with comments that reference a similar issue[1], but I'm not convinced that it
is actually related (not to mention the solution was "don't run coredns on
your master node", which makes absolutely no sense).
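
The check above can also be approximated without dig. The following is a
minimal Python sketch, meant to be run from inside a container, that sends a
single DNS A query over UDP and reports whether any reply arrives before a
timeout. The function names, the transaction ID and the 3-second timeout are
all assumptions made for illustration. On a cluster showing the behaviour
described above, the expectation would be a timeout in the container even
though a capture on the host shows the reply arriving.

```python
import socket
import struct

TXID = 0x1234  # arbitrary transaction ID chosen for this sketch


def build_dns_query(name: str, txid: int = TXID) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=A (1), QCLASS=IN (1).
    return header + qname + struct.pack("!HH", 1, 1)


def dns_reply_arrives(server: str = "1.1.1.1", name: str = "asdf.com",
                      timeout: float = 3.0) -> bool:
    """Return True if a reply with the matching ID arrives before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(name), (server, 53))
        try:
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            # No reply reached this network namespace in time.
            return False
    return data[:2] == struct.pack("!H", TXID)
```

Running dns_reply_arrives() in the container while capturing on the host
separates "the reply never came back" from "the reply reached the host but
was not forwarded".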

(As an aside, my investigation of this has taken so long because bpftrace has
been fighting me every step of the way. The inability to do
function_graph-style tracing, combined with endless silly restrictions on type
conversions and the lack of BTF support in Tumbleweed kernels, has been
driving me up the wall.)

[1]: https://github.com/coredns/coredns/issues/2284#issuecomment-605596767

