Fixing Intermittent Connection Delays and SNAT Conflicts in Kubernetes VXLAN Clusters

#Kubernetes #Networking #VXLAN #iptables #SNAT #CNI #Linux Kernel

What's the problem?

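Kubernetes clusters that use a VXLAN-based CNI can suffer recurring connection hangs and 63-second timeouts. The cause is a double-NAT bug in which the same packet is masqueraded twice, and it is resolved by changing how the masquerade mark is handled in iptables.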

Why does this happen?

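The issue stems from a firewall mark that stays set on a packet after it has served its purpose. The mark tells the iptables KUBE-POSTROUTING chain to masquerade (SNAT) the packet, which happens correctly on the first pass. VXLAN encapsulation then sends the packet through the kernel stack a second time with the mark still attached, so KUBE-POSTROUTING applies a redundant SNAT to the encapsulated packet. This second masquerade leads to checksum failures and packet drops.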

Code Example

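# KUBE-POSTROUTING: consume the masquerade mark (0x4000) with XOR before masquerading
# 1. Guard: leave the chain immediately if the masquerade bit is not set
iptables -t nat -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
# 2. The bit is set at this point, so XOR with 0x4000 clears it
iptables -t nat -A KUBE-POSTROUTING -m mark --mark 0x4000/0x4000 -j MARK --xor-mark 0x4000
# 3. Masquerade once; the cleared mark cannot trigger a second SNAT
iptables -t nat -A KUBE-POSTROUTING -j MASQUERADE --random-fully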
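
The effect of the guard and the XOR on the mark can be checked with plain shell arithmetic. The sketch below is illustrative only: it touches no iptables state, and the helper names (mark_matches, consume_mark) and the starting mark 0x4001 are assumptions made up for the example.

```shell
# Simulate the 0x4000 masquerade bit on a packet mark (arithmetic only, no iptables).
MASQ_BIT=$((0x4000))

# mark_matches MARK: succeeds if the masquerade bit is set,
# mirroring the match "--mark 0x4000/0x4000".
mark_matches() {
    [ $(( $1 & $MASQ_BIT )) -eq "$MASQ_BIT" ]
}

# consume_mark MARK: prints the mark after "--xor-mark 0x4000".
consume_mark() {
    printf '0x%x\n' $(( $1 ^ $MASQ_BIT ))
}

# Hypothetical starting mark: masquerade bit plus one unrelated bit.
mark=0x4001

# First pass through the chain: the bit is set, so it is consumed.
if mark_matches "$mark"; then
    mark=$(consume_mark "$mark")
fi
echo "mark after first pass: $mark"

# Second pass (after VXLAN encapsulation): the guard now returns early.
if mark_matches "$mark"; then
    echo "second pass: would masquerade again"
else
    echo "second pass: RETURN"
fi
```

Run as written, this prints "mark after first pass: 0x1" followed by "second pass: RETURN". The unrelated low bit survives, and only the masquerade bit is cleared.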

How to fix it

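To resolve this, change the KUBE-POSTROUTING logic so that the masquerade mark is consumed the first time it is used. First, add a guard at the top of the chain that returns immediately when the 0x4000 bit is absent. Second, instead of leaving the mark statically set, XOR it with 0x4000 just before the MASQUERADE rule. Because the guard guarantees the bit is set at that point, the XOR clears it. The mark is therefore gone when the packet re-enters the stack during VXLAN encapsulation, and the second pass hits the guard instead of being masqueraded again.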
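
One way to confirm the rule order on a node is to scan the saved nat table. The following is a minimal sketch, not an official tool: check_postrouting is a hypothetical helper, and it assumes that iptables-save may print the XOR rule either as --xor-mark 0x4000 or in the equivalent form --set-xmark 0x4000/0x0.

```shell
# check_postrouting: read iptables-save style text on stdin and report whether
# KUBE-POSTROUTING has the RETURN guard, then the mark-clearing rule,
# then MASQUERADE, in that order.
check_postrouting() {
    awk '
        /^-A KUBE-POSTROUTING/ {
            n++
            # Guard: return when the 0x4000 bit is not set
            if ($0 ~ /! --mark 0x4000\/0x4000 -j RETURN/) guard = n
            # Mark consumption, in either spelling
            if ($0 ~ /-j MARK --(xor-mark 0x4000|set-xmark 0x4000\/0x0)/) clear = n
            # First masquerade rule in the chain
            if ($0 ~ /-j MASQUERADE/ && !masq) masq = n
        }
        END {
            if (guard && clear && masq && guard < clear && clear < masq) {
                print "ok"
                exit 0
            }
            print "missing or misordered rules"
            exit 1
        }
    '
}

# Typical use on a node (requires root):
#   iptables-save -t nat | check_postrouting
```

The helper only inspects text, so it can also be run against a saved rules file without touching the live firewall.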