Fixing Kubelet Socket Exhaustion and TIME-WAIT Issues in Kubernetes

#Kubernetes #Kubelet #Networking #PerformanceTuning #SocketExhaustion #LinuxKernel

What's the problem?

On high-density nodes, the kubelet's high-frequency liveness and readiness probes can exhaust the node's socket resources. This post walks through why that happens and how to tune TCP connection handling for large-scale clusters.

Why does this happen?

Kubernetes liveness and readiness probes create short-lived TCP connections. Because the kubelet initiates and closes each connection, the closed socket sits on the node in TIME-WAIT for 60 seconds (the Linux kernel's fixed TIME-WAIT interval). With the default Linux ephemeral port range of 32768-60999 (roughly 28,000 ports), a node sustaining only a few hundred probe connections per second can deplete its ephemeral ports and conntrack entries, leading to cascading network instability.

Code Example

// Configure SO_LINGER to 1s in your custom ProbeDialer
// (uses the net and syscall packages; Linux/Unix only).
type ProbeDialer struct{}

func (d *ProbeDialer) Dial(network, address string) (net.Conn, error) {
    dialer := &net.Dialer{
        Control: func(network, address string, c syscall.RawConn) error {
            var sockErr error
            err := c.Control(func(fd uintptr) {
                // Set SO_LINGER to 1 second to force prompt socket cleanup;
                // the error is captured so the dial fails loudly if it can't be set.
                sockErr = syscall.SetsockoptLinger(int(fd), syscall.SOL_SOCKET,
                    syscall.SO_LINGER, &syscall.Linger{Onoff: 1, Linger: 1})
            })
            if err != nil {
                return err
            }
            return sockErr
        },
    }
    return dialer.Dial(network, address)
}

How to fix it

To resolve this, implement a custom ProbeDialer that sets the SO_LINGER socket option on every probe connection, overriding the kernel's default teardown behavior. With a linger timeout of 1 second, the kernel reclaims the socket about a second after the probe connection is closed, rather than holding it in TIME-WAIT for the full 60 seconds. Using 1 second instead of 0 also lets pending data be transmitted and the connection close cleanly with a FIN, so the fix avoids abortive RST packets.