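Symptoms
A container fails on startup with a ZooKeeper connection timeout and keeps restarting and retrying.
Only one node shows this behavior; the other nodes are fine.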
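Initial investigation
The zk pod IP and the zk Service are listed below. From the faulty node, I tried telnet to port 2181 on both IPs: the pod IP is reachable, but the Service IP is not. That points to a NAT problem on the node and largely rules out the underlying network. Restarting kube-proxy had no effect; its startup log shows nothing abnormal, and the zk container itself is running normally.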
test-zk-c477c89f6-n4px8 1/1 Running 0 17h 172.31.0.210 k8s-dev-node10 <none> <none>
zk-host ClusterIP 192.168.10.26
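Checking NAT and the actual connections
Use conntrack to see how the TCP connections are being translated. This confirms the NAT problem: in the entries below, 192.168.10.26 is mapped to the wrong IP, 172.31.0.240.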
root@k8s-dev-node2:~# conntrack -L|grep 192.168.10.26
tcp 6 3 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=36950 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=36950 mark=0 use=1
tcp 6 15 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37000 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37000 mark=0 use=1
tcp 6 76 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37244 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37244 mark=0 use=1
tcp 6 88 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37294 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37294 mark=0 use=1
tcp 6 100 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37340 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37340 mark=0 use=1
tcp 6 58 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37168 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37168 mark=0 use=1
tcp 6 33 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37076 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37076 mark=0 use=1
tcp 6 52 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37156 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37156 mark=0 use=1
tcp 6 70 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37218 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37218 mark=0 use=1
tcp 6 64 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37194 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37194 mark=0 use=1
tcp 6 113 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37400 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37400 mark=0 use=1
tcp 6 21 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37036 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37036 mark=0 use=1
tcp 6 119 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37422 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37422 mark=0 use=1
tcp 6 94 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37318 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37318 mark=0 use=1
tcp 6 39 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37096 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37096 mark=0 use=1
tcp 6 46 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37116 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37116 mark=0 use=1
tcp 6 82 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37274 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37274 mark=0 use=1
tcp 6 107 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37372 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37372 mark=0 use=1
tcp 6 9 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=36978 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=36978 mark=0 use=1
tcp 6 27 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37048 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37048 mark=0 use=1
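With this many entries it can help to collapse the dump into one line per state and backend. The following is a minimal sketch, not part of the original troubleshooting session: `summarize_conntrack` is a hypothetical helper name, and it assumes the `conntrack -L` line layout seen in this post, where the second `src=` field belongs to the reply direction, i.e. the address the ClusterIP was DNATed to.

```shell
#!/bin/sh
# Hypothetical helper: summarize conntrack entries for one ClusterIP.
# Reads `conntrack -L` output on stdin; $1 is the Service ClusterIP.
# Prints "<count> <tcp state> <reply-side source IP>" per combination.
summarize_conntrack() {
    awk -v vip="$1" '
        $1 == "tcp" {
            orig_dst = ""; reply_src = ""; n = 0
            for (i = 1; i <= NF; i++) {
                # First dst= is the address the client dialed.
                if ($i ~ /^dst=/ && orig_dst == "") orig_dst = substr($i, 5)
                # Second src= is the backend chosen by DNAT.
                if ($i ~ /^src=/ && ++n == 2) reply_src = substr($i, 5)
            }
            if (orig_dst == vip) count[$4 " " reply_src]++
        }
        END { for (k in count) print count[k], k }
    ' | sort
}

# Demo on two entries taken from the conntrack output in this post.
printf '%s\n' \
    'tcp 6 3 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=36950 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=36950 mark=0 use=1' \
    'tcp 6 15 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37000 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37000 mark=0 use=1' \
    | summarize_conntrack 192.168.10.26
```

On a live node the input would come from `conntrack -L 2>/dev/null | summarize_conntrack 192.168.10.26`. A backend address that matches no running pod stands out immediately in the summary.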
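Next, look at the rules in the iptables nat table. With kube-proxy in its default iptables mode, a Kubernetes Service relies on iptables to forward packets. Start with the matching entries in the KUBE-SERVICES chain: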
25 1500 KUBE-MARK-MASQ tcp -- * * !172.31.0.0/24 192.168.10.26 /* bxr-test/wbyb-zk-host: cluster IP */ tcp dpt:2181
463K 28M KUBE-SVC-R4ZGJCMYHQKGCXVG tcp -- * * 0.0.0.0/0 192.168.10.26 /* bxr-test/wbyb-zk-host: cluster IP */ tcp dpt:2181
Then look at the KUBE-SVC-R4ZGJCMYHQKGCXVG chain:
Chain KUBE-SVC-R4ZGJCMYHQKGCXVG (1 references)
pkts bytes target prot opt in out source destination
463K 28M KUBE-SEP-SMSRJE57UUXOBS5Q all -- * * 0.0.0.0/0 0.0.0.0/0
Traffic then jumps to the KUBE-SEP-SMSRJE57UUXOBS5Q chain, which is where the actual forwarding (DNAT) rule lives:
Chain KUBE-SEP-SMSRJE57UUXOBS5Q (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
463K 28M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181
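A simpler approach is to filter the rules directly for the wrong IP 172.31.0.240, which conntrack has already identified; there may be more than one matching rule. The faulty rules are: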
Chain KUBE-SEP-SMSRJE57UUXOBS5Q (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
463K 28M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181

Chain KUBE-SEP-T2ZAVTI5IVHDQLMH (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
893K 54M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181
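Filtering for the stale IP can also be scripted. The sketch below is an illustration under stated assumptions rather than the command used in the original session: `find_dnat_chains` is a hypothetical helper, it expects rules in `iptables-save` syntax (not the `-L` listing format shown above), it handles IPv4 targets only, and the sample rules in the demo are hand-written to mirror the two KUBE-SEP chains from this post.

```shell
#!/bin/sh
# Hypothetical helper: list chains whose DNAT rule targets a given pod IP.
# Reads iptables-save style rules on stdin; $1 is the IPv4 address to find.
# Prints "<chain> <ip:port>" for every matching DNAT rule.
find_dnat_chains() {
    awk -v ip="$1" '
        $1 == "-A" {
            for (i = 3; i < NF; i++) {
                if ($i == "--to-destination") {
                    # Split "ip:port" and compare only the address part.
                    split($(i + 1), dest, ":")
                    if (dest[1] == ip) print $2, $(i + 1)
                }
            }
        }
    '
}

# Demo: hand-written sample rules in iptables-save syntax.
printf '%s\n' \
    '-A KUBE-SEP-SMSRJE57UUXOBS5Q -s 172.31.0.240/32 -j KUBE-MARK-MASQ' \
    '-A KUBE-SEP-SMSRJE57UUXOBS5Q -p tcp -m tcp -j DNAT --to-destination 172.31.0.240:2181' \
    '-A KUBE-SEP-T2ZAVTI5IVHDQLMH -p tcp -m tcp -j DNAT --to-destination 172.31.0.240:2181' \
    | find_dnat_chains 172.31.0.240
```

On a node that uses the legacy backend, as this one appears to, the input would likely come from `iptables-legacy-save -t nat | find_dnat_chains 172.31.0.240`; on an nft-backed node, `iptables-save -t nat` is the likely equivalent.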
Fixing the NAT rules
The fix is to replace the IP in these rules with the correct address:
iptables-legacy -t nat -R KUBE-SEP-T2ZAVTI5IVHDQLMH 1 -s 172.31.0.210 -d 0.0.0.0/0 -j KUBE-MARK-MASQ
iptables-legacy -t nat -R KUBE-SEP-T2ZAVTI5IVHDQLMH 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination 172.31.0.210:2181
iptables-legacy -t nat -R KUBE-SEP-SMSRJE57UUXOBS5Q 1 -s 172.31.0.210 -d 0.0.0.0/0 -j KUBE-MARK-MASQ
iptables-legacy -t nat -R KUBE-SEP-SMSRJE57UUXOBS5Q 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination 172.31.0.210:2181
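Because the same two-rule replacement repeats for every affected chain, the commands can be generated and reviewed before anything is run. This is a minimal sketch, assuming each KUBE-SEP chain holds exactly two rules in the order seen in this post (rule 1 the KUBE-MARK-MASQ match, rule 2 the DNAT). `print_sep_fix` is a hypothetical helper; it only prints commands and never executes them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the two replace commands
# for one KUBE-SEP chain.
# $1 = chain name, $2 = correct pod IP, $3 = pod port.
print_sep_fix() {
    sep_chain=$1; pod_ip=$2; pod_port=$3
    printf '%s\n' \
        "iptables-legacy -t nat -R $sep_chain 1 -s $pod_ip -d 0.0.0.0/0 -j KUBE-MARK-MASQ" \
        "iptables-legacy -t nat -R $sep_chain 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination $pod_ip:$pod_port"
}

# Dry run for the two chains identified in this post.
for c in KUBE-SEP-T2ZAVTI5IVHDQLMH KUBE-SEP-SMSRJE57UUXOBS5Q; do
    print_sep_fix "$c" 172.31.0.210 2181
done
```

Printing first keeps the destructive step explicit: the output can be compared against the chain listing and then pasted into a root shell once the rule numbers are confirmed.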
Verifying the fix
telnet now connects:
root@k8s-dev-node2:~# telnet 192.168.10.26 2181
Trying 192.168.10.26...
Connected to 192.168.10.26.
Escape character is '^]'.
^CConnection closed by foreign host.
conntrack also looks right: the tracked TCP connections are now ESTABLISHED and map to 172.31.0.210.
root@k8s-dev-node2:~# conntrack -L|grep 192.168.10.26
tcp 6 86399 ESTABLISHED src=172.31.0.189 dst=192.168.10.26 sport=40254 dport=2181 src=172.31.0.210 dst=172.31.0.189 sport=2181 dport=40254 [ASSURED] mark=0 use=1
tcp 6 86399 ESTABLISHED src=172.31.0.169 dst=192.168.10.26 sport=40804 dport=2181 src=172.31.0.210 dst=172.31.0.169 sport=2181 dport=40804 [ASSURED] mark=0 use=1
conntrack v1.4.5 (conntrack-tools): 378 flow entries have been shown.