Contents
  1. Symptoms
  2. Initial Troubleshooting
  3. Inspecting NAT and the Actual Connections
  4. Fixing the NAT Rules
  5. Verifying the Fix
    5.1. telnet succeeds
    5.2. conntrack looks correct

Symptoms

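  1. The container fails at startup with a ZooKeeper connection timeout, then keeps restarting and retrying.
  2. Only one node is affected; all other nodes work normally.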

Initial Troubleshooting

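The zk pod IP and the zk Service are listed below. Running telnet from the faulty node against port 2181 on both addresses showed that the pod port was reachable but the Service port was not. This pointed to a problem with NAT on the node and largely ruled out the underlying network connectivity.
Restarting kube-proxy had no effect. Its startup log showed nothing abnormal, and the zk container itself was running fine.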

test-zk-c477c89f6-n4px8                    1/1     Running   0          17h    172.31.0.210   k8s-dev-node10   <none>           <none>

zk-host          ClusterIP      192.168.10.26  
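
The pod-versus-Service comparison described above can be scripted. The following is a minimal sketch, assuming bash and the coreutils `timeout` command are available on the node. The helper name `probe` and the 3-second timeout are illustrative choices, not part of the original procedure.

```shell
# probe HOST PORT
# Prints "open" if a TCP connection succeeds within 3 seconds, else "closed".
# Uses bash's /dev/tcp pseudo-device, so it needs bash but not telnet or nc.
probe() {
  local host=$1 port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Example (run on the faulty node):
#   probe 172.31.0.210 2181     # pod IP
#   probe 192.168.10.26 2181    # Service ClusterIP
```

If the pod IP reports open while the ClusterIP reports closed, the node's Service NAT is the more likely culprit than the pod network.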

Inspecting NAT and the Actual Connections

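Use conntrack to see how the TCP connections are being translated. This confirms the NAT problem: in the entries below, connections to 192.168.10.26 are mapped to the wrong IP, 172.31.0.240, rather than the pod IP 172.31.0.210, and every one of them is stuck in SYN_SENT with no reply.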

root@k8s-dev-node2:~# conntrack -L|grep 192.168.10.26
tcp 6 3 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=36950 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=36950 mark=0 use=1
tcp 6 15 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37000 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37000 mark=0 use=1
tcp 6 76 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37244 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37244 mark=0 use=1
tcp 6 88 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37294 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37294 mark=0 use=1
tcp 6 100 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37340 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37340 mark=0 use=1
tcp 6 58 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37168 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37168 mark=0 use=1
tcp 6 33 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37076 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37076 mark=0 use=1
tcp 6 52 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37156 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37156 mark=0 use=1
tcp 6 70 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37218 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37218 mark=0 use=1
tcp 6 64 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37194 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37194 mark=0 use=1
tcp 6 113 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37400 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37400 mark=0 use=1
tcp 6 21 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37036 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37036 mark=0 use=1
tcp 6 119 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37422 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37422 mark=0 use=1
tcp 6 94 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37318 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37318 mark=0 use=1
tcp 6 39 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37096 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37096 mark=0 use=1
tcp 6 46 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37116 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37116 mark=0 use=1
tcp 6 82 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37274 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37274 mark=0 use=1
tcp 6 107 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37372 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37372 mark=0 use=1
tcp 6 9 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=36978 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=36978 mark=0 use=1
tcp 6 27 SYN_SENT src=172.31.0.169 dst=192.168.10.26 sport=37048 dport=2181 [UNREPLIED] src=172.31.0.240 dst=172.31.0.169 sport=2181 dport=37048 mark=0 use=1
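
With this many entries it can help to collapse them into one line per mapping. The sketch below is one way to do that with awk, assuming the tcp line layout shown above (state in the fourth field, the original tuple first, the reply tuple second). The function name `summarize_dnat` is made up for illustration.

```shell
# summarize_dnat
# Reads `conntrack -L` output on stdin and prints, for tcp entries,
#   <original dst> -> <reply src> <state> <count>
# The reply src is the address the connection was actually DNATed to.
summarize_dnat() {
  awk '
    $1 == "tcp" {
      state = $4; n = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^dst=/ && n == 0) {
          dst = substr($i, 5); n = 1       # first dst= is the original destination
        } else if ($i ~ /^src=/ && n == 1) {
          back = substr($i, 5); n = 2      # next src= belongs to the reply tuple
        }
      }
      if (n == 2) count[dst " -> " back " " state]++
    }
    END { for (k in count) print k, count[k] }
  '
}

# Example:
#   conntrack -L 2>/dev/null | grep 192.168.10.26 | summarize_dnat
```

For the output above this would collapse the twenty entries into a single line showing 192.168.10.26 mapped to 172.31.0.240 in state SYN_SENT.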

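The next step is to look at the rules in the iptables nat table, because Kubernetes Services rely on iptables to forward packets by default. Start with the entries for this Service in the KUBE-SERVICES chain: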

  25  1500 KUBE-MARK-MASQ  tcp  --  *      *      !172.31.0.0/24        192.168.10.26        /* bxr-test/wbyb-zk-host: cluster IP */ tcp dpt:2181
463K 28M KUBE-SVC-R4ZGJCMYHQKGCXVG tcp -- * * 0.0.0.0/0 192.168.10.26 /* bxr-test/wbyb-zk-host: cluster IP */ tcp dpt:2181

Then look at the KUBE-SVC-R4ZGJCMYHQKGCXVG chain:

Chain KUBE-SVC-R4ZGJCMYHQKGCXVG (1 references)
pkts bytes target prot opt in out source destination
463K 28M KUBE-SEP-SMSRJE57UUXOBS5Q all -- * * 0.0.0.0/0 0.0.0.0/0

Traffic is handed on to the KUBE-SEP-SMSRJE57UUXOBS5Q chain, and this is where the actual DNAT rule lives:

Chain KUBE-SEP-SMSRJE57UUXOBS5Q (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
463K 28M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181

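A quicker route is to filter the rules directly on the wrong IP, 172.31.0.240, which conntrack has already revealed. More than one rule may reference it.
The offending rules are: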

Chain KUBE-SEP-SMSRJE57UUXOBS5Q (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
463K 28M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181

Chain KUBE-SEP-T2ZAVTI5IVHDQLMH (1 references)
pkts bytes target prot opt in out source destination
0 0 KUBE-MARK-MASQ all -- * * 172.31.0.240 0.0.0.0/0
893K 54M DNAT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp to:172.31.0.240:2181
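
The filtering step mentioned above could look something like the following. This is a sketch, assuming the rules are dumped with `iptables-legacy-save -t nat` (matching the `iptables-legacy` binary used in the fix) and that each rule appears as an `-A <chain> ...` line. The function name `chains_with_ip` is hypothetical.

```shell
# chains_with_ip IP
# Reads iptables-save style rules on stdin and prints the name of every chain
# that has a rule referencing IP, either as a source ("-s IP" or "-s IP/32")
# or as a DNAT target ("--to-destination IP:port").
chains_with_ip() {
  local ip=$1
  awk -v ip="$ip" '
    $1 == "-A" {
      for (i = 3; i <= NF; i++) {
        split($i, parts, "[/:]")    # drop a "/32" mask or ":port" suffix
        if (parts[1] == ip) { print $2; break }
      }
    }
  ' | sort -u
}

# Example:
#   iptables-legacy-save -t nat | chains_with_ip 172.31.0.240
```

Comparing whole tokens rather than grepping for a substring avoids false matches on addresses that merely share a prefix with the stale IP.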

Fixing the NAT Rules

The fix is to replace the IP in these rules with the correct address, 172.31.0.210, the current IP of the zk pod:

iptables-legacy -t nat -R KUBE-SEP-T2ZAVTI5IVHDQLMH 1 -s 172.31.0.210 -d 0.0.0.0/0 -j KUBE-MARK-MASQ
iptables-legacy -t nat -R KUBE-SEP-T2ZAVTI5IVHDQLMH 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination 172.31.0.210:2181
iptables-legacy -t nat -R KUBE-SEP-SMSRJE57UUXOBS5Q 1 -s 172.31.0.210 -d 0.0.0.0/0 -j KUBE-MARK-MASQ
iptables-legacy -t nat -R KUBE-SEP-SMSRJE57UUXOBS5Q 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination 172.31.0.210:2181
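
When several chains are affected, the replace commands can be generated rather than typed by hand. The sketch below only prints the commands so they can be reviewed before anything is executed. It assumes, as in the listings above, that rule 1 of each KUBE-SEP chain is the KUBE-MARK-MASQ rule and rule 2 is the DNAT rule. The function name `print_fix_cmds` is made up for illustration.

```shell
# print_fix_cmds CHAIN POD_IP PORT
# Prints (does not run) the two `iptables-legacy -R` commands that point
# CHAIN at POD_IP:PORT.
print_fix_cmds() {
  local chain=$1 ip=$2 port=$3
  # Rule 1: mark traffic that originates from the pod itself for masquerading.
  echo "iptables-legacy -t nat -R ${chain} 1 -s ${ip} -d 0.0.0.0/0 -j KUBE-MARK-MASQ"
  # Rule 2: DNAT Service traffic to the pod.
  echo "iptables-legacy -t nat -R ${chain} 2 -s 0.0.0.0/0 -d 0.0.0.0/0 -p tcp -j DNAT --to-destination ${ip}:${port}"
}

# Example:
#   for c in KUBE-SEP-T2ZAVTI5IVHDQLMH KUBE-SEP-SMSRJE57UUXOBS5Q; do
#     print_fix_cmds "$c" 172.31.0.210 2181
#   done
```

Run with the two chains above, this prints the same four commands that were applied manually.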

Verifying the Fix

telnet succeeds

root@k8s-dev-node2:~# telnet 192.168.10.26 2181
Trying 192.168.10.26...
Connected to 192.168.10.26.
Escape character is '^]'.
^CConnection closed by foreign host.

conntrack looks correct

The tracked TCP connections are now ESTABLISHED and translated to the correct pod IP, 172.31.0.210:

root@k8s-dev-node2:~# conntrack -L|grep 192.168.10.26
tcp 6 86399 ESTABLISHED src=172.31.0.189 dst=192.168.10.26 sport=40254 dport=2181 src=172.31.0.210 dst=172.31.0.189 sport=2181 dport=40254 [ASSURED] mark=0 use=1
tcp 6 86399 ESTABLISHED src=172.31.0.169 dst=192.168.10.26 sport=40804 dport=2181 src=172.31.0.210 dst=172.31.0.169 sport=2181 dport=40804 [ASSURED] mark=0 use=1
conntrack v1.4.5 (conntrack-tools): 378 flow entries have been shown.
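
The same check can be turned into a pass/fail test. The following sketch assumes the tcp line layout shown above, where the sixth field is the original `dst=` and the second `src=` on the line belongs to the reply tuple. The function name `all_dnat_to` is hypothetical.

```shell
# all_dnat_to SERVICE_IP POD_IP
# Reads `conntrack -L` output on stdin. Succeeds (exit 0) only if at least one
# tcp entry targets SERVICE_IP and every such entry is translated to POD_IP.
all_dnat_to() {
  local svc_ip=$1 pod_ip=$2
  awk -v svc="dst=$svc_ip" -v pod="src=$pod_ip" '
    $1 == "tcp" && $6 == svc {
      seen = 1
      n = 0
      for (i = 1; i <= NF; i++) {
        # The second src= field is the reply source, i.e. the DNAT target.
        if ($i ~ /^src=/ && ++n == 2 && $i != pod) bad = 1
      }
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Example:
#   conntrack -L 2>/dev/null | all_dnat_to 192.168.10.26 172.31.0.210 && echo OK
```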