Why can't my Docker containers communicate with each other?

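I created a small project to test Docker clustering. Basically, the cluster.sh script launches three identical containers and uses pipework to configure a bridge (bridge1) on the host and to add a NIC (eth1) to each container.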

If I log into one of the containers, I can arping the other containers:

    # 172.17.99.1
    root@d01eb56fce52:/# arping 172.17.99.2
    ARPING 172.17.99.2
    42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=0 time=1.001 sec
    42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=1 time=1.001 sec
    42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=2 time=1.001 sec
    42 bytes from aa:b3:98:92:0b:08 (172.17.99.2): index=3 time=1.001 sec
    ^C
    --- 172.17.99.2 statistics ---
    5 packets transmitted, 4 packets received, 20% unanswered (0 extra)

So it seems that packets can pass through bridge1.

The problem is that I can't ping the other containers, and I can't send any IP packets to them with tools such as telnet or netcat either.

In contrast, the docker0 bridge and the eth0 NIC work correctly in all containers.

Here is my route table:

    # 172.17.99.1
    root@d01eb56fce52:/# ip route
    default via 172.17.42.1 dev eth0
    172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.17
    172.17.99.0/24 dev eth1  proto kernel  scope link  src 172.17.99.1

and the bridge configuration:

    # host
    $ brctl show
    bridge name     bridge id               STP enabled     interfaces
    bridge1         8000.8a6b21e27ae6       no              veth1pl25432
                                                            veth1pl25587
                                                            veth1pl25753
    docker0         8000.56847afe9799       no              veth7c87801
                                                            veth953a086
                                                            vethe575fe2

    # host
    $ brctl showmacs bridge1
    port no mac addr                is local?       ageing timer
      1     8a:6b:21:e2:7a:e6       yes                0.00
      2     8a:a3:b8:90:f3:52       yes                0.00
      3     f6:0c:c4:3d:f5:b2       yes                0.00

    # host
    $ ifconfig
    bridge1   Link encap:Ethernet  HWaddr 8a:6b:21:e2:7a:e6
              inet6 addr: fe80::48e9:e3ff:fedb:a1b6/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:163 errors:0 dropped:0 overruns:0 frame:0
              TX packets:68 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:8844 (8.8 KB)  TX bytes:12833 (12.8 KB)

    # I'm showing only one veth here for simplicity
    veth1pl25432 Link encap:Ethernet  HWaddr 8a:6b:21:e2:7a:e6
              inet6 addr: fe80::886b:21ff:fee2:7ae6/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:155 errors:0 dropped:0 overruns:0 frame:0
              TX packets:162 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:12366 (12.3 KB)  TX bytes:23180 (23.1 KB)
    ...

and the iptables FORWARD chain:

    # host
    $ sudo iptables -x -v --line-numbers -L FORWARD
    Chain FORWARD (policy ACCEPT 10675 packets, 640500 bytes)
    num    pkts     bytes  target  prot  opt  in       out       source    destination
    1     15018  22400195  DOCKER  all   --   any      docker0   anywhere  anywhere
    2     15007  22399271  ACCEPT  all   --   any      docker0   anywhere  anywhere     ctstate RELATED,ESTABLISHED
    3      8160    445331  ACCEPT  all   --   docker0  !docker0  anywhere  anywhere
    4        11       924  ACCEPT  all   --   docker0  docker0   anywhere  anywhere
    5        56      4704  ACCEPT  all   --   bridge1  bridge1   anywhere  anywhere

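Note that the pkts count for rule 5 is not 0, which means the pings have been routed correctly (the FORWARD chain is evaluated after the routing decision, right?), but somehow they never reach the destination.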

I can't figure out why docker0 and bridge1 behave differently. Any suggestions?

Update 1

Here is the tcpdump output on the target container, captured while pinging it from another container.

    $ tcpdump -i eth1
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on eth1, link-type EN10MB (Ethernet), capture size 65535 bytes
    22:11:17.754261 IP 192.168.1.65 > 172.17.99.1: ICMP echo request, id 26443, seq 1, length 6

Note that the source IP is 192.168.1.65, which is the address of the host's eth0, so it seems that some SNAT is happening on the bridge.
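Once the source is rewritten to 192.168.1.65, the target addresses its reply to the host rather than to the peer on bridge1. The following is a minimal sketch, not the kernel's actual implementation, of a longest-prefix lookup over the route table shown earlier for the 172.17.99.1 container. The `ROUTES` list and the `lookup` helper are illustrative names introduced here.

```python
import ipaddress

# Route table of the 172.17.99.1 container, copied from the `ip route` output above.
ROUTES = [
    ("0.0.0.0/0", "eth0"),       # default via 172.17.42.1
    ("172.17.0.0/16", "eth0"),
    ("172.17.99.0/24", "eth1"),
]

def lookup(dst):
    """Return the device of the most specific route that contains dst."""
    addr = ipaddress.ip_address(dst)
    matches = [
        (ipaddress.ip_network(prefix), dev)
        for prefix, dev in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    # The longest prefix wins, mirroring ordinary route selection.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("172.17.99.2"))   # eth1: a reply to the real peer would stay on bridge1
print(lookup("192.168.1.65"))  # eth0: a reply to the rewritten source takes the default route
```

Under this model, a reply to the rewritten source goes out through eth0 and the default gateway instead of back over eth1.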

Finally, printing out the iptables nat table revealed the cause of the problem:

    $ sudo iptables -L -t nat
    ...
    Chain POSTROUTING (policy ACCEPT)
    target      prot  opt  source         destination
    MASQUERADE  all   --   172.17.0.0/16  anywhere
    ...

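Because the IP of my container's eth0 is in 172.17.0.0/16, the source IP of the packets it sends gets rewritten. That is why the replies to the ping cannot get back to the source.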
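The eth1 addresses assigned through pipework (172.17.99.x) fall inside that same 172.17.0.0/16 range, which would explain why traffic on bridge1 is caught as well. Here is a small sketch that checks which of the addresses seen in this post match the rule's source column. It models only the source match visible in the listing above, not any other conditions the real rule may carry, and `matches_masquerade` is a hypothetical helper name.

```python
import ipaddress

# Source range of the MASQUERADE rule, as printed by `iptables -L -t nat`.
MASQ_SOURCE = ipaddress.ip_network("172.17.0.0/16")

def matches_masquerade(src):
    """True if src falls inside the rule's source range (source column only)."""
    return ipaddress.ip_address(src) in MASQ_SOURCE

for src in ("172.17.0.17", "172.17.99.1", "172.17.99.2", "192.168.1.65"):
    print(src, matches_masquerade(src))
```

Only 192.168.1.65, the host's own address, falls outside the range. Both the eth0 and the eth1 addresses of the containers match.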

Conclusion

The solution is to change the IP of the container's eth0 to a network different from that of the default docker0.
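Whichever side gets renumbered, the goal is that the bridge1 subnet and the masqueraded docker0 range no longer overlap. A quick way to sanity-check a candidate subnet before applying it is sketched below. The `10.99.0.0/24` value is purely a hypothetical example and `is_safe_subnet` is an illustrative helper, not part of Docker or pipework.

```python
import ipaddress

# Range covered by the MASQUERADE rule on the default docker0 setup in this post.
DOCKER0_NET = ipaddress.ip_network("172.17.0.0/16")

def is_safe_subnet(cidr, masqueraded=DOCKER0_NET):
    """True if the candidate subnet shares no addresses with the masqueraded range."""
    return not ipaddress.ip_network(cidr).overlaps(masqueraded)

print(is_safe_subnet("172.17.99.0/24"))  # the subnet used on bridge1 in this post
print(is_safe_subnet("10.99.0.0/24"))    # hypothetical alternative subnet
```

The first check fails because 172.17.99.0/24 sits inside 172.17.0.0/16, while the hypothetical alternative passes.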
