TCP: recv function hangs despite KEEPALIVE

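Does TCP keepalive (with small timeouts) prevent a client from hanging in recv after the server dies?

Scenario:

The server and client run on different machines.

  1. The client connects to the server over TCP with the KEEPALIVE option set
  2. The client sends "Hello server" and waits for a response
  3. The server receives "Hello server" and responds with "Hello client"
  4. The client receives the response, sleeps for 10 seconds, and repeats steps 2-4 (step 1 is now skipped because the connection is kept open)

While the client is sleeping, the server is shut down. Now:

  1. The client wakes up
  2. It sends "Hello server" and waits for a response
  3. After 20 minutes recv gives up. I expected KEEPALIVE to break out of the recv call after 45 seconds

Setting the KEEPALIVE options: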

    void TCPclient::setkeepalive()
    {
        int optval;
        socklen_t optlen = sizeof(optval);

        /* Check the status for the keepalive option */
        if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) < 0) {
            throw std::string("getsockopt");
        }

        /* Enable keepalive on the socket */
        optval = 1;
        optlen = sizeof(optval);
        if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
            close(sock);
            exit(EXIT_FAILURE);
        }

        /* Give up after 2 unanswered probes */
        optval = 2;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPCNT, &optval, optlen) < 0) {
            throw std::string("setsockopt");
        }

        /* Send the first probe after 15 seconds of idle time */
        optval = 15;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPIDLE, &optval, optlen) < 0) {
            throw std::string("setsockopt");
        }

        /* Wait 15 seconds between probes */
        optval = 15;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPINTVL, &optval, optlen) < 0) {
            throw std::string("setsockopt");
        }
    }

linux 3.2.0-84-generic

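Keepalive only becomes active after the line has been idle for 15 seconds. In your case the keepalive idle time is 15 seconds and the sleep is 10 seconds, which means "Hello server" will be the next thing sent after the server is killed.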
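The arithmetic behind these numbers can be sketched as follows. This is a simplified model rather than the kernel's implementation, and both function names are made up for illustration. It assumes that each request/response exchange restarts the idle timer.

```c
/* Hypothetical helper: seconds until keepalive gives up on a silent peer
 * when the connection is otherwise idle. One idle period passes first,
 * then `count` unanswered probes are spaced `interval` seconds apart. */
int keepalive_detection_seconds(int idle, int interval, int count)
{
    return idle + interval * count;
}

/* Hypothetical helper, simplified model: probes are only sent once the
 * connection has been idle for `idle` seconds, so a client that exchanges
 * data every `send_period` seconds never reaches that point when
 * send_period < idle. Returns 1 if a probe can be sent, 0 otherwise. */
int keepalive_can_fire(int idle, int send_period)
{
    return send_period >= idle;
}
```

With the settings from the question, `keepalive_detection_seconds(15, 15, 2)` gives the 45 seconds the asker expected, while `keepalive_can_fire(15, 10)` is 0: with a 10-second sleep the idle time is never reached.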

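Linux will then try to retransmit the message several times. Keepalive still won't be triggered. The connection is only dropped once the retries reach their limit, which takes 10-30 minutes.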
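If the goal is to have the blocked recv fail after roughly 45 seconds, one Linux option that the answers here do not mention is the `TCP_USER_TIMEOUT` socket option. It limits how long transmitted data may stay unacknowledged before the kernel drops the connection. The sketch below is an assumption about how it could be applied to this client, not something taken from the thread, and the helper name is made up.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (Linux-specific): ask the kernel to abort the
 * connection when sent data stays unacknowledged for more than
 * timeout_ms milliseconds.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_user_timeout(int sock, unsigned int timeout_ms)
{
    return setsockopt(sock, IPPROTO_TCP, TCP_USER_TIMEOUT,
                      &timeout_ms, sizeof(timeout_ms));
}
```

Calling `set_user_timeout(sock, 45000)` alongside the keepalive options should make a pending recv return an error (typically ETIMEDOUT) once the unanswered "Hello server" has gone unacknowledged for about 45 seconds, instead of waiting for the retransmission limit.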

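@MMA's answer is correct. I wrote a similar client that waits 20 seconds before writing. Once the client wakes up and sends its message, the ACK probes sent by keepalive are no longer sent (the connection is no longer idle).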

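After 15 retries (configured by tcp_retries2 in /proc/sys/net/ipv4) of sending the TCP segment, with the timeout growing exponentially up to 2 minutes (in my case), the connection is put into an error state and the pending read or recv returns ETIMEDOUT (errno 110). In my case this took about 15 minutes. The time depends on the RTO. Looking at the tcpdump, after the three-way handshake there are two ACKs (I don't know why these two ACKs come first), followed by 15 messages carrying data with the push flag set.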
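A rough way to see where a figure of this order comes from is to add up the waits between transmissions. The sketch below is a simplified model, not the kernel's actual give-up logic, and the function name is made up. It assumes a first wait of about 0.55 seconds (the gap between the first two data segments in the capture), doubling on every attempt, and capped at 120 seconds.

```c
/* Hypothetical helper: total time spent waiting across `waits` successive
 * transmissions. The wait starts at initial_rto seconds, doubles after each
 * transmission, and is capped at rto_max seconds. */
double total_backoff_seconds(double initial_rto, int waits, double rto_max)
{
    double total = 0.0;
    double rto = initial_rto;

    for (int i = 0; i < waits; i++) {
        total += rto;          /* wait for an ACK that never arrives */
        rto *= 2.0;            /* exponential backoff */
        if (rto > rto_max)
            rto = rto_max;     /* the backoff stops growing at the cap */
    }
    return total;
}
```

Under these assumptions, `total_backoff_seconds(0.55, 15, 120.0)` comes to about 980 seconds, roughly 16 minutes, which is in line with the quarter of an hour observed here.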

    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on p2p1, link-type EN10MB (Ethernet), capture size 65535 bytes
    01:16:45.296179 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [S], seq 515423022, win 14600, options [mss 1460,sackOK,TS val 19212623 ecr 0,nop,wscale 7], length 0
    E..<.a@.@......d4.....'...........9............ .%)O........
    01:16:45.477983 IP ec2-52-7-150-140.compute-1.amazonaws.com.10221 > 192.168.2.100.60895: Flags [S.], seq 3672727778, ack 515423023, win 26847, options [mss 1436,sackOK,TS val 114765522 ecr 19212623,nop,wscale 7], length 0
    E..<..@.-...4......d'.....`..../..h............ .....%)O....
    01:16:45.478046 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [.], ack 1, win 115, options [nop,nop,TS val 19212805 ecr 114765522], length 0
    E..4.b@.@......d4.....'..../..`....s....... .%*.....
    01:17:00.512812 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [.], ack 1, win 115, options [nop,nop,TS val 19227840 ecr 114765522], length 0
    E..4.c@.@......d4.....'.......`....s....... .%d.....
    01:17:00.731160 IP ec2-52-7-150-140.compute-1.amazonaws.com.10221 > 192.168.2.100.60895: Flags [.], ack 1, win 210, options [nop,nop,TS val 114769336 ecr 19212805], length 0
    E..4N.@.-.r.4......d'.....`..../....M...... ..=..%*.
    01:17:05.478933 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19232806 ecr 114769336], length 14
    E..Bd@.@......d4.....'..../..`....s....... .%x&..=.Hello Word :).
    01:17:06.027768 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19233354 ecr 114769336], length 14
    E..Be@.@......d4.....'..../..`....s....... .%zJ..=.Hello Word :).
    01:17:07.120879 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19234448 ecr 114769336], length 14
    E..Bf@.@......d4.....'..../..`....s....... .%~...=.Hello Word :).
    01:17:09.312833 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19236640 ecr 114769336], length 14
    E..Bg@.@......d4.....'..../..`....s....... .%. ..=.Hello Word :).
    01:17:13.697663 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19241024 ecr 114769336], length 14
    E..Bh@.@......d4.....'..../..`....s....... .%.@..=.Hello Word :).
    01:17:22.466187 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19249793 ecr 114769336], length 14
    E..Bi@.@......d4.....'..../..`....s....... .%....=.Hello Word :).
    01:17:40.001653 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19267328 ecr 114769336], length 14
    E..Bj@.@......d4.....'..../..`....s....... .%....=.Hello Word :).
    01:18:15.074493 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19302401 ecr 114769336], length 14
    E..Bk@.@......d4.....'..../..`....s....... .&....=.Hello Word :).
    01:19:25.217799 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19372545 ecr 114769336], length 14
    E..Bl@.@......d4.....'..../..`....s....... .'....=.Hello Word :).
    01:21:25.537775 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 19492864 ecr 114769336], length 14
    E..Bm@.@......d4.....'..../..`....s....... .)p...=.Hello Word :).
    01:23:25.856854 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, 69336], length 14
    E..Bn@.@......d4.....'..../..`....s....... .+F...=.Hello Word :).
    01:25:26.176894 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, 69336], length 14
    E..Bo@.@......d4.....'..../..`....s....... .-....=.Hello Word :).
    01:27:26.497691 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, 69336], length 14
    E..Bp@.@......d4.....'..../..`....s....... ......=.Hello Word :).
    01:29:26.816905 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, 69336], length 14
    E..Bq@.@......d4.....'..../..`....s....... .0....=.Hello Word :).
    01:31:27.137013 IP 192.168.2.100.60895 > ec2-52-7-150-140.compute-1.amazonaws.com.10221: Flags [P.], seq 1:15, ack 1, win 115, options [nop,nop,TS val 20094464 ecr 114769336], length 14
    E..Br@.@......d4.....'..../..`....s....... .2....=.Hello Word :).

The client code I used:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <string.h>
    #include <errno.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <net/if.h>
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <netinet/tcp.h>

    #define DEST_PORT 10221
    #define ADDRLEN INET_ADDRSTRLEN

    int main(int argc, char** argv)
    {
        int sock;
        int bytesWritten;
        struct sockaddr_in their_addr;
        char buffer[] = "Hello Word :)";
        char addrstr[ADDRLEN + 1];

        if (argc != 2) {
            printf("ERROR - Number of args\n");
            return 10;
        }

        strncpy(addrstr, argv[1], ADDRLEN);
        addrstr[ADDRLEN] = '\0'; /* strncpy does not terminate on truncation */

        memset(&their_addr, 0, sizeof(their_addr));
        their_addr.sin_family = AF_INET;
        their_addr.sin_port = htons(DEST_PORT);
        if (inet_pton(AF_INET, addrstr, (void *)&their_addr.sin_addr) != 1) {
            printf("ERROR - Converting Address: %d\n", errno);
            return 2;
        }

        if ((sock = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
            printf("ERROR - Socket could not be open: %d\n", errno);
            return 1;
        }

        //// Copied option setting
        int optval;
        socklen_t optlen = sizeof(optval);

        /* Check the status for the keepalive option */
        if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) < 0) {
            printf("ERROR - SOL_SOCKET: %d\n", errno);
            return 19;
        }

        optval = 1;
        optlen = sizeof(optval);
        if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
            printf("ERROR - SOL_SOCKET-2: %d\n", errno);
            return 20;
        }

        optval = 2;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPCNT, &optval, optlen) < 0) {
            printf("ERROR - SOL_TCP: %d\n", errno);
            return 21;
        }

        optval = 15;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPIDLE, &optval, optlen) < 0) {
            printf("ERROR - SOL_TCP-2: %d\n", errno);
            return 22;
        }

        optval = 15;
        if (setsockopt(sock, SOL_TCP, TCP_KEEPINTVL, &optval, optlen) < 0) {
            printf("ERROR - SOL_TCP-3: %d\n", errno);
            return 23;
        }
        /////

        if (connect(sock, (const struct sockaddr *)&their_addr,
                    (socklen_t)sizeof(their_addr)) == -1) {
            printf("ERROR - Could not connect to destination: %d\n", errno);
            return 3;
        }

        /// Sleep 20 seconds
        sleep(20);

        printf("About to write\n");
        if ((bytesWritten = write(sock, (const void *)buffer, sizeof(buffer))) == -1) {
            printf("ERROR - Sending message: %d\n", errno);
            return 4;
        }
        printf("Message Sent to Address %s, Port: %d\n\n", addrstr, DEST_PORT);

        /* Blocks until data arrives or the connection is declared dead */
        int bytesRead;
        if ((bytesRead = read(sock, buffer, sizeof(buffer))) == -1) {
            printf("ERROR - Reading response: %d\n", errno);
            return 5;
        }

        close(sock);
        return 0;
    }

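I ran this test against a server hosted on AWS. To simulate the server disappearing without the client noticing, I had a public (Elastic) IP associated with the server and disassociated it right after the three-way handshake. I can't paste the server code, but it isn't relevant here.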

Please note that in this example keepalive stops because a message has been sent.