Why does select() sometimes time out while the client is busy receiving data?

I have written a simple client/server application to test the behavior of non-blocking sockets. Here is a brief outline of the server and the client:

    // On Linux, the server thread sends a file
    // to the client using a non-blocking socket.
    void *SendFileThread(void *param){
        CFile* theFile = (CFile*) param;
        int sockfd = theFile->GetSocket();
        set_non_blocking(sockfd);
        set_sock_sndbuf(sockfd, 1024 * 64); // set the send buffer to 64K

        // get the total packet count of the target file
        int PacketCount = theFile->GetFilePacketsCount();
        int CurrPacket = 0;
        while (CurrPacket < PacketCount){
            char buffer[512];
            int len = 0;

            // get packet data by packet no.
            GetPacketData(CurrPacket, buffer, len);

            // send_non_blocking_sock_data will loop and send
            // data into the buffer of sockfd until there is an error
            int ret = send_non_blocking_sock_data(sockfd, buffer, len);
            if (ret < 0 && errno == EAGAIN){
                continue;
            } else if (ret < 0 || ret == 0){
                break;
            } else {
                CurrPacket++;
            }
            ......
        }
        return NULL;
    }

    // On Windows, the client thread does something like the following
    // to receive the file data sent by the server via a blocking socket.
    void *RecvFileThread(void *param){
        int sockfd = (int) param; // blocking socket
        set_sock_rcvbuf(sockfd, 1024 * 256); // set the receive buffer to 256K
        while (1){
            struct timeval timeout;
            timeout.tv_sec = 1;
            timeout.tv_usec = 0;

            fd_set rds;
            FD_ZERO(&rds);
            FD_SET(sockfd, &rds);

            // actually, the first parameter of select() is
            // ignored on Windows, though on Linux this parameter
            // should be (maximum socket value + 1)
            int ret = select(sockfd + 1, &rds, NULL, NULL, &timeout);
            if (ret == 0){
                // log that the timer expired
                CLogger::log("RecvFileThread---Calling select() timeouts\n");
            } else if (ret > 0) {
                // log the amount of data received
                char buffer[1024 * 256];
                int len = recv(sockfd, buffer, sizeof(buffer), 0);
                // handle error
                process_tcp_data(buffer, len);
            } else {
                // handle the error and break
                break;
            }
        }
        return NULL;
    }

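To my surprise, sends in the server thread fail frequently because the socket buffer is full: for example, sending a 14 MB file reports 50,000 failures with errno = EAGAIN. From the logs, however, I also found dozens of timeouts during the transfer. The flow looks like this:

In the Nth loop, select() succeeds and 256K of data is read successfully.
In the (N + 1)th loop, select() fails with a timeout.
In the (N + 2)th loop, select() succeeds and 256K of data is read successfully.

Why are timeouts interleaved with successful receives? Can anyone explain this behavior?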

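[UPDATE]
1. Uploading a 14 MB file to the server takes only 8 seconds.
2. With the same file as in 1, the server takes nearly 30 seconds to send all the data to the client.
3. All sockets used by the client are blocking. All sockets used by the server are non-blocking.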

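Regarding #2, I think the timeouts are the reason it takes longer, and I would like to know why there are so many timeouts while the client is busy receiving data.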

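[UPDATE2]
Thanks for the comments from @Duck, @ebrob, @EJP and @ja_mesa. I will investigate further today and then update this post.
As for why I send 512 bytes per loop in the server thread: I found that the server thread sends data much faster than the client thread receives it. I am confused about why the timeouts happen in the client thread.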

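Consider this a long comment rather than an answer, but as several people have already noted, the network is orders of magnitude slower than your processor. The point of non-blocking I/O is that the difference is so large that you can use the waiting time to do real work instead of blocking. Here, you are just pounding on the elevator button and hoping it makes a difference.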

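I don't know how much of your code is real and how much was chopped up for posting, but in the server you don't account for (ret == 0), i.e. an orderly shutdown by the peer.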
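To make the return-value cases concrete, here is a minimal sketch for POSIX sockets. The names `io_status` and `recv_once` are hypothetical and do not come from the posted code; on Winsock the error check would use `WSAGetLastError()` and `WSAEWOULDBLOCK` instead of `errno`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical names for illustration; not part of the original post. */
enum io_status { IO_DATA, IO_PEER_CLOSED, IO_RETRY, IO_ERROR };

/* Reads once from fd and reports which case occurred.
 * On IO_DATA, *out_len holds the number of bytes read. */
enum io_status recv_once(int fd, char *buf, size_t cap, size_t *out_len)
{
    assert(buf != NULL && out_len != NULL && cap > 0);
    *out_len = 0;

    ssize_t n = recv(fd, buf, cap, 0);
    if (n > 0) {
        *out_len = (size_t)n;
        return IO_DATA;
    }
    if (n == 0)
        return IO_PEER_CLOSED;   /* orderly shutdown by the peer */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return IO_RETRY;         /* nothing to read right now; try again later */
    return IO_ERROR;             /* real failure; errno holds the reason */
}
```

Keeping the four outcomes separate means the caller cannot accidentally treat a closed connection as a transient condition and keep looping.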

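The select in the client is wrong. Again, I don't know whether this is sloppy editing, but if it isn't, then the number of arguments is wrong and, more importantly, the first argument, which should be the highest file descriptor being watched plus 1, is zero. Depending on the implementation of select, I wonder whether this effectively turns select into a fancy sleep statement.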
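For reference, a well-formed call on a POSIX system might look like the sketch below. The helpers `wait_readable` and `recv_with_timeout` are hypothetical names, and the -2 return code for a timeout is an arbitrary choice made for this example. On Winsock the first argument to `select()` is ignored, so this mainly matters on the Linux side.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: waits until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno set by select). */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    for (;;) {
        fd_set rds;
        struct timeval tv;

        /* select() may modify both the set and the timeout,
         * so both are rebuilt before every call */
        FD_ZERO(&rds);
        FD_SET(fd, &rds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        /* first argument: highest descriptor in any set, plus one */
        int ret = select(fd + 1, &rds, NULL, NULL, &tv);
        if (ret < 0 && errno == EINTR)
            continue;   /* interrupted by a signal; wait again with a fresh timeout */
        if (ret < 0)
            return -1;
        return (ret > 0 && FD_ISSET(fd, &rds)) ? 1 : 0;
    }
}

/* Example use: one receive attempt bounded by a timeout.
 * Returns bytes read, 0 on peer shutdown, -1 on error, -2 on timeout. */
ssize_t recv_with_timeout(int fd, void *buf, size_t cap, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;
    return recv(fd, buf, cap, 0);
}
```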

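You should call recv() first, and call select() only when recv() tells you to. Don't call select() first; that is wasted processing. recv() knows whether data is immediately available or whether you need to wait for data to arrive: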

    void *RecvFileThread(void *param){
        // note: recv() only reports WSAEWOULDBLOCK if the socket is non-blocking
        int sockfd = (int) param;
        set_sock_rcvbuf(sockfd, 1024 * 256); // set the receive buffer to 256K
        char buffer[1024 * 256];
        while (1){
            int len = recv(sockfd, buffer, sizeof(buffer), 0);
            if (len == -1) {
                if (WSAGetLastError() != WSAEWOULDBLOCK) {
                    // handle error
                    break;
                }

                struct timeval timeout;
                timeout.tv_sec = 1;
                timeout.tv_usec = 0;

                fd_set rds;
                FD_ZERO(&rds);
                FD_SET(sockfd, &rds);

                // actually, the first parameter of select() is
                // ignored on Windows, though on Linux this parameter
                // should be (maximum socket value + 1)
                int ret = select(sockfd + 1, &rds, NULL, NULL, &timeout);
                if (ret == -1) {
                    // handle error
                    break;
                }
                if (ret == 0) {
                    // log that the timer expired
                    break;
                }

                // socket is readable, so try to read again
                continue;
            }
            if (len == 0) {
                // handle graceful disconnect
                break;
            }

            // log the amount of data received
            process_tcp_data(buffer, len);
        }
        return NULL;
    }

Do something similar on the sending side: call send() first, and call select() to wait for writability only when send() tells you to.
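A rough sketch of that sender-side pattern for the Linux server follows. The body of `set_non_blocking` is an assumption (the post only shows its name), and `send_all` is a hypothetical helper that is not meant to be the asker's `send_non_blocking_sock_data`.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Assumed implementation: the post names this helper but does not show it. */
int set_non_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: sends all len bytes on a non-blocking socket.
 * Returns 0 on success, -1 on error or if the socket stays
 * unwritable for timeout_ms. */
int send_all(int fd, const char *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL && fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);

    size_t off = 0;
    while (off < len) {
        /* try to send first; only wait if the kernel says the buffer is full */
        ssize_t n = send(fd, buf + off, len - off, 0);
        if (n > 0) {
            off += (size_t)n;   /* partial sends are normal */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* real error */

        /* send buffer is full: sleep in select() until it drains,
         * instead of spinning on EAGAIN */
        fd_set wds;
        struct timeval tv;
        FD_ZERO(&wds);
        FD_SET(fd, &wds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        int ret = select(fd + 1, NULL, &wds, NULL, &tv);
        if (ret < 0 && errno == EINTR)
            continue;
        if (ret <= 0)
            return -1;          /* select error or timeout */
    }
    return 0;
}
```

With this shape, a full send buffer puts the thread to sleep until the peer has drained some data, rather than producing tens of thousands of EAGAIN results in a tight loop.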