optimax, posted May 22, 2020

Hi,

I have a TCP daemon thread using TSocket from System.Net.Socket (cross-platform). It listens for incoming TCP connections and creates a client thread for each client that connects.

It works just fine when compiled and run under Windows. I can connect with a Windows TCP client, a Linux TCP client, or even a simple Linux telnet session, and handle the client communication in the client thread.

When compiled and run under Linux, TSocket.Accept seems to always return nil, even when a connection is successfully established via telnet or another client. As a result, the code that creates the client threads is never executed.

On Linux, my logs show an endless series of 'Daemon - Socket.Accept with timeout 500ms' followed by 'Daemon - No new connection after 500ms timeout', even though the client does connect. Without a client thread and a reference to the client socket, there isn't much I can do to handle the client communication.

Has anyone seen this behavior when compiling a TCP server for Linux based on System.Net.Socket? Any hint or help would be very much appreciated.

    procedure TTcpDaemon.Execute;
    var
      LConnectionSocket: TSocket;
      LConnectionThread: TTcpConnectionThread;
    begin
      FServerSocket := TSocket.Create(TSocketType.TCP, TEncoding.UTF8);
      FServerSocket.Listen(FIP, '', FPort);
      Log('Daemon started (IP: ' + FIP + ' Port: ' + IntToStr(FPort) + ')');
      Log('Daemon - ConnectionThreadCount: ' + IntToStr(GetConnectionThreadCount));
      while not Terminated do
      begin
        try
          Log('Daemon - Socket.Accept with timeout 500ms');
          // Accept returns nil when no client connects within the timeout
          LConnectionSocket := FServerSocket.Accept(500);
          if Assigned(LConnectionSocket) then
          begin
            Log('Daemon - New connection - ' + LConnectionSocket.RemoteAddress + ':' + IntToStr(LConnectionSocket.RemotePort));
            // One thread per client; the thread receives the client socket
            LConnectionThread := TTcpConnectionThread.Create(LConnectionSocket);
            FThreadList.Add(LConnectionThread);
            Log('Daemon - Adding new connection thread - ConnectionThreadCount: ' + IntToStr(GetConnectionThreadCount));
            Log('Daemon - Starting new connection thread');
            LConnectionThread.Start;
          end
          else
            Log('Daemon - No new connection after 500ms timeout');
        except
          on E: Exception do
            Log(Self.ClassName + ' Exception (' + E.ClassName + '): ' + E.Message);
        end;
      end;
      TerminateAllThreads;
      FServerSocket.Close(True);
    end;

optimax, posted May 23, 2020

So, I found some interesting details, which all seem to point to a buggy implementation of TSocket.Accept when targeting Linux.

Instead of

    LConnectionSocket := FServerSocket.Accept(500);
    if Assigned(LConnectionSocket) then
    begin
      ...

I wrote the following code:

    LReadFds := TFDSet.Create(Self);
    // Select takes its timeout in microseconds (see its implementation quoted below), so 500 ms = 500 * 1000
    LWaitResult := FServerSocket.Select(LReadFds, nil, nil, 500 * 1000);
    if LWaitResult = TWaitResult.wrSignaled then
    begin
      LConnectionSocket := FServerSocket.Accept;
      if Assigned(LConnectionSocket) then
      begin
        Log('Daemon - New connection - ' + LConnectionSocket.RemoteAddress + ':' + IntToStr(LConnectionSocket.RemotePort));
        LConnectionThread := TTcpConnectionThread.Create(LConnectionSocket);
        ...
    ...

and it works, both when compiled for Windows and for Linux!

The difference is that TSocket.Accept is now called without a parameter (infinite wait, but it returns immediately because a client connection is pending). This pointed to the problematic implementation of TSocket.Accept with a timeout:

    if not (TSocketState.Listening in FState) then
      raise ESocketError.CreateRes(@sSocketNotListening);
    Result := nil;
    if Timeout <> INFINITE then
    begin
      FD_ZERO(FDR);
      _FD_SET(FSocket, FDR);
      time.tv_sec := Timeout div 1000;
      time.tv_usec := (Timeout mod 1000) * 1000;
      Res := socketselect(1, @FDR, nil, nil, @time); // << this is where things go wrong when compiled for Linux
      CheckSocketResult(Res, 'select');
      if Res = 0 then
        Exit;
    end;
    Len := SizeOf(Addr);
    ClientHandle := acceptsocket(FSocket, Addr, Len);
    CheckSocketResult(ClientHandle, 'accept');
    Result := GetClientSocket(ClientHandle, Addr);

The problem is the call to socketselect(1, @FDR, nil, nil, @time). This code works fine on Windows because the hardcoded first parameter (nfds = 1) is ignored there. On Linux, however, this parameter must be set to the highest-numbered file descriptor in any of the three sets, plus 1.
So on Linux, this implementation is telling socketselect() to only look at file descriptor 0, which is almost certainly not the socket's descriptor.

This is, however, correctly implemented in TSocket.Select (I am not sure why TSocket.Accept does not simply call TSocket.Select, rather than re-implementing the same functionality at a low level):

    ReadFds := GetFds(CheckRead);
    WriteFds := GetFds(CheckWrite);
    ErrorFds := GetFds(CheckError);
    if Microseconds >= 0 then
    begin
      time.tv_sec := Microseconds div 1000000;
      time.tv_usec := Microseconds mod 1000000;
      TimePtr := @time;
    end
    else
      TimePtr := nil;
    Res := socketselect(Max(GetMaxFds(CheckRead), Max(GetMaxFds(CheckWrite), GetMaxFds(CheckError))) + 1,
      ReadFds, WriteFds, ErrorFds, PTimeVal(TimePtr));
    CheckSocketResult(Res, 'select');
    if Res = 0 then
      Result := TWaitResult.wrTimeout
    else
      Result := TWaitResult.wrSignaled;

More information about the low-level call to select() can be found here: https://stackoverflow.com/questions/2008059/socket-select-works-in-windows-and-times-out-in-linux

I guess the next step is to report this as a bug to Embarcadero...
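
The nfds behaviour described above can be reproduced outside Delphi. The following is a minimal sketch in plain C against POSIX select(), not the RTL code; it uses C rather than Delphi so it can be compiled and run on any Linux box. The function names wait_readable_nfds, wait_readable and wait_readable_hardcoded are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait until fd is readable or timeout_ms elapses, passing the given
 * nfds value straight through to select().
 * Returns select()'s result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_nfds(int fd, int nfds, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    /* FD_SET is undefined for descriptors outside [0, FD_SETSIZE). */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(nfds, &readfds, NULL, NULL, &tv);
}

/* POSIX-conformant call: nfds is the highest descriptor in any set, plus one. */
int wait_readable(int fd, long timeout_ms)
{
    return wait_readable_nfds(fd, fd + 1, timeout_ms);
}

/* Same mistake as the hardcoded call quoted above: with nfds = 1,
 * select() examines descriptor 0 only and never looks at fd. */
int wait_readable_hardcoded(int fd, long timeout_ms)
{
    return wait_readable_nfds(fd, 1, timeout_ms);
}
```

Given a descriptor other than 0 that already has data pending (the read end of a pipe after a write, or a listening socket with a queued connection), wait_readable_hardcoded(fd, 100) times out and returns 0, while wait_readable(fd, 100) returns 1. That matches the symptom in this thread: the connection is established, but the timed wait never reports it.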

Dmitry Arefiev, posted May 25, 2020

This is a known issue: https://quality.embarcadero.com/browse/RSP-19708

optimax, posted May 25, 2020 (edited)

Thanks for the link. I just added a comment to RSP-19708 with the details of the code for TSocket.Accept and TSocket.Select, showing the error in the implementation of TSocket.Accept and the correct implementation of TSocket.Select.