DusMac 1 Posted August 23, 2022
I have a service that has to handle 2000+ connections while processing large amounts of data. Data is processed in a separate thread using a FIFO buffer. The problems appear when many TCP clients disconnect and reconnect within a short time. After this happens a few times, the service shows no client connections, but clients are unable to reconnect until the service is restarted. I am using TSslWSocketServer with TSslContext (TLS 1.2 encryption). In memory I keep track of each socket/context with a timestamp of the last data received; if a socket is idle for more than 60 seconds, it gets disconnected/closed. Under normal conditions there should be some data from every client at least every 5 s. Any idea what I am doing wrong?
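The idle-timeout bookkeeping described in this post can be sketched roughly as follows. This is a Python illustration only, not ICS code and not the poster's implementation; the class name `IdleTracker` and its methods are hypothetical. It assumes each client has some hashable id and that the server calls `touch` whenever data arrives.

```python
import time


class IdleTracker:
    """Tracks the last time data was seen from each client (hypothetical helper)."""

    def __init__(self, timeout=60.0, clock=time.monotonic):
        self.timeout = timeout      # seconds of silence before a client counts as idle
        self.clock = clock          # injectable clock, so the logic can be tested
        self.last_seen = {}         # client id -> timestamp of last data received

    def touch(self, client_id):
        # Call whenever data arrives from the client.
        self.last_seen[client_id] = self.clock()

    def forget(self, client_id):
        # Call on every disconnect so stale entries do not accumulate.
        self.last_seen.pop(client_id, None)

    def expired(self):
        # Ids of clients that have been silent for longer than the timeout.
        now = self.clock()
        return [cid for cid, ts in self.last_seen.items() if now - ts > self.timeout]
```

A periodic timer would call `expired()`, close those sockets, and call `forget()` for each one. Removing the entry on every disconnect path, including disconnects initiated by the client, keeps the table consistent with the sockets that are really open.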
Fr0sT.Brutal 900 Posted August 23, 2022
The issue could be dangling sockets: Windows keeps them around for a while after closing. Ensure the linger option is disabled for the server's client sockets. https://docs.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-closesocket
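At the socket level, disabling linger means clearing the `l_onoff` field of the `SO_LINGER` option. A minimal Python sketch of that, for illustration only (it is not ICS code, and `disable_linger` and `linger_state` are hypothetical helper names). It assumes the usual struct layouts: two unsigned 16-bit fields on Winsock, two ints on POSIX systems.

```python
import socket
import struct
import sys

# struct linger is {u_short, u_short} on Winsock and {int, int} on POSIX.
_LINGER_FMT = "HH" if sys.platform == "win32" else "ii"


def disable_linger(sock):
    # l_onoff = 0: close returns immediately and the stack finishes the
    # shutdown in the background (the default behaviour).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(_LINGER_FMT, 0, 0))


def linger_state(sock):
    # Returns (l_onoff, l_linger) as currently set on the socket.
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(_LINGER_FMT))
    return struct.unpack(_LINGER_FMT, raw)
```

Reading the option back with `linger_state` is a cheap way to confirm what is actually set on an accepted client socket.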
FPiette 383 Posted August 23, 2022
What error code do the clients get when they are unable to connect? What does "netstat -t" display on the server when a client is unable to connect? The number of active sockets is limited by the OS. Are you using a Windows Server OS or a Windows desktop OS?
DusMac 1 Posted August 23, 2022
Thanks for the answers. The linger option was enabled, so let me try that first. The OS is Windows Server 2019 Datacenter Edition.
Angus Robertson 574 Posted August 23, 2022
My previous SSL/TLS testing showed there is a limit to the number of new connections per second, due to the time taken to set up SSL/TLS and all the handshaking involved. There is a server setting, ListenBackLog, which defaults to 15; that means Windows will only queue 15 new connections before rejecting any further attempts. You could also try increasing your timeout beyond 60 seconds to reduce the number of new connections. An established connection is low overhead and a new one is high overhead, provided you have sufficient handles and resources to open the total number. TLS/1.3 has faster setup than TLS/1.2, if you can use it. You may also be able to optimize certificate chain checking: the less checking, the faster.
Angus
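The backlog mentioned here is the queue length passed to the operating system when a socket starts listening. As a rough Python illustration (not ICS code; `open_listener` is a hypothetical helper and the value 200 is an arbitrary example, not a recommendation from this thread):

```python
import socket


def open_listener(host, port, backlog):
    # The backlog is the number of not-yet-accepted connections the OS will
    # queue for this socket. The OS may silently cap the requested value.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(backlog)
    return srv


if __name__ == "__main__":
    # Port 0 lets the OS choose a free port for this demonstration.
    listener = open_listener("127.0.0.1", 0, 200)
    print("listening on", listener.getsockname())
    listener.close()
```

A larger backlog only helps with bursts; if the server accepts connections more slowly than they arrive for a sustained period, the queue still fills up.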
DusMac 1 Posted August 23, 2022
OK, some news:
- linger is disabled, no change
- the client error is "ConnectionRefusedError: the connection was refused by the peer (or timeout)"
- netstat shows no TCP connections to the port used by the service
Fr0sT.Brutal 900 Posted August 23, 2022 (edited)
Check whether the server rejects the connections themselves or just the handshakes by telnet'ing to the host:port. Is the server responsive when the issue occurs (i.e. is the message pump active)? Does it listen on the port at all? Are you catching the server's and clients' exceptions?
Edited August 23, 2022 by Fr0sT.Brutal
FPiette 383 Posted August 23, 2022
Quote: netstat shows no TCP connections to port used by service
To be sure: I gave you the wrong command (sorry); it is "netstat -a". You should at least see the listening port plus one port per already connected client. If the listening port is missing, your server socket has somehow been closed, which could be caused by a bug in your code. If you have a lot of TCP services, you can filter the output with:
    netstat -a | find "TCP" | find ":5000 "
assuming your server listens on port 5000. Don't forget that if you run the client on the same computer as the server, netstat will show TWO lines for each client. Example of output when two connected clients run on the same computer as the server:
    C:\Users\fpiette>netstat -a | find "TCP" | find ":5000 "
    TCP    0.0.0.0:5000           Z600:0          LISTENING
    TCP    192.168.1.6:5000       Z600:52740      ESTABLISHED
    TCP    192.168.1.6:5000       Z600:52787      ESTABLISHED
    TCP    192.168.1.6:52740      Z600:5000       ESTABLISHED
    TCP    192.168.1.6:52787      Z600:5000       ESTABLISHED
BTW: You can replace netstat with TcpView, which gives more details about each connection and is dynamic (changes stay visible long enough to notice).
DusMac 1 Posted August 24, 2022
Using TCPView and netstat revealed that the service doesn't even listen on this port anymore. It seems I will have to implement some kind of watchdog to check whether the server is active at all.
Angus Robertson 574 Posted August 24, 2022
I was having a problem with one of my servers not listening under some circumstances, so I implemented an internal watchdog that made a connection attempt every minute. It was very simple using TSslHttpRest, just a few lines of code triggered from the maintenance timer. It worked well, but showed the server was still listening. The problem turned out to be two services using the same named firewall rule, which each service updated to its own EXE name on startup, so the result depended on the order in which the servers started.
Angus
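A watchdog of the kind discussed in the last two posts can be as small as a periodic connection attempt to the service's own port. The following Python sketch shows only the probe itself; it is not the TSslHttpRest-based code mentioned above, and `is_listening` is a hypothetical helper name. It checks that the port accepts TCP connections, and does not verify that a TLS handshake would succeed.

```python
import socket


def is_listening(host, port, timeout=3.0):
    # Try a plain TCP connect. A refused connection or a timeout both
    # surface as OSError subclasses and count as "not listening".
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A maintenance timer could call `is_listening` once a minute and, after a failed probe, log the event and restart the listener or the whole service.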
Angus Robertson 574 Posted August 24, 2022
It is important to handle the onBgException event and Application.OnException. Recent ICS releases have improved BgException messages to help track the origin. I log the errors, send an admin email and then stop the service, so Windows can immediately restart it. Unfortunately, some SSL problems do not trigger any events and just crash the program, but it does restart.
Angus
DusMac 1 Posted August 25, 2022
It seems the latest version of ICS solved the problem. I tested with 2000 clients disconnecting and reconnecting many times, and it now works like a charm. Nevertheless, I implemented a watchdog in case the service stops listening, but didn't manage to trigger it. Anyway, thanks for the tips.
Angus Robertson 574 Posted August 25, 2022
Which ICS version were you using? As I said earlier, there is a limit to the number of new SSL/TLS connections per second because the socket server runs in a single thread. I've previously said in this forum that there is a plan for a heavy socket server, configurable for x clients per thread; two threads would in theory double the number of new connections per second, and it could even be one thread per client. This will need a new web server as well. But I really need ICS end users with server applications involving thousands of clients to justify the effort of developing it. Would that be of interest?
Angus