Eric Bonilha

Connection refused issue


Hello!

I'm trying to diagnose an issue I have with our application and would like to pick the minds of the experts here.

The application has a TCP server running on the main thread, so all connections are processed by the main thread. The issue I'm having is that if the main thread is somewhat busy (not even too busy), some connections are refused outright by the OS when clients try to access the application.

Now, we have this application running on thousands and thousands of machines, and this is a VERY RARE occurrence. Even if the main thread is busy (processing something else) when an inbound connection is pending, the OS should keep this connection in a backlog. The effect is that the client trying to connect might wait some time for a reply, but the connection is established as soon as the thread accepts it. I believe this is the standard way the OS works; please correct me if I'm wrong.

I don't know why, in some very rare cases (like this one), many connections are refused outright when the thread is only slightly busy. Do you have any idea why? Could this be some setting on the OS? This is Windows Server (I have to remotely access the customer's machine to get the specific version).

I tried increasing the ListenBacklog value to something high like 200 (instead of the default 15), but the problem still persists. In the latest tests I did, I could see about 14 or 15 connections being accepted and processed, then all other connections (we were opening about 80 connections simultaneously) are immediately refused.
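
One way to reproduce this outside the application is a small Python sketch that opens a listening socket which never calls accept, fires a burst of simultaneous connections at it, and tallies how each attempt ends. The function names, the loopback address, and the numbers are assumptions chosen for illustration; none of them come from the application. The outcome depends on the OS: Windows is generally reported to answer with a reset once the backlog is full, which the client sees as "connection refused", while Linux usually drops the SYN silently so the client times out instead.

```python
import socket
import threading
from collections import Counter


def burst_connect(host, port, attempts, timeout=1.0):
    """Open `attempts` connections at once and tally how each one ended."""
    outcomes = Counter()
    lock = threading.Lock()
    held = []  # keep every socket open until the burst is over

    def attempt():
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(timeout)
        try:
            s.connect((host, port))
            result = "connected"
        except ConnectionRefusedError:
            result = "refused"
        except socket.timeout:
            result = "timeout"
        except OSError:
            result = "other"
        with lock:
            outcomes[result] += 1
            held.append(s)

    threads = [threading.Thread(target=attempt) for _ in range(attempts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for s in held:
        s.close()
    return outcomes


def run_experiment(backlog=15, attempts=80):
    """Listen with the given backlog, never accept, and run one burst."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(backlog)
    port = listener.getsockname()[1]
    try:
        return burst_connect("127.0.0.1", port, attempts)
    finally:
        listener.close()


if __name__ == "__main__":
    print(run_experiment())
```

Running it with different backlog values on the affected machine would show whether the number of successful connections follows the backlog that was requested.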

 

Any ideas or suggestions are appreciated!

Thanks

 

Eric

2 hours ago, Eric Bonilha said:

In the latest tests I did, I could see about 14 or 15 connections being accepted and processed, then all other connections (We were opening about 80 connections simultaneously) are immediately refused.

Is there a firewall, load balancer, or other similar system running in front of the server? Typically, a "connection refused" error has one of these causes:

 

- the connection is trying to access a port that is not open.

- the connection reaches the port but the backlog is full.

- a firewall or other system is blocking the connection.
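
To narrow down which of these applies, it can help to record on the client side whether a failed attempt was actively refused or simply timed out. The Python sketch below uses a hypothetical helper, classify_connect, and assumes the usual behaviour: an active refusal tends to mean a closed port, a full backlog, or a firewall set to reject, while a timeout more often points at something dropping packets silently.

```python
import socket


def classify_connect(host, port, timeout=3.0):
    """Try one TCP connection and report how the attempt ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"   # the peer (or something in between) sent a reset
    except socket.timeout:
        return "timeout"   # no answer at all within `timeout` seconds
    except OSError as exc:
        return "error: %s" % exc


if __name__ == "__main__":
    # Demo on loopback: bind to a free port, close it again, then try
    # to connect to the port that is now closed.
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]
    probe.close()
    print(classify_connect("127.0.0.1", closed_port))
```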

Edited by Remy Lebeau

8 hours ago, Eric Bonilha said:

Any ideas or suggestions are appreciated!

Remy listed a few things, and I will add some more thoughts to follow up on this.

 

Windows has its own DDoS protection built in. It is almost useless, or rather very naive, because anything more advanced would cause a wide range of problems and require more advanced knowledge to tune, so Microsoft kept it as simple as possible with very few settings to tweak. Anyway:

 

1) Start with this link and check your dynamic port range with "show dynamicport" (for example "netsh int ipv4 show dynamicport tcp") before changing or adjusting anything:

https://learn.microsoft.com/en-us/troubleshoot/windows-client/networking/tcp-ip-port-exhaustion-troubleshooting

 

2) It is disgusting how Microsoft manages to lose valuable information and documentation to 404s through plain site mismanagement. I have very little time to write and search, and searching almost always lands me on a 404! I did find this, though:

https://serverfault.com/questions/43252/how-can-i-harden-the-tcp-ip-stack-in-windows-server-2008

ECN (Explicit Congestion Notification) can play a role.

 

There were also some more registry settings; I will look for them later if you can't find the problem. In general, though, more information is needed, such as how many new connections are established per second and how long the average connection stays open.
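
For step 1, the dynamic port range can also be read from a script so that it can be logged alongside other diagnostics. This is only a sketch: the function names are made up for illustration, and the parser assumes English-language netsh output in the "Start Port" / "Number of Ports" layout reproduced in SAMPLE, so a localized Windows installation would need different keys.

```python
import re
import subprocess
import sys

# Assumed layout of `netsh int ipv4 show dynamicport tcp` on English Windows.
SAMPLE = """
Protocol tcp Dynamic Port Range
---------------------------------
Start Port      : 49152
Number of Ports : 16384
"""


def parse_dynamic_port_range(text):
    """Return (start_port, number_of_ports) parsed from netsh output."""
    numbers = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if match:
            numbers[match.group(1).lower()] = int(match.group(2))
    return numbers["start port"], numbers["number of ports"]


def read_dynamic_port_range():
    """Run netsh (Windows only) and parse what it prints."""
    completed = subprocess.run(
        ["netsh", "int", "ipv4", "show", "dynamicport", "tcp"],
        capture_output=True, text=True, check=True,
    )
    return parse_dynamic_port_range(completed.stdout)


if __name__ == "__main__":
    if sys.platform == "win32":
        start, count = read_dynamic_port_range()
    else:
        # Not on Windows: fall back to the sample text for demonstration.
        start, count = parse_dynamic_port_range(SAMPLE)
    print("dynamic ports %d-%d" % (start, start + count - 1))
```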


Is the server dead once the problem arises, or does it start accepting connections again at some point? 

 

The 14 or 15 accepted connections suggest the default backlog of 15 is still in effect and your change is not being applied, but the backlog is set immediately before Listen, so that step cannot be skipped.

 

There is a fix in V9.4 for a wrong connection state when connections open very quickly (usually on localhost) that could stall WSocket. I'm not sure whether it applies to your situation.

 

Angus

 

12 hours ago, Eric Bonilha said:

Do you have any idea why? Could this be some setting on the OS?

I have also witnessed this behavior on many Windows and Linux installations hosted on dedicated servers, and it was almost always a problem with the host or with a specific ISP. You need to study the dropped connections (that is, if we are talking about connections that were dropped rather than ones that were never accepted). Does your host have some sort of DDoS protection? It might be triggered on their hardware, upstream of your server, by an attack on an unrelated server that happens to be on the same switch. That could lead to connections being dropped or lost, or new connections being refused, for a few minutes, after which everything comes back and the load returns to normal.

 

For this case, track and record the times when this happens and ask your host's technical support to confirm whether that is the cause. Also record the IPs whose connections were refused or dropped, and try to geolocate them to see whether they belong to a single ISP or to several closely related ISPs.
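
The record keeping described above could look something like the following Python sketch. Everything in it is hypothetical: the record layout, the function names, and the documentation-range addresses in the demo. Grouping by /24 network is only a crude, IPv4-only stand-in for a real geolocation or ISP lookup, which would need an external database.

```python
import ipaddress
from collections import Counter
from datetime import datetime, timezone


def make_record(client_ip, outcome, when=None):
    """One log entry: when it happened, which client, and how it ended."""
    return {
        "time": when or datetime.now(timezone.utc),
        "ip": client_ip,
        "outcome": outcome,  # e.g. "connected", "refused", "dropped"
    }


def failures_by_network(records, prefix_len=24):
    """Count failed attempts per network to spot a common origin."""
    counts = Counter()
    for rec in records:
        if rec["outcome"] == "connected":
            continue
        # strict=False lets a host address stand in for its network
        net = ipaddress.ip_network(
            "%s/%d" % (rec["ip"], prefix_len), strict=False
        )
        counts[str(net)] += 1
    return counts


if __name__ == "__main__":
    log = [
        make_record("192.0.2.10", "refused"),
        make_record("192.0.2.77", "refused"),
        make_record("198.51.100.5", "dropped"),
        make_record("203.0.113.9", "connected"),
    ]
    for network, count in failures_by_network(log).most_common():
        print(network, count)
```

The timestamps in the records are what the host's technical support would need in order to match the failures against their own logs.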

14 hours ago, Eric Bonilha said:

some connections are straight away refused by the OS (When clients try to access the application).

After that happens, does the application start working again without being restarted, or once it happens is no other connection ever accepted?

If no other connection is accepted, it could be that the listening socket has been closed unexpectedly. This could happen because of a bug elsewhere that ends up calling CloseSocket, or even the plain WinAPI CloseHandle function, with the socket handle.
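
To answer that question on the affected machine, a small external probe can connect to the port repeatedly and record the outcome each time: refusals that clear up on their own suggest a transient overload, while refusals that never clear would fit a listening socket that has been closed. A minimal Python sketch, with hypothetical function names and arbitrary intervals:

```python
import socket
import time


def probe_listener(host, port, rounds=5, interval=1.0, timeout=2.0):
    """Connect `rounds` times and return a list of (timestamp, outcome)."""
    results = []
    for i in range(rounds):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                outcome = "connected"
        except ConnectionRefusedError:
            outcome = "refused"
        except socket.timeout:
            outcome = "timeout"
        except OSError:
            outcome = "other"
        results.append((time.time(), outcome))
        if i < rounds - 1:
            time.sleep(interval)  # wait before the next attempt
    return results


def looks_permanent(results):
    """True if not a single probe in the series managed to connect."""
    return all(outcome != "connected" for _, outcome in results)
```

Calling probe_listener right after the first refusal is noticed, with a generous number of rounds, and passing the result to looks_permanent gives a first indication of which of the two situations is happening.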
