Or how to force a C# HTTP client to only use one persistent connection
Within PALFINGER we rely on some fundamental services. Since we are an enterprise, one of those services is central authentication, and since we are a Microsoft shop, that service is Active Directory. Enabling Windows Authentication for an application takes just a few lines of C# code, a trivial thing.
This was the same for one of our internal backend systems. The backend system stores product data used by technical departments such as engineering and documentation. The architecture is straightforward: a C# client application reads and writes data through a central C# REST endpoint. The client app authenticates with the credentials of the logged-in Windows user. The REST endpoint is configured to use Integrated Windows Authentication and to validate the Windows user.
HTTP persistent connections
The HTTP client is kept simple: it uses System.Net.Http.HttpClient, nothing fancy. With the standard settings it speaks HTTP/1.1, which uses persistent connections by default.
This is the desired behavior, from both a resource and a performance perspective. Creating a new connection for each REST call causes a performance penalty due to the TCP three-way handshake.
One of our colleagues insisted that it would be a good idea to check the performance of the architecture end-to-end. To do this we used a simple client implementation which hammered the backend with requests continuously. To turn up the heat further on the backend we started a number of clients in parallel. Everything looked fine on the server side, but some minutes into the test run a strange error surfaced:
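A client of this kind could look roughly like the following. This is a minimal sketch, not our production code: the host name, the route "api/products/{id}" and the class name BackendClient are assumptions made up for illustration. Setting UseDefaultCredentials on the handler is what makes the client send the credentials of the logged-in Windows user.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class BackendClient
{
    // Hypothetical endpoint; replace with the real REST endpoint.
    private static readonly Uri BaseAddress = new Uri("http://backend.example.local:9001/");

    public static HttpClient Create()
    {
        // Send the credentials of the logged-in Windows user with each request.
        var handler = new HttpClientHandler { UseDefaultCredentials = true };
        return new HttpClient(handler) { BaseAddress = BaseAddress };
    }

    public static async Task<string> GetProductAsync(HttpClient client, string id)
    {
        // Illustrative route, not taken from the real system.
        string path = "api/products/" + Uri.EscapeDataString(id);
        using (HttpResponseMessage response = await client.GetAsync(path))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```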
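A load generator in the spirit of this test can be sketched as follows. It is an approximation under stated assumptions: we started several client processes, whereas this sketch runs several independent HttpClient instances as tasks inside one process, and the names LoadTest, HammerAsync and RunAsync are hypothetical.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class LoadTest
{
    // Sends GET requests in a loop until cancelled; returns the number of successful calls.
    public static async Task<int> HammerAsync(HttpClient client, string url, CancellationToken token)
    {
        int succeeded = 0;
        while (!token.IsCancellationRequested)
        {
            try
            {
                using (HttpResponseMessage response = await client.GetAsync(url, token))
                {
                    if (response.IsSuccessStatusCode) succeeded++;
                }
            }
            catch (OperationCanceledException)
            {
                break; // test duration is over
            }
            catch (HttpRequestException ex)
            {
                // Connection problems such as the SocketException surface here.
                Console.Error.WriteLine(ex.GetBaseException().Message);
                await Task.Delay(100); // back off briefly instead of spinning
            }
        }
        return succeeded;
    }

    // Runs the given number of parallel clients for the given duration.
    public static async Task<int> RunAsync(string url, int clients, TimeSpan duration)
    {
        using (var cts = new CancellationTokenSource(duration))
        {
            var workers = Enumerable.Range(0, clients).Select(async _ =>
            {
                var handler = new HttpClientHandler { UseDefaultCredentials = true };
                using (var client = new HttpClient(handler))
                {
                    return await HammerAsync(client, url, cts.Token);
                }
            });
            int[] counts = await Task.WhenAll(workers);
            return counts.Sum();
        }
    }
}
```

A call such as LoadTest.RunAsync(url, 8, TimeSpan.FromMinutes(10)) would then keep eight clients busy for ten minutes; both numbers are arbitrary examples.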
Unable to connect to the remote server System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted.
When checking the network connections, we observed a high number of TIME_WAIT entries – in fact too many TIME_WAIT entries!
Proto  Local Address       Foreign Address    State
TCP    10.10.2.138:12050   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12051   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12052   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12053   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12054   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12055   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12056   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12057   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12058   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12059   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12060   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12061   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12062   10.10.1.206:9001   TIME_WAIT
TCP    10.10.2.138:12063   10.10.1.206:9001   TIME_WAIT
... (more entries) ...
Hm, we did not expect this kind of behavior. In theory, an HTTP/1.1 client should reuse a single persistent connection. A quick recap of the TCP connection state machine:
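Instead of reading the netstat output by eye, the same check can be scripted. The following is a small sketch using the System.Net.NetworkInformation API; the class name TimeWaitCounter is hypothetical, and the endpoint passed in is whatever backend address you are testing against.

```csharp
using System.Linq;
using System.Net;
using System.Net.NetworkInformation;

public static class TimeWaitCounter
{
    // Counts local TCP connections to the given remote endpoint that are in TIME_WAIT.
    public static int Count(IPEndPoint remote)
    {
        return IPGlobalProperties.GetIPGlobalProperties()
            .GetActiveTcpConnections()
            .Count(c => c.State == TcpState.TimeWait && c.RemoteEndPoint.Equals(remote));
    }
}
```

For the output above this would be called as TimeWaitCounter.Count(new IPEndPoint(IPAddress.Parse("10.10.1.206"), 9001)). With a properly persistent connection the count should stay close to zero during a test run.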
"TIME_WAIT: The socket is waiting after close to handle packets still in the network."
(netstat man-page: http://man7.org/linux/man-pages/man8/netstat.8.html)
Therefore a port can only be reused after the TIME_WAIT timeout has expired. Again, the expected behavior is that there is only one persistent connection for each client. The real behavior is that the client constantly opens new connections. If it opens connections too fast, it exhausts the local ports, because each closed connection blocks its port until the timeout has passed.
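The arithmetic behind this can be sketched in a few lines. The numbers used in the usage note are illustrative assumptions, not measurements from our test: the size of the ephemeral port range and the TIME_WAIT duration both depend on the operating system and its configuration. The class name PortBudget is hypothetical.

```csharp
public static class PortBudget
{
    // Highest sustained rate of new connections (per second, to one remote endpoint)
    // that the local port range can absorb without running out of ports.
    public static double MaxConnectionsPerSecond(int ephemeralPorts, double timeWaitSeconds)
    {
        return ephemeralPorts / timeWaitSeconds;
    }

    // Seconds until all ports are stuck in TIME_WAIT at the given connection rate,
    // or positive infinity if ports are released at least as fast as they are used.
    public static double SecondsUntilExhaustion(int ephemeralPorts, double timeWaitSeconds, double connectionsPerSecond)
    {
        double seconds = ephemeralPorts / connectionsPerSecond;
        return seconds < timeWaitSeconds ? seconds : double.PositiveInfinity;
    }
}
```

Assuming 16384 ephemeral ports and a TIME_WAIT of 120 seconds, the sustainable rate is about 136 new connections per second. A test that opens 500 connections per second under the same assumptions would run out of ports after roughly 33 seconds.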
Make the connection persistent
Back to the code for a review. We tried the following:
- Easy one, should have done it in the first place: share the HTTP client object. Same behavior, it eats up local ports.
- Introduce a cookie store to hold authentication information. Same behavior, it eats up local ports.
- Deactivate Windows Authentication on the server. Surprise, persistent connections are used. This is not a solution because the Windows Authentication security is needed, but at least we have found a hint as to what the problem might be.
The C# HTTP client opens a new connection for every request, but only when Windows Authentication is used. There has to be a way to tell the HTTP client to stop behaving this way. After hours of searching and experimenting, I came across a Stack Overflow question.
There is a setting available which triggers the intended behavior: UnsafeAuthenticatedConnectionSharing.
Is it safe to use the property? The name suggests otherwise. From the documentation:
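As a rough illustration, the setting can be applied like this. The sketch assumes .NET Framework, where the UnsafeAuthenticatedConnectionSharing property is exposed on WebRequestHandler (assembly System.Net.Http.WebRequest), the handler that is passed to the HttpClient. The class name PersistentBackendClient is hypothetical, and the client is shared because one application is meant to own exactly one HTTP client.

```csharp
using System.Net.Http;

public static class PersistentBackendClient
{
    // One shared client per application, reused for all requests.
    public static readonly HttpClient Instance = Create();

    private static HttpClient Create()
    {
        var handler = new WebRequestHandler
        {
            // Authenticate as the logged-in Windows user.
            UseDefaultCredentials = true,
            // Keep reusing the connection after it has been authenticated.
            // Only safe when all requests are sent on behalf of the same user.
            UnsafeAuthenticatedConnectionSharing = true
        };
        return new HttpClient(handler);
    }
}
```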
"Because it is possible for an application to use the connection without being authenticated, you need to be sure that there is no administrative vulnerability in your system when setting this property to true. If your application sends requests for multiple users (impersonates multiple user accounts) and relies on authentication to protect resources, do not set this property to true unless you use connection groups as described below."
If you have a use case where the application sends requests on behalf of different users, setting this property might breach security: after the first authenticated request, the same connection is used for subsequent requests, even if a different user initiated those requests.
Understood, but this is not an issue for my use case. One user owns one application, and one application owns one HTTP client, thus it is safe to set the "UnsafeAuthenticatedConnectionSharing" property to true.
Setting the "UnsafeAuthenticatedConnectionSharing" property on the handler behind the HttpClient object ended a long journey and finally gave us the intended behavior: a response time in the millisecond range and only one persistent connection per client.
Conclusion, takeaway for developers
- Network basics: As a developer it is essential to understand the network stack. Bonus: You will have a common topic for a coffee discussion with your infrastructure/operations colleagues.
- Performance/stress tests: Simply put, apply them! Without the parallel test processes we would not have uncovered the bug in the first place. Furthermore, the effort required to squash the bug in production would have been tremendously higher.