Increase Idle Timeout on Internal Load Balancers to 120 Mins
We use Azure Internal Load Balancers to front services that use direct port mappings and hold backend connections open for longer than the ILB's 30-minute idle timeout upper limit. That is, our ILBs accept connections on a nominated set of ports and pass those connections to backend services running on the same ports.
We are experiencing dropped TCP connections from clients connecting to the backend services via the ILB. After investigating the issue in collaboration with the Azure Networking Team, it was verified that lowering the default OS TCP keep-alive duration to below 30 minutes would mitigate the issues caused by the DNAT performed by the ILB. However, reducing the timeout to below 30 minutes at the OS level on every server and container across our estate is undesirable because of the scale of impact it would have.
Therefore we would like the idle timeout upper limit on the ILB increased to 120 minutes, bringing it in line with the default used by the Linux network stack (a TCP keep-alive time of 7,200 seconds).
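For reference, here is a minimal sketch of the keep-alive mitigation described above, applied per socket from a Python client rather than estate-wide via sysctl. The host/port values and the 25-minute figure are illustrative assumptions, and the TCP_KEEP* socket options are Linux-specific:

```python
import socket

# Keep-alive idle time chosen to stay under the ILB's 30-minute idle timeout.
# (Illustrative value; the equivalent OS-wide setting is net.ipv4.tcp_keepalive_time,
# which defaults to 7200 seconds / 120 minutes on Linux.)
KEEPALIVE_IDLE_SECS = 25 * 60

def connect_with_keepalive(host: str, port: int) -> socket.socket:
    """Open a TCP connection that sends keep-alive probes before the ILB idles it out."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-only options: start probing after KEEPALIVE_IDLE_SECS of idleness,
    # then probe every 60 seconds, dropping the connection after 5 failed probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, KEEPALIVE_IDLE_SECS)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)
    return sock

# Example: connect to a backend service through the ILB frontend (hypothetical address).
# conn = connect_with_keepalive("10.0.1.4", 8443)
```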
With Standard internal LBs, you can also enable TCP Reset on idle timeout. We will send a TCP RST to both the client and the server side when the idle timeout is reached. This is documented at https://aka.ms/tcpreset and may resolve some of these issues by notifying the client to re-establish the connection.
We are exploring additional options for this scenario as well. More soon.
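To illustrate how TCP Reset can help in this scenario, here is a minimal client-side sketch (hypothetical helper name, address, and payload) that re-establishes the connection when the load balancer resets an idle flow, which surfaces in Python as ConnectionResetError:

```python
import socket
import time

def send_with_reconnect(host: str, port: int, payload: bytes, retries: int = 3) -> bytes:
    """Send a request, reconnecting if the ILB has reset the idle connection."""
    sock = socket.create_connection((host, port))
    for attempt in range(retries):
        try:
            sock.sendall(payload)
            return sock.recv(4096)
        except (ConnectionResetError, BrokenPipeError):
            # With TCP Reset enabled, an idle-timed-out flow is reset rather than
            # silently dropped, so the client learns immediately and can reconnect.
            sock.close()
            time.sleep(2 ** attempt)  # simple backoff before retrying
            sock = socket.create_connection((host, port))
    sock.close()
    raise ConnectionError(f"Failed to send after {retries} attempts")

# Example usage against a backend behind the ILB (hypothetical address and payload):
# response = send_with_reconnect("10.0.1.4", 8443, b"PING\n")
```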
Kevin Chandler commented
We have an application that requires a timeout of 24 hours; would this be possible?
The idle timeout on a public LB can only be set up to 30 minutes. I want to use more than 30 minutes, e.g. 60, 90 or 120 minutes.
Dave Paddon commented
We have the same issue; I have raised a support case, as our transfers are sometimes failing. AWS supports 4,000 seconds (~66 minutes), so something of that order would work for me.