Scale out: wait for warm-up to complete before adding instances to the LoadBalancer
When adding new instances to the LoadBalancer, the scale-out mechanism doesn't wait for those application instances to fully warm up.
Availability checker, via AppInsights, logs these responses, which carry the X-AppInit-WarmingUp header.
Which means that IIS knows that the application is in its warm-up cycle, but the LoadBalancer is already trying to serve requests.
N.B. we _can_ add a rewrite rule that redirects the user back to the original request, and hope that the LoadBalancer sends the user to a ready instance - but this feels like a hack.
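For illustration, the redirect "hack" described above might look something like the following web.config sketch. This is an assumption of how one could wire it up, not a confirmed fix: the `app_ready.txt` marker file is a hypothetical flag the application would write once its own warm-up finishes, and the `ReadyForRequest` User-Agent pattern is taken from the FREB observations in this thread.

```xml
<!-- Hypothetical URL Rewrite sketch: self-redirect while the app is still warming up. -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="RedirectWhileWarmingUp" stopProcessing="true">
        <match url=".*" />
        <conditions>
          <!-- Don't redirect IIS's own warm-up probes (see FREB logs in this thread). -->
          <add input="{HTTP_USER_AGENT}" pattern="ReadyForRequest" negate="true" />
          <!-- app_ready.txt is a hypothetical marker the app writes when warm. -->
          <add input="{APPL_PHYSICAL_PATH}app_ready.txt" matchType="IsFile" negate="true" />
        </conditions>
        <!-- 307 back to the same URL, hoping the LB picks a ready instance next time. -->
        <action type="Redirect" url="{REQUEST_URI}" redirectType="Temporary" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```

As noted above, this only works if the LoadBalancer happens to route the retried request to a warm instance, which is exactly why it feels like a hack rather than a solution.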
Thanks for the feedback. We are thinking of ways to make the warm-up model better and fully ready for the app’s traffic. Right now, we only check that the instance is up before directing traffic to it.
Please refer to this blog post for more information about slots and AppInit.
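For context, the warm-up behaviour discussed here is driven by IIS's applicationInitialization module configured in web.config. A minimal sketch follows; the initializationPage paths are hypothetical examples and would need to match endpoints in your own application:

```xml
<!-- Minimal applicationInitialization sketch (example paths are placeholders). -->
<system.webServer>
  <applicationInitialization doAppInitAfterRestart="true">
    <!-- URLs IIS requests during warm-up, before the site is considered ready. -->
    <add initializationPage="/" />
    <add initializationPage="/warmup" />
  </applicationInitialization>
</system.webServer>
```

The complaint in this thread is that while slot swaps honour this configuration, scale-out does not appear to wait for these pages to complete before the LoadBalancer starts routing traffic.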
Tom Iverson commented
Any update on this? We are also experiencing this issue due to the long warm-up time of the particular workload we are running. Any other suggestions?
Dirk Boer commented
Any updates on this? I have the feeling it used to work, but now it's broken again. This is a really, really horrible experience for end users. Our whole application comes across as some kind of amateur project thanks to this behaviour.
I've been troubleshooting this for some time for production applications, but the problem is that I cannot replicate it in a test env; there, the auto-scaling works.
What I've identified so far with the use of FREB logging is that when it works, there are several requests with User-Agent: "ReadyForRequest/1.0 (LocalCache)" (since the Local Cache feature is used) and then a "ReadyForRequest/1.0 (AppInit)" request. But when it fails, these are missing, and I assume that since they are missing, the LB will not get the correct X-AppInit-WarmingUp value back and lets traffic through.
Luke Ballantine commented
This is a massive problem with how load balancing and auto-scale work in Azure PaaS for us. It basically means we can't use auto-scale unless we are comfortable with some requests timing out or returning 503 errors while the application warms up.
This really needs fixing urgently. Chris Hewitt's solution should be implemented.
I'm quite surprised that auto-scale was released without this feature, to be honest. It's been out of preview for about five years already!
Johan Eriksson commented
I have the same issue when I run performance tests to provoke scaling of the application. It really looks like the LB sends real requests too early, leading to a mixture of long response times and 503 responses. I have client affinity disabled and I am using Least Response Time as the load balancing algorithm. I can also see that when my app does not start properly on a scaled-out instance and keeps returning 500 errors, the LB still happily sends requests to the failing instance as if it were behaving normally.
Benjamin Desjardins commented
Since we cannot control the warm-up, launching new instances (in response to increased load on the services) causes major incidents. Users get 503 errors, which creates issues for our clients. Please take into account that this prevents us from offering a proper SLA if we return 5xx errors every time we increase the number of servers.
Vincent Gravel commented
Any update on this request? Cold starts are very bad when autoscale is adding new instances (it doesn't seem to take applicationInitialization into consideration).
Amir Keibi commented
It'd be awesome to be able to control the warm-up and the App Service load balancer's behavior.
zhagnpeng chen commented
Any update on this request? We did a few tests today; 'cold' starts still seem to occur when scaling out / adding additional instances to the instance pool.
applicationInitialization doesn't help.
Karthik Jambulingam commented
Any update on this request? We had an issue recently related to this: one of the instances went down for some reason (probably due to server failure or planned maintenance, not autoscale rules) and another instance spun up immediately (which is good), but it seems the load balancer routed requests too early, before the new instance was fully ready, which caused an outage. I would like to see such a key problem addressed soon in a popular cloud platform. I would appreciate it if this could be expedited. Thanks!
Shukhrat Nekbaev commented
Definitely a better solution is needed. I occasionally run into a situation where, if there's traffic during the scale-out, the second instance goes into an unresponsive limbo mode with very high CPU; in many cases it does start functioning normally after several minutes, provided traffic is no longer routed to it. I can simulate that behaviour with Fiddler too. If the load balancer could wait for a preconfigured app response before routing, the problem would be eliminated. Thank you!
Christopher Hewitt commented
We have already looked at the blog post at ruslany.net;
we have already applied the AppInit / secure redirects as illustrated there, and this did resolve our 'cold' starts with Slot Swaps.
However, 'cold' starts still seem to occur when scaling out / adding additional instances to the instance pool.