Scale Out should wait for WarmUp to complete before instances are added to the LoadBalancer
When adding new instances to the LoadBalancer, the scale-out mechanism doesn't wait for those application instances to fully warm up.
Availability checker, via AppInsights, logs these responses, with the header:
Which means that IIS knows that the application is in its WarmUp cycle, but the LoadBalancer is already trying to serve requests.
N.B. we can add a rewrite rule to redirect the user to the original request, and hope that the LoadBalancer sends the user to a ready instance - but this feels like a hack.
Thanks for the feedback. We are thinking of ways to make the warmup model better and fully ready for the app’s traffic. Right now, we just see the instance is ready before directing traffic.
Please refer to this blog post for more information about slots and AppInit.
I am surprised that the App Service team is not addressing this critical auto-scaling issue. Is it really that hard a problem to solve, when AWS had already solved it in early 2010?
Christian Rondeau commented
Please review the current link in Oded's post; it is unrelated to the issue of scaling out. Swapping works fine, but scaling out brings in an unresponsive instance, which can mean downtime for 5-10 minutes in some cases, accounting for the queued requests. The same problem happens with scaling up/down. Since the last update to this was more than a year ago, should we understand that App Services is not applicable to companies targeting a 99.98% uptime?
Any update on this? This is really frustrating. We are using the auto scaling to scale in/out at night and in the morning, however the site goes down during these operations.
Andreas Warberg commented
According to this article (https://michaelcandido.com/app-service-warm-up-demystified/) the App Service LoadBalancer should wait for the AppInit page to load before directing traffic to the new instance.
Is it possible to resolve this problem by making sure that the AppInit page hangs (does not return any response) until the instance is fully warmed up?
"A slightly confusing quirk of AppInit is that it does not look at response status codes and will consider even 500's as successful initialization of a route. Therefore, to block completion of initialization and keep the instance from being marked as warm, the proper behavior is to hang the ping from IIS until initialization is complete."
NB: My team experiences a similar problem, i.e. Scale Out routes traffic to the new instance much too early, but in our case the warmup page is never called.
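To make the "hang the ping until initialization is complete" idea quoted above concrete, here is a minimal, language-neutral sketch in Python. It is not App Service or IIS code: the `/warmup` route, `app_ready`, `initialize_app` and `make_server` are hypothetical names chosen for illustration, and whether the platform actually calls the warm-up page during scale-out is exactly what this thread disputes.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical flag: set once start-up work (caches, connections, ...) is done.
app_ready = threading.Event()


def wait_until_ready(timeout=None):
    """Block the caller until initialization finishes; True if ready."""
    return app_ready.wait(timeout)


def initialize_app():
    """Stand-in for the application's slow start-up work."""
    # ... load caches, open connections, JIT hot paths ...
    app_ready.set()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/warmup":  # hypothetical warm-up route
            # Hang the ping instead of answering early: per the quoted
            # article, any response (even a 500) counts as "initialized".
            wait_until_ready()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8080):
    """Build a threaded server so a hanging warm-up ping blocks only itself."""
    return ThreadingHTTPServer(("127.0.0.1", port), Handler)
```

A threaded server is used so that the blocked warm-up request does not stall other requests. In this sketch, start-up would run `initialize_app` on a background thread and then call `make_server().serve_forever()`.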
Andrew Gambin commented
We are experiencing this issue as well (west-europe). Our APIs are serving the warm-up page on production URLs because the traffic is being directed there before the actual warm-up is finished! This is quite catastrophic.
Ryan Whitmire commented
It is unbelievable to me that it isn't possible to execute a simple health check prior to sending requests to a newly added instance. Come on Microsoft, do better.
Tom Iverson commented
Any update on this? We are also experiencing this issue due to the long warm-up time of the particular workload we are running. Any other suggestions?
Dirk Boer commented
Any updates on this? I have the feeling it used to work, but now it's broken again. This is a really, really horrible experience for end users. Our whole application comes across as some kind of amateur project thanks to this behaviour.
I've been troubleshooting this for some time for production applications, but the problem is that I cannot replicate it in a test environment. There the auto scaling works.
What I've identified so far with the use of FREB logging is that when it works there are several requests with User-Agent: "ReadyForRequest/1.0 (LocalCache)" (since the Local Cache feature is used) and then a "ReadyForRequest/1.0 (AppInit)" request. But when it fails these are missing, and I assume that since they are missing the LB does not get the correct X-AppInit-WarmingUp value back and lets traffic through.
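For anyone wanting to run the same check over their own logs, a small sketch along these lines may help. It assumes you have already extracted the User-Agent values for one instance start into a list; the two probe strings are the ones reported above, while the function names and the idea that both probes must appear are assumptions for illustration, not documented platform behaviour.

```python
import re

# Probe User-Agents as reported in this thread's FREB logs.
PROBE_RE = re.compile(r"ReadyForRequest/1\.0 \((LocalCache|AppInit)\)")


def probes_seen(user_agents):
    """Return the set of readiness probe kinds found in the User-Agent values."""
    seen = set()
    for ua in user_agents:
        match = PROBE_RE.search(ua)
        if match:
            seen.add(match.group(1))
    return seen


def warmup_probes_complete(user_agents):
    """True if both the LocalCache and AppInit probes were logged (assumed criterion)."""
    return probes_seen(user_agents) >= {"LocalCache", "AppInit"}
```

Running `warmup_probes_complete` over the requests logged for a newly added instance would flag the starts where the probes never arrived.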
Luke Ballantine commented
This is a massive problem with how load balancing and auto-scale work in Azure PaaS for us. It basically means we can't use auto-scale unless we are comfortable with some requests timing out or returning 503 errors while the application warms up.
This really needs fixing urgently. Chris Hewitt's solution should be implemented.
I'm quite surprised that auto-scale has been released without this feature, to be honest. It's been out of Preview mode for about 5 years already!
Johan Eriksson commented
I have the same issue when I run performance tests to provoke scaling of the application. It really looks like the LB sends real requests too early. (Leading to a mixture of long response times and 503 responses.) I have client affinity disabled and I am using Least Response Time as the load balancing algorithm. I can also see that when my app does not start properly in a scaled out instance and that instance keeps returning 500 errors, the LB is still happily sending requests to the failing instance as if it was behaving normally.
Benjamin Desjardins commented
Since we cannot control the warm-up, we have major incidents when new instances are launched (on an increase of load on the services). Users are getting 503 errors and this generates issues for our clients. Please take into account that it prevents us from offering a proper SLA if we return 5xx errors every time we increase the number of servers.
Vincent Gravel commented
Any update on this request? Cold starts are very bad when the autoscale is adding new instances (it doesn't seem to take applicationInitialization into consideration).
Amir Keibi commented
It'd be awesome to be able to control the warm up and App Service's load balancer's behavior.
zhagnpeng chen commented
Any update on this request? We did a few tests today, and 'cold' starts still seem to occur when scaling out / adding additional instances into the instance pool.
applicationInitialization doesn't help.
Karthik Jambulingam commented
Any update on this request? We recently had an issue related to this: one of the instances went down for some reason (probably due to server failure or planned maintenance, not due to autoscale rules) and another instance spun up immediately (which is good), but it seems the load balancer routed requests to it before the new instance was fully ready, which caused an outage. We would like to see such a key problem addressed soon in a popular cloud platform. We would appreciate it if this could be expedited and implemented as soon as possible. Thanks!
Shukhrat Nekbaev commented
Definitely a better solution is needed. I occasionally run into the situation where there's traffic during the scale-out and, say, the second instance goes into some unresponsive limbo mode with very high CPU; in many cases it does start functioning normally after several minutes, given that traffic is not routed to it anymore. I can simulate that behaviour with Fiddler too. If the load balancer could wait for the preconfigured app response before routing, the problem would be eliminated. Thank you!
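The behaviour requested here, where the load balancer waits for a preconfigured app response before routing, can be sketched roughly as follows. This is only an illustration of the idea in Python, not how the App Service load balancer is implemented; `GatedPool`, `probe` and the other names are hypothetical.

```python
from typing import Callable, List


class GatedPool:
    """Instance pool that only routes to instances that passed a readiness probe."""

    def __init__(self, probe: Callable[[str], bool]):
        self._probe = probe            # e.g. an HTTP check against a warm-up URL
        self._pending: List[str] = []  # added by scale-out, not yet routable
        self._active: List[str] = []   # eligible to receive traffic

    def add_instance(self, name: str) -> None:
        """Scale-out registers the instance, but no traffic is sent to it yet."""
        self._pending.append(name)

    def refresh(self) -> None:
        """Promote pending instances whose probe now succeeds."""
        still_pending = []
        for name in self._pending:
            if self._probe(name):
                self._active.append(name)
            else:
                still_pending.append(name)
        self._pending = still_pending

    def routable(self) -> List[str]:
        """Instances the balancer may route requests to."""
        return list(self._active)
```

With a pool like this, a new instance stays out of rotation until `refresh()` sees its probe succeed, so requests keep going to the instances that are already warm.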
Christopher Hewitt commented
We have already looked at the blog post at ruslany.net.
We have already applied the AppInit / secure redirects as illustrated on the blog post - and this did resolve our 'cold' starts with Slot Swaps.
However, 'cold' starts seem to still occur when scaling out / adding additional instances into the instance pool.