Ability to stop/disable services without removing them
Right now, as soon as a service gets deployed to Service Fabric, it will start immediately and the only way to stop it is to completely remove the service. In case of a stateful service, this would also remove the data.
However, there are scenarios where stopping/deactivating a service, without removing it, is important:
Troubleshooting: If the service is not behaving correctly, we might want to stop it (it's not always possible to do a rollback).
Scheduled services, which should not run all the time
Planned downtime: Although against the idea of Service Fabric, there might be scenarios where I want to stop a service temporarily
There should be a PowerShell command for this, and ideally it should also be possible to change this through the UI.
Ideally, the UI should also show information about stopped services in the dashboard.
umesh kanase commented
Is this issue resolved? Is it possible to stop an application or service that is already running in Service Fabric?
We also need this function
It is very important to be able to pause and restart any service, or pause the entire app, for example when doing data recovery or migration.
We need this too
This would be useful in our business as well. I would like to have the ability to trigger a runbook or logic app which would disable or stop a service once a budget threshold has been hit, but as an EA customer I don't see the ability to do this.
Has this been resolved yet?
I am a newbie to Azure, and my running services are eating up my $200. I could not find a way to stop the services until I read this. That is crazy: you have to delete a service and can't just pause it???
I don't think there is a need to change Service Fabric to add service management such as start/stop, scheduling, etc. This functionality belongs to the business layer of your app. Look, for example, at SupercondActor: it sits on top of Service Fabric and does everything you requested.
Edward Mindlin commented
This is also important when you have services that communicate in real time with third-party endpoints that undergo maintenance or downtime, and as an admin you just want to "pause" them manually while the third party is unavailable. This prevents unwanted health alerts and false positives in alerting and monitoring systems that might generate support tickets and SMS/pages... I ended up implementing the feature (only for Actors) using a custom Web API on top of a custom Actor interface. All of the Actor Service "State" management (i.e. STARTED, STOPPED, STOPPING, etc.) had to be implemented in the Actor. This should be baked into the Fabric and available from the SF Admin rather than having to write a custom UI to manage the starting/stopping of the Actors/Roles...
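For readers wondering what that kind of in-service "State" management might look like, here is a minimal, language-agnostic sketch in Python. It is not the commenter's code and does not use any Service Fabric API; the class name `PausableWorker`, the transition table, and the `handle` method are all hypothetical, chosen only to illustrate the STARTED / STOPPING / STOPPED idea.

```python
from enum import Enum


class LifecycleState(Enum):
    STARTED = "STARTED"
    STOPPING = "STOPPING"
    STOPPED = "STOPPED"


# Allowed transitions (hypothetical; picked for illustration only).
_ALLOWED = {
    LifecycleState.STARTED: {LifecycleState.STOPPING},
    LifecycleState.STOPPING: {LifecycleState.STOPPED},
    LifecycleState.STOPPED: {LifecycleState.STARTED},
}


class PausableWorker:
    """Hypothetical worker that tracks its own lifecycle state."""

    def __init__(self):
        self.state = LifecycleState.STARTED
        self.handled = []

    def _transition(self, target):
        # Reject transitions that the table above does not allow.
        if target not in _ALLOWED[self.state]:
            raise ValueError(
                f"cannot go from {self.state.name} to {target.name}"
            )
        self.state = target

    def stop(self):
        self._transition(LifecycleState.STOPPING)
        # A real service would drain in-flight work here.
        self._transition(LifecycleState.STOPPED)

    def start(self):
        self._transition(LifecycleState.STARTED)

    def handle(self, message):
        # Refuse new work unless the worker is in the STARTED state.
        if self.state is not LifecycleState.STARTED:
            return False
        self.handled.append(message)
        return True
```

In this sketch an admin-facing endpoint would simply call `stop()` and `start()`; the point of the thread is that every team currently has to write something like this themselves.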
Alan Hemmings commented
+1 for being able to do this in PowerShell.
Distributed systems result in complex choreographies of messages, and working on A might mean needing to temporarily disable C, E, and G, send test or probe messages through the system, and then re-enable C, E, and G. My only conclusion is that most Service Fabric deployments must be push and pray? I would prefer to be able to design my services so that I can turn services on and off without having to burn the whole house down.
Nandun Wijesinghe commented
Upvoted. We need to test an application that consumes queue messages, and we'd like the ability to stop the service when we want to inspect the messages in the queue.
I upvote this. We are just starting out with Service Fabric and this seems to be an obviously missing feature. I thought I was stupid because I couldn't figure out where to pause an app... but obviously the ability is not there.
We just had a deployment last night, where a common application was getting patched, and we needed to shut down a dependent application while it was happening. Since there is no way to just "shutdown" the app, we had to remove it as well. After the patching was done, we reinstalled the wrong dependent app, and it took 2 hours to figure out what happened. If we had been able to just shut down the app without uninstalling, there wouldn't have been an issue.
Gunnar Siréus commented
Would be a good feature. I vote Yes!
Sujit Gokhale commented
Really handy in a DEV cluster, specifically when the same service codebase is running under different applications of the same application type.
John Mao commented
In a DEV cluster, when debugging a particular service, stopping and restarting the entire application after every code change is really painful.
Francesc Castells commented
Another scenario: a service processes messages from a queue and you want to stop it. You might want this for several reasons:
1.- The service has a bug and all messages fail. Let the messages pile up on the queue, fix the bug, and redeploy.
2.- The service relies on an external service that is down (similar to 1).
3.- You want to run the service locally for debugging, but the service in the cluster is taking the messages from the queue faster. Stop the service so your local instance can pick up the messages.
Starting and stopping services would be really helpful in my scenario as well. For dev clusters this could also be a really nice feature.
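Until something like this exists in the platform, the three cases above can be approximated inside the service by having the consumer check a pause flag before taking the next message. A minimal sketch in Python, assuming an in-process `queue.Queue` as a stand-in for the real message queue; `PausableConsumer` and `poll_once` are hypothetical names, and nothing here is a Service Fabric API:

```python
import queue
import threading


class PausableConsumer:
    """Hypothetical queue consumer that can be paused without being removed."""

    def __init__(self, source):
        self.source = source
        # Event is set while the consumer is allowed to take messages.
        self._running = threading.Event()
        self._running.set()
        self.processed = []

    def pause(self):
        # Messages stay on the queue while paused.
        self._running.clear()

    def resume(self):
        self._running.set()

    def poll_once(self):
        """Process at most one message; return True if one was handled."""
        if not self._running.is_set():
            return False
        try:
            message = self.source.get_nowait()
        except queue.Empty:
            return False
        self.processed.append(message)
        return True
```

While paused, messages simply accumulate on the queue, so they can be inspected or picked up by a local debugging instance. The drawback, as others in this thread point out, is that the flag and a way to flip it have to be built and exposed by each service.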
Osita Chris Okonkwo commented
We need to migrate multiple Windows/ServiceHost services with complex async logic to SF as stateful services. The services are shut down during weekends by a managed shutdown of the server. The server is restarted early Monday morning.
From a deep dive into Actors, something similar may be possible: Actors are garbage collected if they do not receive a method invocation after a configurable period of time. The rub is that Actors are single-threaded. I can rewrite my multiple async services as amalgams of single-threaded Actor workloads that are orchestrated by a single 'master' async stateful service. Thus, over the weekend, my services (minus the 'master' service) will be stopped since no Actors will be active.
But that will be one helluva re-write. :-(
Would be a much happier camper if a configurable stop/disable of the stateful and stateless services were possible.
"In my opinion this should be done in the service, rather than be baked into Service Fabric itself. There have to be a dozen different ways to do each one of these things. Every feature adds complexity and subtracts reliability." ... Disagree. You want small services managed by SF, not monoliths. 99% of services should be able to be restarted (like in every other system) to recover from most errors without having to RDP into VMs and kill the process.