New Database Level Health Detection can send the whole AOAG down
Since error 1101 is now monitored by the database-level health check routine, it can cause a failure of the whole AOAG (when the DB_FAILOVER option is enabled), even in cases where a failover makes no sense.
I am not sure if Microsoft realized that error 1101 can be encountered in two scenarios:
- A database filegroup can’t grow because one or more drives are out of space
- A database file has reached its configured MAXSIZE limit
While I can see how a failover triggered by this error under DB_FAILOVER could help in the first case, I can hardly think of a situation where it would help in the second one.
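For illustration, the second scenario might be set up roughly as follows. This is only a minimal sketch and not the attached repro script: the availability group name AgDemo, the database name AgDemoDb, the logical file name AgDemoDb, the table dbo.Filler and the 64MB limit are all hypothetical, and the exact error raised once the file is full may depend on the workload.

```sql
-- Hypothetical names throughout (AgDemo, AgDemoDb, dbo.Filler); adjust to your environment.
-- Run on the primary replica of a test availability group only.

-- Enable database-level health detection for the availability group.
ALTER AVAILABILITY GROUP [AgDemo] SET (DB_FAILOVER = ON);

-- Cap the data file so that it cannot grow past a small limit.
-- NAME is the logical file name, assumed here to match the database name.
ALTER DATABASE [AgDemoDb]
    MODIFY FILE (NAME = N'AgDemoDb', MAXSIZE = 64MB);
GO

USE [AgDemoDb];
GO

-- Each row occupies roughly one 8 KB page, so the file fills up quickly.
CREATE TABLE dbo.Filler
(
    id  INT IDENTITY(1, 1) PRIMARY KEY,
    pad CHAR(8000) NOT NULL DEFAULT 'x'
);
GO

-- Insert until an allocation error is raised, then report it.
BEGIN TRY
    WHILE 1 = 1
        INSERT INTO dbo.Filler DEFAULT VALUES;
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()  AS error_number,
           ERROR_MESSAGE() AS error_message;
END CATCH;
```

Here the drive still has plenty of free space, so moving the workload to another replica would not obviously resolve anything.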
As a consequence, the Availability Group ends up in the RESOLVING state from the SQL Server perspective and in the FAILED state from the WSFC perspective. This means users and applications can’t access the data.
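The SQL Server side of that state can be inspected with a query along these lines. It is a sketch that assumes it is run on the affected replica; it does not show the WSFC resource state, which would have to be checked separately (for example in Failover Cluster Manager).

```sql
-- Show the local replica's role and health for each availability group,
-- together with whether DB_FAILOVER is enabled for that group.
SELECT ag.name                         AS ag_name,
       ag.db_failover                  AS db_failover_enabled,
       ars.role_desc,                  -- RESOLVING when the AG has no primary
       ars.operational_state_desc,
       ars.synchronization_health_desc
FROM sys.availability_groups AS ag
JOIN sys.dm_hadr_availability_replica_states AS ars
    ON ars.group_id = ag.group_id
WHERE ars.is_local = 1;
```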
A repro script is attached.