Diagnostics and Monitoring

How can we improve Azure Diagnostics and Monitoring?

  1. Add the ability to monitor total RAM usage on a VM

    We have a graph that monitors CPU usage, network traffic, and disk read/write, but it would be very nice to have a graph that shows RAM usage on a VM over a period of time (much like the CPU graph), especially when deciding whether to switch between, say, an A2 and an A5.

    27 votes
    under review  ·  1 comment
  2. Bring back the dashboard tiles that can show me the performance tier and number of instances my app service plan is running at.

    I have a dashboard with 7 app service plans on it. For each plan, I had one tile that showed me the current tier (S1, S2, S3, etc.) and one tile that showed me the instance count (1, 5, 10, etc.).

    A week or two ago, these tiles stopped working and I got a notice that those tiles have been "Retired".

    Now it seems that there is no replacement tile to provide the same information.

    How can that be? I hope I am wrong.

    Please advise.

    See this for more info: https://social.msdn.microsoft.com/Forums/en-US/9c1c0633-ceff-493f-be9b-f62f8cb279e2/how-can-i-monitore-the-instance-count-of-an-app-service-plan-on-a-dashboard?forum=windowsazurewebsitespreview

    26 votes
    1 comment
  3. Copy Alert Rules from one Azure resource to another

    It would be really great if there was a way to copy a set of Alert Rules from one Azure resource to another.

    Use case: I made 15 Alert Rules on our Staging db. I want those on the Prod db now. The same goes for our WebApp, CloudService, SQL Server, etc. It takes a really long time to add these manually, and you might forget one or mistype an email address and then miss out on important alerts.

    24 votes
    1 comment
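As a rough illustration of the workflow requested in idea 3, the sketch below clones a rule definition and retargets it at another resource. It only manipulates a plain dictionary; the field names (`name`, `targetResourceId`, `metric`, `threshold`, `emails`) and the helper `clone_alert_rule` are hypothetical and are not the Azure alert rule schema or any Azure API.

```python
import copy


def clone_alert_rule(rule, target_resource_id, threshold=None, name_suffix="-copy"):
    """Return a deep copy of `rule` pointed at a different resource.

    `rule` is a plain dict with hypothetical field names; the original
    dict is left untouched.
    """
    clone = copy.deepcopy(rule)
    clone["name"] = rule["name"] + name_suffix
    clone["targetResourceId"] = target_resource_id
    if threshold is not None:
        # Optionally override the threshold on the copy.
        clone["threshold"] = threshold
    return clone


# Illustrative rule definition (placeholder values, not real resource IDs).
staging_rule = {
    "name": "high-dtu",
    "targetResourceId": "staging-db",
    "metric": "dtu_consumption_percent",
    "threshold": 80,
    "emails": ["oncall@example.com"],
}

prod_rule = clone_alert_rule(staging_rule, "prod-db", name_suffix="-prod")
```

Copying the whole definition, rather than retyping it, is what keeps email addresses and thresholds from drifting between environments.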
  4. Better Management of WAD Diagnostic Tables

    Today it is really easy to configure the collection of Perf Counters for your PaaS apps, and nearly impossible to manage the data that this produces.

    1. I'd like the Management Portal to let me set a "Truncate after N Days" setting on all the WAD tables, especially WADPerformanceCountersTable.

    2. I'd like a report or display to help me manage my Azure Tables. Show me "Table Name", "Storage used (MB)", and "Num Rows / Num BLOBs". It doesn't have to be exact; a close approximation will do. This would help to understand billing, as it is easy to have 100s of GBs in Storage…

    20 votes
    0 comments
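The "Truncate after N Days" behaviour asked for in idea 4 amounts to selecting rows older than a cutoff. The sketch below shows that selection over in-memory rows only; the row shape (`id`, `timestamp`) and the helper `rows_to_purge` are assumptions made for illustration, and nothing here talks to Azure Table storage.

```python
from datetime import datetime, timedelta, timezone


def rows_to_purge(rows, retain_days, now=None):
    """Return the rows whose timestamp is older than `retain_days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retain_days)
    return [row for row in rows if row["timestamp"] < cutoff]


# Fixed "now" so the example is reproducible.
now = datetime(2015, 6, 30, tzinfo=timezone.utc)
rows = [
    {"id": 1, "timestamp": datetime(2015, 6, 1, tzinfo=timezone.utc)},
    {"id": 2, "timestamp": datetime(2015, 6, 29, tzinfo=timezone.utc)},
]

# With a 7-day retention window, only the row from June 1 is expired.
expired = rows_to_purge(rows, retain_days=7, now=now)
```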
  5. Download performance metrics from portal

    It would be really great to be able to download the performance metrics from a chart as a CSV or Excel file.

    20 votes
    0 comments
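For idea 5, a minimal sketch of what a CSV export of chart data could look like, using only the Python standard library. The data points and the two-column layout (`timestamp`, `value`) are made up for illustration; the portal's actual export format, if any, is not specified in the post.

```python
import csv
import io


def metrics_to_csv(points):
    """Serialize (timestamp, value) pairs into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["timestamp", "value"])
    writer.writerows(points)
    return buffer.getvalue()


# Hypothetical CPU samples taken five minutes apart.
csv_text = metrics_to_csv([
    ("2016-01-01T00:00:00Z", 12.5),
    ("2016-01-01T00:05:00Z", 40.0),
])
```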
  6. Add support for exporting ARM templates

    Please add support for exporting Action Groups and Alert Rules as ARM Templates, in the same manner as the Data Factory V2 team allows exporting pipeline definitions and all their related artifacts as ARM templates. This is incredibly useful for cases where we're creating a product with multiple environments.

    20 votes
    0 comments
  7. 19 votes
    0 comments
  8. Vanity/DisplayName correlation to Azure resource type.

    The service health activity alert service scoping seems to be using some arbitrary vanity/display names that have no direct correlation or mapping to typical Azure resource type convention.

    For example, here is what the field value would be to scope an alert down to "Virtual Machines" within the Azure Monitor REST API.

    properties.impactedServices[?(@.ServiceName == 'Virtual Machines')].ImpactedRegions[*].RegionName

    And this would be the correlating Azure resource type with the providerNamespace/resourcetype convention.

    Microsoft.Compute/virtualMachines

    The scoping/query language used for Azure Monitor activity alerts related to service health does not appear to have any dictionary or direct correlation to the actual service type it's alerting…

    18 votes
    2 comments  ·  Bugs
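To make the complaint in idea 8 concrete, the sketch below reproduces the filter expression from the post in plain Python and pairs it with the kind of lookup table the poster is asking for. The sample event is invented and only mirrors the property names used in the expression (`properties`, `impactedServices`, `ServiceName`, `ImpactedRegions`, `RegionName`). The mapping dictionary is hypothetical: it contains only the one pair quoted in the post, since the post's point is that no official mapping is published.

```python
# Hypothetical lookup from service health display name to resource type.
# Only the pair quoted in the post is included.
DISPLAY_NAME_TO_RESOURCE_TYPE = {
    "Virtual Machines": "Microsoft.Compute/virtualMachines",
}


def impacted_regions(event, service_name):
    """Return region names for the services whose display name matches."""
    return [
        region["RegionName"]
        for service in event["properties"]["impactedServices"]
        if service["ServiceName"] == service_name
        for region in service["ImpactedRegions"]
    ]


# Invented payload that follows the property names in the expression.
sample_event = {
    "properties": {
        "impactedServices": [
            {
                "ServiceName": "Virtual Machines",
                "ImpactedRegions": [
                    {"RegionName": "West Europe"},
                    {"RegionName": "North Europe"},
                ],
            },
            {
                "ServiceName": "Storage",
                "ImpactedRegions": [{"RegionName": "East US"}],
            },
        ]
    }
}

vm_regions = impacted_regions(sample_event, "Virtual Machines")
vm_resource_type = DISPLAY_NAME_TO_RESOURCE_TYPE.get("Virtual Machines")
```

Without a published dictionary, each consumer has to maintain a table like this by hand, which is the gap the idea describes.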
  9. The new version of the graphics in the dashboard is bad and the usability is terrible.

    The new version of the graphics in the dashboard is bad and the usability is terrible. The dynamic y axis forces us to look at all the graphs one by one to monitor the applications. Ideally, the y axis should be fixed at 100% (it cannot be dynamic). It looks beautiful, but it is useless.

    17 votes
    0 comments  ·  Bugs
  10. Add support for enabling diagnostics 1.3 via TFS build

    Prior to SDK 2.5, the diagnostics config was part of the Azure Service Configuration, and was enabled automatically when we built and published using TFS / Visual Studio Online.

    We use the Staging > Production VIP swap approach to deployments, so we are always deploying to an empty staging slot.

    Now that diagnostics is an extension, we need to manually enable the diagnostics using PowerShell every time we deploy to Staging.

    Furthermore, because we need to wait for the slot to exist before we can enable diagnostics, we have no way of retrieving any diagnostics during role startup.

    For now…

    15 votes
    2 comments
  11. Is LogRhythm supported?

    The page says:
    Only IBM QRadar, Splunk and SumoLogic are supported for routing the logs to these vendors.

    Is LogRhythm also supported?

    15 votes
    2 comments
  12. Enhance Audit Logs on new portal with Timestamp and Started status

    The old management portal shows the Audit Log in a pretty good way: each record has a detailed timestamp with seconds. It's absolutely clear when an operation started and when it finished (Succeeded/Failed).

    Please consider enhancing Audit Logs on the new portal with this information. It is very useful during troubleshooting.

    14 votes
    0 comments

    This is available in the Azure Portal. To configure which columns are shown for the Activity Log, simply use the “columns” button at the top of the Activity Log blade.

    Thanks,

    John Kemnetz
    Program Manager, Azure Monitor

  13. Fired alerts not working

    When an alert is sent (Alert (preview)), I receive an email as defined, but the fired alert does not show in the portal.

    12 votes
    7 comments  ·  Bugs
  14. Add a button for copying an existing Alert Rule

    I am currently creating duplicates for every Alert Rule so that I can have two versions of each one. I want one copy of each Alert Rule to have a lower threshold which will go out to the engineers. I want another Alert Rule with a much higher threshold which will go out to DevOps.

    It would save me a lot of time if there were a copy button.

    12 votes
    2 comments
  15. A scaling alert should tell me which of my rules caused the scale up or down.

    When I get an email alert that my site scaled up or down, I have no idea which of my rules caused the action. It could have been CPU, memory, or any of the other scale metrics I am monitoring. It would be really nice to know which one it was.

    12 votes
    under review  ·  0 comments
  16. Audit and IT fundamental services

    Provide unalterable audit logs for services like ACS.

    Provide management stats and backup/recovery mechanisms and services for all provided services.

    12 votes
    0 comments

    Thanks for the feedback. We’ll look into this. Each service has its own auditing capabilities, but we can consider standardization, if that would be helpful for you. Let us know if you have any specifics you’re looking for or ways this could help you.

  17. Make alerts more actionable (e.g. "Open a support ticket")

    This was feedback given in the MVP Summit session on the current (Old-New) portal.

    When I have an alert or something is 'limited' that shows me the red exclamation point I expect to be able to "do something" or "go somewhere for help".

    I would love to get "this problem is causing your service to be down - go here to open up a support ticket for that".

    I also expect any 'limited' message to be archived so that I can go back to them. Many times I've had 'limited' show up and I send a message to someone about…

    11 votes
    0 comments
  18. Increase Azure Alert Rule Description length

    The alert rule description in Azure Alerts is currently limited to 128 characters.
    In OMS, the alert description doesn’t seem to have a limit.
    Soon OMS Alerts will be automatically extended into Azure Alerts, and I have alert descriptions that are far longer than 128 characters.

    They can't be short because the team that gets the alert needs to understand what to do with it.
    I can't explain it in fewer characters than a tweet.

    11 votes
    0 comments
  19. Fix alert rules

    Alert rules that use a time metric like CPU time or average response time are incorrect.

    Setting a threshold of 1.5 seconds updates the metric graph correctly, showing a dotted line at the 1.5 second mark - however the test is actually set to a threshold of 1.5 milliseconds. You can confirm this by viewing the alert in the old portal, and in the fact that even though the dotted line remains above the blue line in the graph the whole time, the alert is still considered active.

    Please fix it.

    Also, if you can make it more clear where…

    11 votes
    under review  ·  3 comments
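The unit mismatch reported in idea 19 can be illustrated with a small sketch. It assumes measurements arrive in milliseconds and compares them against a threshold of 1.5 interpreted first as seconds (what the user intended) and then as milliseconds (what the post says the rule actually applies). The function `is_breached` and the 400 ms sample are hypothetical and do not reflect how Azure evaluates alert rules internally.

```python
# Conversion factors from the threshold's unit to milliseconds.
UNIT_TO_MS = {"s": 1000.0, "ms": 1.0}


def is_breached(measured_ms, threshold, threshold_unit="s"):
    """Return True when the measurement exceeds the threshold.

    `measured_ms` is always in milliseconds; the threshold is converted
    to milliseconds before comparing.
    """
    return measured_ms > threshold * UNIT_TO_MS[threshold_unit]


response_time_ms = 400.0  # hypothetical average response time

# Intended rule: 1.5 seconds = 1500 ms, so 400 ms does not breach.
intended = is_breached(response_time_ms, 1.5, "s")

# Reported behaviour: 1.5 is treated as 1.5 ms, so 400 ms breaches.
reported = is_breached(response_time_ms, 1.5, "ms")
```

This matches the symptom in the post: the chart line stays below the 1.5 second mark while the alert remains active.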
  20. Implement syslog events out of azure

    Create the capability to send events out via syslog for all activities within a given subscription in the Azure infrastructure (create/change/delete VM, network security group, endpoints, users, etc.).

    11 votes
    0 comments