Diagnostics and Monitoring

  1. Monitor Resource Creation in an Azure Subscription.

    DevOps and IT teams want to know when a resource has been created in or removed from a subscription they manage.

    Resource creation and deletion can affect a huge range of issues, from billing to product functionality.

    As of now, there is no way to get an alert when a resource is created or deleted in the subscription.

    95 votes
    0 comments
  2. Provide a mechanism to surface CPU Steal time

    Please provide a mechanism to surface CPU steal time, that is, the time the VM spends waiting for host CPU resources to become available. Shared-resource environments like VMs are sensitive to host-based performance issues, but there appears to be no easy way to see that from the VM level.
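For illustration only, here is a minimal sketch of how steal time can be estimated from inside a Linux guest today, assuming the kernel exposes the `steal` counter as the eighth value on the aggregate `cpu` line of `/proc/stat`. The function names are hypothetical, and this is not something the Azure portal provides.

```python
from typing import Dict

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("this kernel does not report a steal field")
    return dict(zip(FIELDS, values))


def read_cpu_sample(path: str = "/proc/stat") -> Dict[str, int]:
    """Read the current counters; the aggregate line is the first line of the file."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def steal_percent(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Share of CPU ticks elapsed between two samples that were stolen by the host."""
    total = sum(after[f] - before[f] for f in FIELDS)
    if total <= 0:
        return 0.0
    return 100.0 * (after["steal"] - before["steal"]) / total


# Hypothetical sample lines, not real measurements.
before = parse_cpu_line("cpu  100 0 50 800 10 0 5 35 0 0")
after = parse_cpu_line("cpu  150 0 70 1500 20 0 10 250 0 0")
print(f"steal: {steal_percent(before, after):.1f}%")
```

On a real guest, two calls to `read_cpu_sample()` a few seconds apart would replace the hard-coded lines. The guest counters are left out of the total because Linux already accounts for them inside `user` and `nice`.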

    18 votes
    1 comment
  3. mdsd

    Hi,

    As of now, the MDSD (mandatory diagnostics service daemon) is only provided as a binary. Neither source code nor documentation of the protocols it speaks is available.

    The MDSD binary provided is not suitable for use on Linux distributions other than the officially supported distros.

    Without MDSD it is not possible to use the autoscaling features of a VMSS.

    Thanks!

    5 votes
    0 comments
  4. Add a button for copying an existing Alert Rule

    I am currently creating a duplicate of every Alert Rule so that I can have two versions of each one. I want one copy of each Alert Rule to have a lower threshold and go out to the engineers, and another copy with a much higher threshold to go out to DevOps.

    It would save me a lot of time if there were a copy button.

    15 votes
    2 comments
  5. Change Instance Size dynamically on Auto scaling.

    Can we also increase or decrease the instance size while auto scaling? For example, say I have some instances: when CPU load is low, I want small instances to be running, and when CPU load increases, the instance size should increase to a large VM (instance).

    7 votes
    2 comments
  6. Adding reports, like SLA/Uptime reports for Virtual Machines, Availability Sets, and Traffic Managers

    Clients like to see reports that show that SLAs are being met or the Uptime of a Virtual Machine, Availability Set and/or Traffic Manager. Could Azure provide reports that could be generated from the data they are already collecting and presenting on the graphs?

    Thanks,
    Scott Weigand

    450 votes
    23 comments
  7. Add custom charts that pull data from our own SQL database (dashboard like features)

    Since I go to the portal to view the metrics around usage of the site plus CPU data, HTTP response codes, etc. it would be nice to be able to add my own custom data to the portal for viewing alongside these existing charts. For me it would be nice to include custom SQL to build a chart that pulls certain metrics from the database that I have attached to my website. I could see using it for tracking customer registrations, order information, etc. Things that are specific to my site but that provide valuable insight into what is going…

    65 votes
    0 comments
  8. Add CPU and Memory usage metrics per instance

    We use New Relic to monitor our cloud services. Sometimes we see an instance going above 80% CPU, which could possibly be solved by rebooting the instance, but it's impossible to identify in the portal which of the instances is in trouble (the Azure portal uses _N to name the instances, New Relic uses an ID).

    82 votes
    5 comments
  9. Monitoring CPU percentage?

    Where does the CPU percentage come from, or how is it calculated? I set up an alert to notify me if CPU usage goes over 50%. If I go to my VM, Task Manager shows CPU use at 100%, but Azure monitoring shows 66%.

    So again, how is the CPU usage computed?

    Thanks
    Robert Orsino

    5 votes
    0 comments
  10. Show memory and network metrics as percentages

    Since each VM / Cloud Service has 3 major metrics (CPU, Memory, Network), diagnostics should collect these metrics by default (logging level minimal).

    Because there are a lot of instance types, absolute values of the Network and Memory counters are not informative and % should be used.
    It is more informative to know that an instance is using 90% of its memory than to have a value of 600 MB. The same goes for the network channel load. For example, my service is network intensive and right now it is really hard to understand when the network channel load reaches its limit and service…

    63 votes
    2 comments

    This is very good feedback. Today we have separate data sources for what your quotas are (e.g. network, memory) and for the metrics that we emit. Ideally, we could bring those two points together to give you a percentage of those metrics.

    Also, once these are exposed, they will automatically be available for autoscale; already today you can use any exposed metric for scaling.
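As a rough sketch of what bringing those two data sources together could look like, the snippet below divides an absolute metric by a per-instance quota. The function name and the quota figures are hypothetical placeholders, not actual Azure instance limits.

```python
def percent_of_quota(used: float, quota: float) -> float:
    """Express an absolute metric value as a percentage of the instance's quota."""
    if quota <= 0:
        raise ValueError("quota must be positive")
    return 100.0 * used / quota


# Hypothetical memory quotas in MB for two instance sizes (placeholders only).
memory_quota_mb = {"small-example": 768, "large-example": 14336}

# The same 600 MB reading means very different things on different sizes.
for size, quota in memory_quota_mb.items():
    print(f"{size}: 600 MB is {percent_of_quota(600, quota):.1f}% of memory")
```

The same division would apply to network throughput against a bandwidth quota, provided both values use the same unit.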

  11. Better Management of WAD Diagnostic Tables

    Today it is really easy to configure the collection of Perf Counters for your PaaS apps, and nearly impossible to manage the data that this produces.

    1. I'd like the Management Portal to let me set a "Truncate after N Days" setting on all the WAD tables, especially WADPerformanceCountersTable.

    2. I'd like a report or display to help me manage my Azure Tables. Show me "Table Name", "Storage used (MB)", "Num Rows / Num BLOBs". It doesn't have to be exact; a close approximation will do. This would help to understand billing, as it is easy to have hundreds of GB in Storage tables that…

    29 votes
    0 comments
  12. Auto-Scale explanation of Cool-Down period

    We need a better explanation of auto-scaling rules. Even after reading the MSDN documentation, it is unclear.

    Does a Cool-Down period for scale-down interfere with a scale-up operation? For instance, we have a polling period of 5 minutes for scale-up and a 120-minute Cool-Down for scale-down. If no further scale-up operation can take place during that 120-minute scale-down Cool-Down, this would be awful. Sadly, no information is available.

    Or what happens if a CPU metric rule says scale down while a memory metric scale-up rule is still firing?

    Lastly, we need information as to whether rule-sets (more than one) are…

    38 votes
    6 comments
  13. Support auto-scaling cloud services by service bus topic / subscription

    At the moment you can only scale by queue size but not topic/subscription size. Given subscriptions can have filters it would be ideal to be able to scale by subscription size and not just general topic size.

    113 votes
    3 comments

    This is something that we are looking at. In the meantime, there is a workaround: forward the items you want to scale by to a queue, and scale by that queue.

  14. Alerts based on Queue Size

    I would like to be able to set up an alert and monitor a Cloud Service based on queue size. For example, if a queue has more than 10,000 items for 15 minutes, send an alert.
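The requested condition can be sketched as a check over periodic queue-length samples. This is only an illustration of the rule "more than 10,000 items for 15 minutes"; the function name, the sample format, and the polling interval are assumptions, and nothing here calls an Azure API.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple

Sample = Tuple[datetime, int]  # (time of measurement, queue length)


def sustained_breach(
    samples: Sequence[Sample],
    threshold: int = 10_000,
    window: timedelta = timedelta(minutes=15),
) -> bool:
    """True if the most recent unbroken run of samples above the threshold
    spans at least the given window."""
    if not samples:
        return False
    ordered = sorted(samples)
    latest = ordered[-1][0]
    breach_start = None
    # Walk back from the newest sample until one is at or below the threshold.
    for ts, length in reversed(ordered):
        if length <= threshold:
            break
        breach_start = ts
    return breach_start is not None and latest - breach_start >= window


# Hypothetical samples taken every 5 minutes.
t0 = datetime(2024, 1, 1, 12, 0)
lengths = [9_000, 12_000, 12_500, 13_000, 12_200]
samples = [(t0 + timedelta(minutes=5 * i), n) for i, n in enumerate(lengths)]
if sustained_breach(samples):
    print("send alert")
```

A single dip to or below the threshold resets the run, so a brief spike does not trigger the alert.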

    505 votes
    under review  ·  18 comments
  15. Monitor VM Status

    Add a feature to monitor the Status of a VM with some conditions.
    Ex.: I want to receive an alert when the Status of VM "X" is not "Running".

    32 votes
    0 comments
  16. Integrate Windows Application Proxy network traffic logs with the OMS Security and Audit Threat Intelligence module.

    Currently the OMS Security and Audit Threat Intelligence functionality only supports logs from IIS and some firewall appliances. Since the Windows Application Proxy role is intended for a secure DMZ deployment, pulling the traffic logs from the WAP and sending them to OMS Security and Audit seems like a good fit.

    1 vote
    0 comments
  17. Pause the Streaming Log

    When you scroll up in the streaming log it shouldn't continue pushing you to the bottom. And when you scroll back to the bottom, it should continue to stream in the UI.

    6 votes
    0 comments
  18. Retention Policy for Diagnostics

    Add a retention policy to Azure Diagnostics much like Azure Storage has for logging and analytics. It is currently WAY too hard to clean up old diagnostics data.

    128 votes
    4 comments
  19. Assign Action Groups to multiple Resources.

    You should be able to assign Action Groups to multiple resources if you choose to do so.

    1 vote
    0 comments
  20. Is there a roadmap for support in more regions?

    What is the roadmap for support in more regions? Specifically East US 2?

    1 vote
    0 comments