HDInsight

Welcome! You can use this site to tell the Microsoft HDInsight team what features you would like to see.

Remember that this site is for feature suggestions and ideas…

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit our getting started page.

How can we improve HDInsight?

Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.

  1. Start/Stop cluster HDInsight

    Add the ability to start and stop a cluster. Currently the only option is to delete the cluster, and I do not want to be charged unnecessarily when I don't use the cluster for several days.

    1,149 votes
    45 comments  ·  Platform

    [Update] Thanks for your continued feedback on this capability! Rest assured that we are tracking this request closely, along with several other platform capabilities our customers have requested. In the meantime, you can use the cluster scaling capability to adjust HDInsight cluster size according to your varying compute needs. Azure Data Factory is another option you can explore for scheduling jobs with automatic creation and deletion of clusters: https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities/

    Adnan Ijaz
    Program Manager
    Microsoft Azure HDInsight

  2. Need HDInsight attach/detach edge node capability

    Need to be able to attach/detach edge nodes from an HDInsight cluster. This is not currently supported.

    196 votes
    19 comments  ·  Platform
  3. Add Flink to HDInsight

    Apache Flink is essential for implementing a Kappa architecture.

    146 votes
    1 comment  ·  Workload
  4. Add Apache NiFi to HDInsight Ambari

    Add Apache NiFi to the HDInsight cluster and Ambari UI.

    https://nifi.apache.org/

    136 votes
    0 comments  ·  Workload
  5. HDInsight Security insight and integration with Active Directory documentation

    Document how security is implemented with AD integration in an Enterprise HDInsight multi-node cluster.

    127 votes
    7 comments  ·  Security
  6. Add a feature to "shut down" an HDInsight cluster instead of deleting it when not in use.

    With HDInsight clusters being promoted as something that one can disable or turn off when not in use (cost concerns), I would like to suggest a way to just "shut down" or "deallocate" a cluster when not in use to avoid charges. This could work much the same way as it does for VMs. Users would expect to be billed for the SQL and/or storage parts while the cluster is disabled.

    111 votes
    5 comments  ·  Platform

    Thanks. This is a common ask from our customers and something we are seriously considering. In the meantime, you can use Azure Data Factory to “delete” the cluster, and you can use a persistent metastore in Azure SQL and a persistent store such as Azure Data Lake Store or Azure Blob storage, which will make it seem as if the cluster is “shut down”. Thanks for your feedback. Rashim Gupta (HDInsight Engineering team)

  7. HDInsight AutoScale

    Please provide an autoscale option in HDInsight that scales the cluster up and down based on usage and running queries.

    100 votes
    5 comments  ·  Platform
  8. 87 votes
    3 comments  ·  Workload
  9. Provide support for reading Azure Table Storage data from Apache Spark

    Currently Azure Tables are not supported. Only Azure blobs support the HDFS interface required by Hadoop & Spark.

    75 votes
    1 comment  ·  Workload
  10. HDInsight on private VNet network

    An HDInsight deployment configures the cluster with public IPs and makes it accessible from the internet. Please add an option to set up the cluster so that it can only be accessed from a private IP in a VNet. The VNet can then have VPN or ExpressRoute connectivity to on-premises networks, and all access to the cluster should be limited to this.

    63 votes
    3 comments  ·  Security
  11. Add support for AppendBlob in HDInsight

    According to https://social.msdn.microsoft.com/Forums/sqlserver/en-US/3001af0c-7f0b-440a-ae65-08d563a5823f/azure-append-blob-storage-does-not-support-spark-textfile-api?forum=hdinsight

    HDInsight only supports block blobs.

    Append blobs are ideal for archiving data in time slices, but they can't be consumed by Spark on HDInsight, etc.

    59 votes
    3 comments  ·  Workload
  12. Support reading Azure Data Lake data from Apache Spark on HDInsight

    Currently many open source applications (e.g. Apache Hive) are supported (https://azure.microsoft.com/en-gb/documentation/articles/data-lake-store-compatible-oss-other-applications/). It would be great to have support for Apache Spark running in HDInsight clusters, too.

    55 votes
    4 comments  ·  Workload
  13. Supported JSON SerDe for Hive in HDInsight

    In our setup we're dealing with data that has complex schemas, so we use a custom-built JSON SerDe, downloaded from https://github.com/rcongiu/Hive-JSON-Serde, with Hive. Each time HDInsight is updated to a newer version, we run into issues related to this SerDe. It would be nice if Microsoft could provide a SerDe that is tested and supported whenever a new HDInsight distribution is released.

    54 votes
    under review  ·  0 comments  ·  Workload
  14. Define NSG Rules for Restricting Outbound Internet Access

    The documentation states clearly that if you add an HDInsight cluster to a VNet, then you cannot apply outbound NSG rules. Having unrestricted outbound internet access is a significant risk. Are there any other mitigating controls in place to detect data leakage?

    50 votes
    2 comments  ·  Security
  15. Create a developer 'sandbox' option

    As an alternative to the emulator, create a low-cost 'sandbox' option that runs on a single server for developers, data scientists, etc. to use, similar to the Hortonworks/Cloudera VM downloads.

    50 votes
    under review  ·  2 comments  ·  Platform
  16. Support Mobius out of the box in HDInsight Spark cluster

    Several Mobius[1] customers have asked about the support in HDInsight Spark. Currently the experience is not smooth[2]. It would be nice to make Mobius work out of the box in HDInsight Spark and possibly even make the end-to-end experience building and deploying Spark jobs in .NET richer.

    [1] Mobius: .NET API for Spark - https://github.com/Microsoft/Mobius

    [2] Using Mobius in HDInsight - https://github.com/Microsoft/Mobius/blob/master/notes/running-mobius-app.md#mobius-in-azure-hdinsight-spark-cluster

    46 votes
    0 comments  ·  Workload
  17. Apache Atlas

    Having Apache Atlas with HDInsight for data catalog and lineage would be a great feature. Are there any plans for this on the roadmap?

    43 votes
    1 comment  ·  Workload
  18. 40 votes
    0 comments
  19. Currently Custom Dns is not supported in HDInsight.

    "Currently Custom Dns is not supported in HDInsight."
    We tested the configuration (HDInsight cluster with Windows/Linux, Hadoop, and HDInsight 3.2 & 3.4) in the new portal and got this error.
    However, if we use the classic portal, create a classic virtual network with the custom DNS server registered, and then specify that virtual network when provisioning a Windows HDInsight cluster, provisioning appears to start.
    But we use Linux Hadoop, and we cannot provision a Linux Hadoop cluster with custom DNS in a virtual network, because that is not supported in the old classic portal.
    Is there any suggestion…

    39 votes
    0 comments  ·  Security
  20. Support Selecting a Certain Node When Scaling In

    The existing scale-out/scale-in feature in HDInsight has a serious drawback when scaling in: any pending or running jobs inevitably fail. It would be nice to be able to select specific nodes when scaling in, so that the cluster can be shrunk safely without losing active jobs.

    37 votes
    3 comments  ·  Platform