HDInsight

Welcome! You can use this site to tell the Microsoft HDInsight team what features you would like to see.

Remember that this site is for feature suggestions and ideas…

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit our getting started page.

  1. Add Flink to HDInsight

    Apache Flink is essential for implementing a Kappa architecture.

    150 votes
    1 comment  ·  Workload
  2. Add Apache NiFi to HDInsight Ambari

    Add Apache NiFi to the HDInsight cluster and Ambari UI.

    https://nifi.apache.org/

    136 votes
    0 comments  ·  Workload
  3. 87 votes
    3 comments  ·  Workload
  4. Provide support for reading Azure Table Storage data from Apache Spark

    Currently Azure Tables are not supported. Only Azure blobs support the HDFS interface required by Hadoop & Spark.

    75 votes
    1 comment  ·  Workload
  5. Apache Atlas

    Having Apache Atlas in HDInsight for data catalog and lineage would be a great feature. Are there any plans for this on the roadmap?

    65 votes
    2 comments  ·  Workload
  6. Add support for append blobs in HDInsight

    According to https://social.msdn.microsoft.com/Forums/sqlserver/en-US/3001af0c-7f0b-440a-ae65-08d563a5823f/azure-append-blob-storage-does-not-support-spark-textfile-api?forum=hdinsight, HDInsight only supports block blobs.

    Append blobs are ideal for archiving data in time slices, but they can't be consumed by Spark and other engines on HDInsight.

    59 votes
    3 comments  ·  Workload
  7. Support reading Azure Data Lake data from Apache Spark on HDInsight

    Currently many open source applications (e.g. Apache Hive) are supported (https://azure.microsoft.com/en-gb/documentation/articles/data-lake-store-compatible-oss-other-applications/). It would be great to have support for Apache Spark running in HDInsight clusters, too.

    57 votes
    4 comments  ·  Workload
  8. Supported JSON SerDe for Hive in HDInsight

    In our setup we deal with data that has complex schemas, so we use a custom-built JSON SerDe, downloaded from https://github.com/rcongiu/Hive-JSON-Serde, with Hive. Each time HDInsight is updated to a newer version, we run into issues related to this SerDe. It would be nice if Microsoft could provide a SerDe that is tested and supported whenever a new HDInsight distribution is released.

    54 votes
    under review  ·  0 comments  ·  Workload
  9. Support Mobius out of the box in HDInsight Spark cluster

    Several Mobius[1] customers have asked about support in HDInsight Spark. Currently the experience is not smooth[2]. It would be nice to make Mobius work out of the box in HDInsight Spark and possibly even make the end-to-end experience of building and deploying Spark jobs in .NET richer.

    [1] Mobius: .NET API for Spark - https://github.com/Microsoft/Mobius

    [2] Using Mobius in HDInsight - https://github.com/Microsoft/Mobius/blob/master/notes/running-mobius-app.md#mobius-in-azure-hdinsight-spark-cluster

    46 votes
    0 comments  ·  Workload
  10. Provide several industry-standard data mining algorithms designed to be processed in a MapReduce Hadoop cluster, complete with visualization

    Look at data mining in Analysis Services along with its visualization. Provide these same algorithms (maybe more), but instead of processing them on a data source view, process them in a MapReduce fashion against data in HDFS, whereby data selection and algorithm processing are distributed, collected, and redistributed until a logical regression limit is met; then assemble the results and provide great visualizations.

    34 votes
    1 comment  ·  Workload
    under review  ·  matt winkler responded

    Dave,

    Thanks for the feedback, we’re looking into enabling scenarios like this. I would be curious to learn the type of algorithms you’d like to see here.

    —matt

  11. 29 votes
    1 comment  ·  Workload
  12. Provide the %pyspark interpreter for Zeppelin

    Other distributions of the Zeppelin notebook include the %pyspark interpreter. The one on HDInsight has only %spark, %sql, %dep, and %md. It would be really nice to have %pyspark.

    24 votes
    1 comment  ·  Workload
  13. Support Spark SQL job submission using .NET Client Library

    Currently it's not possible to submit Spark SQL jobs to a Spark cluster using Livy (https://issues.cloudera.org/browse/LIVY-19). As many teams want to convert their Hive code to Spark SQL and benefit from the interactivity of Spark, it would be very nice if Microsoft created a .NET library that allows submission of Spark SQL jobs to the HDInsight cluster (or at least an implementation of the LIVY-19 ticket would be nice).

    24 votes
    2 comments  ·  Workload
  14. Install Microsoft R Open on non-premium Spark clusters

    The non-premium Spark clusters include support for SparkR, but the nodes don't have R installed, which SparkR requires.

    Please update the HDInsight clusters to include the R binaries (with CRAN R or MRO).

    24 votes
    3 comments  ·  Workload
  15. Add support for the Seaborn data visualization Python library

    The PySpark kernel on HDInsight does not support Spark code that uses the Seaborn library for visualization.

    12 votes
    0 comments  ·  Workload
  16. Use Apache Spark for reading data from a U-SQL Catalog.

    Implement a Spark package for reading data in a U-SQL Catalog.
    It would be similar to the DataStax Cassandra Spark driver: a package that also knows the internals of the U-SQL Catalog and hence can read structured data efficiently.

    12 votes
    0 comments  ·  Workload
  17. Enable dynamic allocation for Spark executors by default

    I would like the default executor allocation in Spark to be dynamic instead of static as it is now.

    10 votes
    1 comment  ·  Workload
  18. Support the BI Connector in Spark 2.1

    Can you please support the BI Connector for Spark 2.1 on HDI 3.6?

    Thanks!

    9 votes
    0 comments  ·  Workload
  19. Add devtools package to HDInsight R Server edge node by default

    devtools (https://github.com/hadley/devtools) is a very popular package for package management in R. It is also quite large, and has many dependencies, so it can take a long time to install. It would be very convenient if this was installed by default.

    9 votes
    0 comments  ·  Workload
  20. Add persistent storage (ADL/Blob) as backend storage for HDInsight Kafka clusters

    Like other HDInsight cluster types (e.g. Hadoop, HBase), Kafka clusters should also have the option to use an Azure storage account or Data Lake as backend storage.
    This would help users restore the Kafka logs if the cluster crashes.

    9 votes
    1 comment  ·  Workload