HDInsight

Welcome! You can use this site to tell the Microsoft HDInsight team what features you would like to see.

Remember that this site is for feature suggestions and ideas…

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit our getting started page.

  1. Start/stop an HDInsight cluster

    Add the ability to start and stop a cluster. Currently the only option is to delete the cluster, and I do not want to be charged unnecessarily when I don't use the cluster for several days.

    1,161 votes
    46 comments  ·  Platform

    [Update] Thanks for your continued feedback on this capability! Rest assured that we are tracking this request closely, along with several other platform capabilities our customers have requested. In the meantime, you can use the cluster scaling capability to adjust the HDInsight cluster size to your varying compute needs. Azure Data Factory is another option you can explore for scheduling jobs with automatic creation and deletion of clusters: https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities/

    Adnan Ijaz
    Program Manager
    Microsoft Azure HDInsight

  2. Add a feature to "shut down" an HDInsight cluster instead of deleting it when not in use.

    With HDInsight clusters being promoted as something that one can disable or turn off when not in use (cost concerns), I would like to suggest a way to just "shut down" or "deallocate" a cluster when it is not in use, to avoid charges. This could work much the same way as it does for VMs. Users would expect to be billed for the SQL and/or storage parts while the cluster is disabled.

    111 votes
    5 comments  ·  Platform

    Thanks. This is a common ask from our customers and something we are seriously thinking about. In the meantime, you can use Azure Data Factory to “delete” the cluster, combined with a persistent metastore in Azure SQL and a persistent store such as Azure Data Lake Store or Azure Blob storage, which will make it seem as though the cluster is “shut down”. Thanks for your feedback. Rashim Gupta (HDInsight Engineering team)

  3. Supported JSON SerDe for Hive in HDInsight

    In our setup we're dealing with data with complex schemas, so we're using a custom-built JSON SerDe, downloaded from https://github.com/rcongiu/Hive-JSON-Serde, together with Hive. Each time HDInsight is updated to a newer version, we run into issues related to this SerDe. It would be nice if Microsoft could provide a SerDe that is tested and supported whenever a new HDInsight distribution is released.

    54 votes
    under review  ·  0 comments  ·  Workload
  4. Create a developer 'sandbox' option

    As an alternative to the emulator, create a low-cost 'sandbox' option that runs on a single server for developers, data scientists, etc. to use, similar to the VM downloads from Hortonworks and Cloudera.

    50 votes
    under review  ·  2 comments  ·  Platform
  5. Provide several industry-standard data mining algorithms designed to be processed in a MapReduce Hadoop cluster, complete with visualization

    Consider data mining in Analysis Services, along with its visualization. Provide these same algorithms (maybe more) to be processed in a MapReduce fashion against data in HDFS instead of on a data source view, whereby data selection and algorithm processing are distributed, collected, and re-distributed until a logical regression limit is met. Then assemble the results and provide great visualizations.

    34 votes
    1 comment  ·  Workload
    under review  ·  matt winkler responded

    Dave,

    Thanks for the feedback, we’re looking into enabling scenarios like this. I would be curious to learn the type of algorithms you’d like to see here.

    —matt

  6. Do not automatically start charging for HDInsight when a new cluster is created

    I find that I create a cluster and, while waiting for it to finish being set up, move on to other things, only to return a day or two later and find I've already been billed over a hundred dollars. This has happened twice now.

    9 votes
    1 comment  ·  Platform

    Thanks for the feedback, Badrul. I recommend using Azure Data Factory, which can bring up the cluster on demand and delete it when it is not in use. You can also consider using Azure Data Lake Analytics, which bills you only for the time your jobs are running. In the meantime, we will brainstorm how we can bring this feature to HDInsight.

  7. Create an Eclipse plugin to connect to HDInsight and deploy jobs directly

    Create an Eclipse plugin with an HDInsight perspective, so that MapReduce applications can be created in Java and the JAR deployed directly to the HDInsight server.

    8 votes
    2 comments  ·  Platform
    under review  ·  matt winkler responded

    Thanks for the suggestion. We’re currently evaluating a number of potential integration points within Eclipse. Would you prefer to see the Azure Eclipse tooling provide help here, or would you prefer to see the Hadoop Developer Tooling project (http://hdt.incubator.apache.org/) offer support for HDInsight?

  8. Better error descriptions when scaling cluster size

    When trying to scale the number of nodes from 12 to 24, I received a generic error message stating "Failed updating cluster <CLUSTERNAME>", with the detail saying only "Submitting the scale request". When trying to scale the cluster from PowerShell, I get no output whatsoever, not even an error.

    The underlying problem was that my subnet had run out of IP addresses, so the cluster could not scale any larger. The error output, while accurate, was not descriptive, and I had to open a help desk ticket to identify the issue.…
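
    A pre-flight check of the kind this request implies could compare the nodes being added against the addresses still free in the subnet. The following is a minimal sketch in Python, not HDInsight's actual logic: the function names, the example CIDR ranges, and the in-use counts are hypothetical, and it assumes one IP address per added node and that Azure reserves five addresses in every subnet.

```python
import ipaddress

# Assumption: Azure reserves five addresses in every subnet.
AZURE_RESERVED_PER_SUBNET = 5


def free_addresses(subnet_cidr: str, in_use: int) -> int:
    """Return how many addresses remain assignable in the subnet."""
    network = ipaddress.ip_network(subnet_cidr)
    remaining = network.num_addresses - AZURE_RESERVED_PER_SUBNET - in_use
    return max(remaining, 0)


def can_scale(subnet_cidr: str, in_use: int, nodes_to_add: int) -> bool:
    """Assumption: each added node consumes exactly one address."""
    return nodes_to_add <= free_addresses(subnet_cidr, in_use)


if __name__ == "__main__":
    # Hypothetical numbers: a /27 subnet with 16 addresses already in use,
    # scaling from 12 to 24 worker nodes (12 more addresses needed).
    if not can_scale("10.0.0.0/27", in_use=16, nodes_to_add=12):
        free = free_addresses("10.0.0.0/27", in_use=16)
        print(f"Cannot scale: need 12 addresses, subnet has {free} free")
```

    An error that names the subnet and the shortfall, as in the final print statement, is the kind of message the request asks for.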

    2 votes
    1 comment  ·  Platform
  9. Grrr

    The fact that you're assuming I want to talk about HDInsight is part of the problem. I created an Azure account to try DocumentDB, and on the management page it is nowhere to be found. This is not rocket science: if your ad says you sell some service, then I should be able to add that service to my account. If I can't for some reason, then it should still show up and explain why not. Making simple things hard is going to make me spend more time at Amazon. Grrr

    1 vote
    1 comment
  10. Bug: Cluster password is invalid

    While deploying a new HDInsight cluster (Hadoop), the Cluster Login Password field indicates that it accepts a password with "a one upper or lower case letter". I used A123456789* and it was accepted. While deploying the cluster, the following error occurred:

    Type
    Microsoft.HDInsight/clusters

    StatusMessage
    {"code":"BadRequest","message":"ClusterPasswordInvalid,Cluster password is invalid. Password must be at least 10 characters long and must contain at least one number, uppercase letter, lowercase letter and special character with no spaces and should not contain the username as part of it."}
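
    The rules quoted in the error message above could be checked in the form before the deployment is submitted. The following is a minimal sketch in Python, not the service's actual validation: the function name `password_problems` and the username `admin` are hypothetical, and it assumes that "special character" means ASCII punctuation and that the username check is case-insensitive.

```python
import string


def password_problems(password: str, username: str) -> list[str]:
    """List every rule from the error message that the password breaks."""
    problems = []
    if len(password) < 10:
        problems.append("shorter than 10 characters")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    # Assumption: "special character" means ASCII punctuation.
    if not any(c in string.punctuation for c in password):
        problems.append("no special character")
    if any(c.isspace() for c in password):
        problems.append("contains a space")
    # Assumption: the username comparison ignores case.
    if username and username.lower() in password.lower():
        problems.append("contains the username")
    return problems


if __name__ == "__main__":
    # The password from this report, with a hypothetical username.
    print(password_problems("A123456789*", "admin"))
```

    For the password in this report the sketch returns `["no lowercase letter"]`, which is the rule the form's hint did not mention.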

    1 vote
    3 comments