HDInsight

Welcome! You can use this site to tell the Microsoft HDInsight team what features you would like to see.

Remember that this site is for feature suggestions and ideas…

If you have technical questions, please visit our forums.
If you are looking for tutorials and documentation, please visit our getting started page.

How can we improve HDInsight?


Enter your idea and we'll search to see if someone has already suggested it.

If a similar idea already exists, you can support and comment on it.

If it doesn't exist, you can post your idea so others can support it.

  1. Bug: Cluster password is invalid

    While deploying a new HDInsight cluster (Hadoop), the Cluster Login Password field indicates that it accepts a password with "one upper or lower case letter". I used A123456789* and it was accepted. While deploying the cluster, the following error occurred:

    Type
    Microsoft.HDInsight/clusters

    StatusMessage
    {"code":"BadRequest","message":"ClusterPasswordInvalid,Cluster password is invalid. Password must be at least 10 characters long and must contain at least one number, uppercase letter, lowercase letter and special character with no spaces and should not contain the username as part of it."}
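The rules quoted in the error message could be checked before deployment. The following is a minimal sketch, not the service's actual validation: the function name `password_problems` is hypothetical, treating `string.punctuation` as the set of "special characters" is an assumption, and so is the case-insensitive username comparison.

```python
import string


def password_problems(password: str, username: str) -> list:
    """Return the rules from the error message that the password breaks.

    Assumptions (not confirmed by the source): "special character" means
    any character in string.punctuation, and the username check ignores case.
    """
    problems = []
    if len(password) < 10:
        problems.append("shorter than 10 characters")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c in string.punctuation for c in password):
        problems.append("no special character")
    if any(c.isspace() for c in password):
        problems.append("contains a space")
    if username and username.lower() in password.lower():
        problems.append("contains the username")
    return problems


# The password from the report has no lowercase letter, which matches
# the rejection at deployment time. "admin" is a placeholder username.
print(password_problems("A123456789*", "admin"))
```

Under these assumptions, the password from the report fails only the lowercase-letter rule, which the portal field did not enforce.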

    0 votes
    0 comments
    • Grrr

      The fact that you're assuming I want to talk about HDInsight is part of the problem. I created an Azure account to try DocumentDB, and on the management page it is nowhere to be found. This is not rocket science: if your ad says you sell some service, then I should be able to add that service to my account. If I can't for some reason, then it should still show up and explain why not. Making simple things hard is going to make me spend more time at Amazon. Grrr

      1 vote
      1 comment
      • Better error descriptions when scaling cluster size

        When trying to scale the number of nodes from 12 to 24, I received a generic error message stating "Failed updating cluster <CLUSTERNAME>", with the detail just saying "Submitting the scale request". When trying to scale the cluster from PowerShell, I get no output whatsoever, not even an error.

        The problem I was having was that my subnet was out of IP addresses, so the cluster could not scale any larger. The error output, while accurate, was not descriptive, and I needed to open a help desk ticket to identify the issue.…
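A rough pre-check for this situation can be sketched with Python's standard `ipaddress` module. This is only an illustration: the function names, the example CIDR ranges, and the in-use count are hypothetical, and the sketch assumes each added node needs one address from the subnet. It also relies on Azure reserving five addresses in every subnet.

```python
import ipaddress

# Azure reserves five addresses in each subnet.
AZURE_RESERVED_PER_SUBNET = 5


def free_addresses(subnet_cidr: str, addresses_in_use: int) -> int:
    """Addresses still available in the subnet after reservations and usage."""
    total = ipaddress.ip_network(subnet_cidr).num_addresses
    return total - AZURE_RESERVED_PER_SUBNET - addresses_in_use


def can_scale(subnet_cidr: str, addresses_in_use: int,
              current_nodes: int, target_nodes: int) -> bool:
    """True if the subnet has room for the extra nodes.

    Assumption: every added node consumes exactly one subnet address.
    """
    needed = target_nodes - current_nodes
    return needed <= free_addresses(subnet_cidr, addresses_in_use)


# Hypothetical example: a /27 subnet with 20 addresses already in use
# cannot absorb 12 more nodes, while a /24 can.
print(can_scale("10.0.0.0/27", 20, 12, 24))
print(can_scale("10.0.0.0/24", 20, 12, 24))
```

A check like this only estimates capacity; the real count of addresses in use has to come from the virtual network itself.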

        2 votes
        1 comment
        • Do not automatically start charging for HDInsight when a new cluster is created

          I find that I create a cluster, and as I'm waiting for it to finish being set up I move on to other things, only to return a day or two later and find I've already been billed over a hundred dollars. This has happened twice now.

          9 votes
          1 comment

            Thanks for the feedback, Badrul. I recommend using Azure Data Factory, which can bring up the cluster and delete it when not in use. You can also consider using Azure Data Lake Analytics, which only bills you for the time your jobs are running. In the meantime, we will brainstorm how we can bring this feature to HDInsight.

          • Add a feature to "shut down" an HDInsight cluster instead of deleting it when not in use.

            With HDInsight clusters being promoted as something that one can disable or turn off when not in use (cost concerns), I would like to suggest a way to just "shut down" or "deallocate" a cluster when not in use to avoid charges. This could work much the same as it does for VMs. Users would expect to be billed for the SQL and/or storage parts while the cluster is disabled.

            76 votes
            2 comments

              Thanks. This is a common ask from our customers and something we are seriously thinking about. In the meantime you can use Azure Data Factory to “delete” the cluster, and you can use a persistent metastore in Azure SQL and a persistent store like Azure Data Lake Store or Azure Blob, which will make it seem like the cluster is “shut down”. Thanks for your feedback. Rashim Gupta (HDInsight Engineering team)

            • Create a developer 'sandbox' option

              As an alternative to the emulator, create a low-cost 'sandbox' option that runs on a single server for developers, data scientists, etc. to use, similar to the Hortonworks/Cloudera VM downloads.

              46 votes
              under review  ·  2 comments
              • Supported JSON SerDe for Hive in HDInsight

                In our setup we're dealing with data with complex schemas, so we're using a custom-built JSON SerDe with Hive, downloaded from https://github.com/rcongiu/Hive-JSON-Serde. Each time HDInsight is updated to a newer version, we run into issues related to this SerDe. It would be nice if MS could provide a SerDe that is tested and supported whenever a new HDInsight distribution is released.

                51 votes
                under review  ·  0 comments
                • Start/Stop HDInsight cluster

                  Add the ability to start and stop a cluster. Currently the only option is to delete the cluster, and I do not want to be charged unnecessarily if I don't use the cluster for several days.

                  856 votes
                  27 comments

                    [Update] Thanks for your continued feedback on this capability! Rest assured that we are tracking this request closely, along with several other platform capabilities our customers have requested. In the meantime, you can use the cluster scaling capability to adjust HDInsight cluster size according to your varying compute needs. Azure Data Factory is another option you can explore for scheduling jobs with automatic creation and deletion of clusters: https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-transformation-activities/

                    Adnan Ijaz
                    Program Manager
                    Microsoft Azure HDInsight

                  • Provide several industry-standard data mining algorithms designed to be processed in a MapReduce Hadoop cluster, complete with visualization

                    Look at data mining in Analysis Services along with its visualization. Provide these same algorithms (maybe more), processed not on a data source view but in a MapReduce fashion against data in HDFS, whereby data selection and algorithm processing are distributed, collected, and redistributed until a logical regression limit is met. Then assemble the results and provide great visualizations.

                    34 votes
                      0 comments
                      under review  ·  matt winkler responded

                      Dave,

                      Thanks for the feedback. We’re looking into enabling scenarios like this. I would be curious to learn the types of algorithms you’d like to see here.

                      —matt

                    • Create an Eclipse plugin to connect to HDInsight and deploy jobs directly

                      Create an Eclipse plugin with an HDInsight perspective for creating MapReduce applications in Java and deploying the JAR directly to the HDInsight server.

                      8 votes
                        2 comments
                        under review  ·  matt winkler responded

                        Thanks for the suggestion. We’re currently evaluating a number of potential integration points within Eclipse. Would you prefer to see the Azure Eclipse tooling provide help here, or would you prefer to see the Hadoop Developer Tooling project (http://hdt.incubator.apache.org/) offer support for HDInsight?
