SQL Server - Big Data Clusters

Summary:


SQL Server Big Data Clusters is the multi-cloud, open data platform for analytics at any scale. Big Data Clusters (BDC) unites SQL Server with Apache Spark to deliver the best compute engines available for analytics in a single, easy-to-use deployment. With these engines, BDC is the ideal data platform for AI, ML, M/R, Streaming, BI, T-SQL, and Spark. Delivered as part of the SQL Server 2019 release, BDC is a cloud-native solution orchestrated by Kubernetes. Our mission is to accelerate, delight, and empower our users as they quench their thirst for data-driven insights.
 
More information about SQL Server Big Data Clusters is available in the documentation.

  1. Add Delta Lake OSS to Spark API

    From https://github.com/delta-io/delta

    Add functionality to the Spark engines to better handle upserts/merges and data versioning/CDC inside a BDC cluster. Audit history would be a plus.

    7 votes
    0 comments  ·  Apache Spark
  2. Update Python version running on storage pool / Spark worker nodes

    I would like to use some external libraries with PySpark that require Python version >= 3.6.1. The version of Python currently included in the Hadoop containers is 3.5.2. Please update the container images to include a more recent version of Python, or allow us to do so.

    1 vote
    0 comments  ·  Apache Spark
  3. Job Scheduling / Orchestration in BDC

    Are there any plans to introduce orchestration in BDC? I would like to be able to schedule Spark jobs, for example.

    1 vote
    0 comments  ·  Apache Spark
  4. Anaconda in BDC for Spark and ML Services

    Anaconda Enterprise lets you manage packages on a cluster and on your computer and keep them in sync.

    Can BDC include this, or some other way to manage packages for Spark?

    3 votes
    0 comments  ·  Apache Spark