SQL Server - Big Data Clusters

Summary:


SQL Server Big Data Clusters is the multi-cloud, open data platform for analytics at any scale. Big Data Clusters (BDC) unites SQL Server with Apache Spark to deliver the best compute engines available for analytics in a single, easy-to-use deployment. With these engines, BDC is the ideal data platform for AI, ML, M/R, Streaming, BI, T-SQL, and Spark. Delivered as part of the SQL Server 2019 release, BDC is a cloud-native solution orchestrated by Kubernetes. Our mission is to accelerate, delight, and empower our users as they quench their thirst for data-driven insights.
 
More information about SQL Server Big Data Clusters is available in the documentation.

  1. Add PolyBase feature (CREATE EXTERNAL TABLE AS SELECT) to SQL Server 2019 and Big Data Clusters

    Please add the PolyBase feature (CREATE EXTERNAL TABLE AS SELECT) which is only available on Azure Synapse Analytics (SQLDW) and PDW right now.

    We require this on SQL 2019 and Big Data Clusters to create a data hub/catalog from curated views that are federated across many SQL Server or Big Data Clusters platforms. This would extend the data virtualization capability to views, not just tables, when SQL Server is the source.
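To make the request concrete, here is a minimal Python sketch that renders a statement in the general CREATE EXTERNAL TABLE AS SELECT shape used on Azure Synapse Analytics. The function name `render_cetas` and every object name passed to it (`hub.CuratedSales`, `HubStorage`, `ParquetFormat`, `dbo.vCuratedSales`) are hypothetical, and the generated statement is not expected to run on SQL Server 2019, which is the gap this idea describes.

```python
def render_cetas(table, location, data_source, file_format, select_sql):
    """Render a CETAS statement as text (illustration only, nothing is executed)."""
    # Double any single quotes so the location stays a valid T-SQL string literal.
    safe_location = location.replace("'", "''")
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        f"WITH (\n"
        f"    LOCATION = '{safe_location}',\n"
        f"    DATA_SOURCE = {data_source},\n"
        f"    FILE_FORMAT = {file_format}\n"
        f")\n"
        f"AS\n"
        f"{select_sql};"
    )


# Hypothetical names: a curated view exposed through an external table.
statement = render_cetas(
    table="hub.CuratedSales",
    location="/curated/sales/",
    data_source="HubStorage",
    file_format="ParquetFormat",
    select_sql="SELECT * FROM dbo.vCuratedSales",
)
print(statement)
```

The point of the sketch is the shape of the statement: the SELECT can read from a view, so a curated view could be materialized as an external table in one step.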

    11 votes

    1 comment  ·  Data Virtualization
  2. Parquet: add support for Zstandard compression when reading from external files

    Zstandard (zstd) is a compression codec that seems to give great results, "equals-superior to Gzip level compression with Lz4 level CPU".

    The official Parquet format supports Zstandard compression (*), as do common Big Data technologies.
    BDC Parquet does not support it, however; it only supports None, Snappy, and Gzip compression. This request is to add Zstandard.

    List of currently supported compression formats in BDC Parquet files: https://docs.microsoft.com/fr-fr/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-ver15&tabs=parquet

    Reference links for ZStandard:
    - Official link for zstd: https://facebook.github.io/zstd/
    - (*) Parquet support for Zstandard: https://github.com/apache/parquet-format/blob/54e6133e887a6ea90501ddd72fff5312b7038a7c/src/main/thrift/parquet.thrift#L461
    - Hadoop support for Zstandard: https://issues.apache.org/jira/browse/HADOOP-13578
    - Kafka support for Zstandard: https://issues.apache.org/jira/browse/KAFKA-4514
    - Cloudflare benchmark of…

    2 votes

    0 comments  ·  Data Virtualization
  3. Need samples or documentation for how to set up HDFS tiering with Cloudera

    I know it is possible to configure HDFS tiering with S3, ADLS, and Cloudera, but it has been nearly impossible to find clear documentation on how to set up HDFS tiering for Cloudera. I'd like to see some materials added to the GitHub SQL samples site for this: https://github.com/microsoft/sql-server-samples/tree/master/samples/features/sql-big-data-cluster

    3 votes

    0 comments  ·  Data Virtualization
  4. SQL 2019 out of box driver 2.3.8 does not work with CosmosDB (MongoDB API)

    I have a BDC SQL 2019 Kubernetes cluster deployed.
    When I connect via the master node and follow the instructions to create an external table against CosmosDB (MongoDB API), it fails with this error:

    ODBC error: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at xxxxx.documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15) Additional error <2>: ErrorMsg: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at darminm.documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15), SqlState: HY000, NativeError: 110 .
    Total execution time: 00:00:02.187

    Same…
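For anyone triaging the same failure, the error above already states both sides of the mismatch: the wire version the server reports and the minimum the bundled libmongoc accepts. A minimal Python sketch for pulling those numbers out of the message follows; the pattern is based only on the text quoted in this post, and the helper name `parse_wire_version_error` is hypothetical.

```python
import re

# Pattern derived from the libmongoc message quoted in this post (an assumption:
# other driver versions may word the message differently).
WIRE_ERROR = re.compile(
    r"Server at (?P<host>\S+?):(?P<port>\d+) reports wire version (?P<reported>\d+), "
    r"but this version of libmongoc requires at least (?P<required>\d+)"
)


def parse_wire_version_error(message):
    """Return host, port, reported and required wire versions, or None if no match."""
    match = WIRE_ERROR.search(message)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "port": int(match.group("port")),
        "reported": int(match.group("reported")),
        "required": int(match.group("required")),
    }


sample = (
    "[Microsoft][MongoDBODBC] (110) Error from MongoDB Client: "
    "Server at xxxxx.documents.azure.com:10255 reports wire version 2, "
    "but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15)"
)
info = parse_wire_version_error(sample)
print(info)
```

If `reported` is lower than `required`, the endpoint is advertising an older MongoDB protocol level than the driver accepts, which matches the failure described here.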

    1 vote

    0 comments  ·  Data Virtualization