SQL Server - Big Data Clusters
Summary:
-
Parquet - add support for zstandard compression when reading from External FILES
Zstandard (zstd) is a compression codec that seems to give great results, "equals-superior to Gzip level compression with Lz4 level CPU".
The official Parquet format supports Zstandard compression (*), as do common big data technologies.
But BDC Parquet doesn't support it; it only supports None, Snappy, and Gzip compression. This user voice is to add Zstandard.
List of currently supported compression formats in BDC Parquet files: https://docs.microsoft.com/fr-fr/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-ver15&tabs=parquet
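To illustrate the gap, here is a minimal sketch that builds the CREATE EXTERNAL FILE FORMAT statement for a Parquet file and rejects any codec outside the three listed above. The helper name `parquet_file_format_ddl` and the format name `parquet_snappy` are hypothetical; the codec class names are assumed to match those given on the documentation page linked above.

```python
# Codecs this post lists as supported for Parquet external file formats.
# The Hadoop class names are assumed from the linked documentation page.
SUPPORTED_PARQUET_CODECS = {
    "none": None,
    "snappy": "org.apache.hadoop.io.compress.SnappyCodec",
    "gzip": "org.apache.hadoop.io.compress.GzipCodec",
}


def parquet_file_format_ddl(name: str, codec: str = "none") -> str:
    """Build the T-SQL for a Parquet external file format (hypothetical helper)."""
    key = codec.lower()
    if key not in SUPPORTED_PARQUET_CODECS:
        # This is where a zstd request ends up today.
        raise ValueError(f"unsupported Parquet codec: {codec}")
    options = ["FORMAT_TYPE = PARQUET"]
    codec_class = SUPPORTED_PARQUET_CODECS[key]
    if codec_class is not None:
        options.append(f"DATA_COMPRESSION = '{codec_class}'")
    return f"CREATE EXTERNAL FILE FORMAT [{name}] WITH ({', '.join(options)});"


print(parquet_file_format_ddl("parquet_snappy", "snappy"))
```

Calling the helper with `"zstd"` raises `ValueError`, which mirrors the limitation this request asks to lift.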
Reference links for ZStandard:
- Official link for zstd: https://facebook.github.io/zstd/
- (*) Parquet support for Zstandard: https://github.com/apache/parquet-format/blob/54e6133e887a6ea90501ddd72fff5312b7038a7c/src/main/thrift/parquet.thrift#L461
- Hadoop support for Zstandard: https://issues.apache.org/jira/browse/HADOOP-13578
- Kafka support for Zstandard: https://issues.apache.org/jira/browse/KAFKA-4514
- Cloudflare benchmark of…
2 votes -
SQL 2019 out of box driver 2.3.8 does not work with CosmosDB (MongoDB API)
I have a BDC SQL 2019 Kubernetes cluster deployed.
When I connect via the master node and follow the instructions to create an external table against CosmosDB (MongoDB API), it fails with this error:
ODBC error: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at xxxxx.documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15) Additional error <2>: ErrorMsg: [Microsoft][MongoDBODBC] (110) Error from MongoDB Client: Server at darminm.documents.azure.com:10255 reports wire version 2, but this version of libmongoc requires at least 3 (MongoDB 3.0) (Error Code: 15), SqlState: HY000, NativeError: 110.
Total execution time: 00:00:02.187
Same…
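For anyone triaging similar failures, a small sketch that pulls the two wire versions out of an error message of this shape may help confirm the mismatch. The function name `parse_wire_version_mismatch` is hypothetical, and the pattern assumes the message wording shown above.

```python
import re

# Matches the libmongoc wording seen in the error above (assumed stable).
WIRE_VERSION_RE = re.compile(
    r"reports wire version (\d+), "
    r"but this version of libmongoc requires at least (\d+)"
)


def parse_wire_version_mismatch(message: str):
    """Return (server_wire_version, required_wire_version), or None if absent."""
    match = WIRE_VERSION_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "Server at xxxxx.documents.azure.com:10255 reports wire version 2, "
    "but this version of libmongoc requires at least 3 (MongoDB 3.0)"
)
print(parse_wire_version_mismatch(sample))  # prints (2, 3)
```

Here the server reports wire version 2 while the driver's libmongoc requires at least 3, so the connection is rejected before any query runs.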
1 vote -
Need samples or documentation for how to set up HDFS tiering with Cloudera
I know it is possible to configure HDFS tiering with S3, ADLS, and Cloudera, but it has been nearly impossible to find clear documentation on how to set up HDFS tiering for Cloudera. I'd like to see some materials added to the GitHub SQL Samples site for this. https://github.com/microsoft/sql-server-samples/tree/master/samples/features/sql-big-data-cluster
3 votes -
Add PolyBase feature (CREATE EXTERNAL TABLE AS SELECT) to SQL Server 2019 and Big Data Clusters
Please add the PolyBase feature (CREATE EXTERNAL TABLE AS SELECT), which is currently available only on Azure Synapse Analytics (SQLDW) and PDW.
We require this on SQL 2019 and Big Data Clusters to create a data hub/catalog from curated views that are federated across many SQL Server or Big Data Clusters platforms. This would extend the data virtualization capability to views, not just tables, with SQL Server as the source.
11 votes