Parquet - add support for Zstandard compression when reading from external files
Zstandard (zstd) is a compression codec that reportedly gives very good results: "equals-superior to Gzip level compression with Lz4 level CPU", i.e. a compression ratio equal to or better than Gzip at roughly LZ4-level CPU cost.
The official Parquet format supports Zstandard compression (*), as do common big data technologies such as Hadoop and Kafka.
However, BDC Parquet doesn't support it; it only supports None, Snappy, and Gzip compression. This user voice item asks to add Zstandard.
List of currently supported compression formats in BDC Parquet files: https://docs.microsoft.com/fr-fr/sql/t-sql/statements/create-external-file-format-transact-sql?view=sql-server-ver15&tabs=parquet
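As a rough illustration of the request, here is a minimal T-SQL sketch. The first two statements use the Snappy and Gzip codec values documented on the page linked above. The commented-out third statement is hypothetical: it assumes a Zstandard option would follow the same naming pattern and reuse the Hadoop codec class name, and it is not accepted today. The format names (parquet_snappy, parquet_gzip, parquet_zstd) are made up for the example.

```sql
-- Supported today: Parquet with Snappy compression.
CREATE EXTERNAL FILE FORMAT parquet_snappy
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);

-- Supported today: Parquet with Gzip compression.
CREATE EXTERNAL FILE FORMAT parquet_gzip
WITH (
    FORMAT_TYPE = PARQUET,
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.GzipCodec'
);

-- HYPOTHETICAL, not supported at the time of writing: what this request asks for.
-- The codec string below is an assumption based on the Hadoop class name.
-- CREATE EXTERNAL FILE FORMAT parquet_zstd
-- WITH (
--     FORMAT_TYPE = PARQUET,
--     DATA_COMPRESSION = 'org.apache.hadoop.io.compress.ZStandardCodec'
-- );
```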
Reference links for Zstandard:
- Official link for zstd: https://facebook.github.io/zstd/
- (*) Parquet support for Zstandard: https://github.com/apache/parquet-format/blob/54e6133e887a6ea90501ddd72fff5312b7038a7c/src/main/thrift/parquet.thrift#L461
- Hadoop support for Zstandard: https://issues.apache.org/jira/browse/HADOOP-13578
- Kafka support for Zstandard: https://issues.apache.org/jira/browse/KAFKA-4514
- Cloudflare benchmark of (Kafka with zstd): https://blog.cloudflare.com/squeezing-the-firehose/