I have a large number of files that are read with Hive using a partitioning scheme. The PARTITIONED BY functionality, which is so commonly used in Hive, is missing from PolyBase.
93 votes
Thank you for your request. I would like to understand if this request concerns SQL DW or SQL Server implementation of PolyBase?
Today we can query data stored in Parquet files on ADLS. It would be fantastic to extend this to support the new "Delta Lake" file format recently open-sourced by the Databricks team (see https://delta.io).
This would allow us to take advantage of the ACID guarantees that the Delta format brings to the data lake.
42 votes
Querying SQL DW from SQL DB via external tables currently does not work because of an implicit SET LANGUAGE statement issued by the elastic query component. It sets the language to N'us_english' (not supported) rather than us_english (supported, I guess). The same applies to SET NUMERIC_ROUNDABORT, which is set to OFF (not supported). While you can explicitly set NUMERIC_ROUNDABORT to ON at run time, you cannot run SET LANGUAGE us_english on SQL DB. There may be other issues, but this scenario is important.
36 votes
We currently have PolyBase to Azure Blob Storage.
We need PolyBase to the following data stores:
HDInsight with push down compute?
SQL Azure with push down compute?
DocumentDB with push down compute?
Azure Data Lake Store with push down to Azure Data Lake Analytics?
26 votes
Thanks for your suggestion. We are looking into this scenario for a future release. 10686397
As Azure SQL DW now also supports ADLS, it would be great if we could also leverage the distributed processing capabilities of ADLS to push computation down to ADLS, similar to how it already works for HDFS/Hadoop.
22 votes
Please add support for Azure Table Storage as a PolyBase data source.
To be honest, I can't believe it's not already available. :O
21 votes
Thanks for your suggestion. We are looking into this scenario for a future release. 10697404
Thanks for the request! This item is currently on our backlog. We will update this item when the status changes.
If a data source moves or changes, all external tables must be dropped, the data source dropped and recreated, and then the external tables recreated.
This can come up during a data warehouse migration or during disaster recovery.
Support for ALTER EXTERNAL DATA SOURCE would make this a much simpler task.
17 votes
Support for Azure Storage Append Blobs with PolyBase external tables
14 votes
Thank you for the request. This item is currently on our backlog and under review. We will update this thread when the state changes.
Currently SAS is the only option for PolyBase to access blob storage or the data warehouse.
13 votes
Thanks for the request! Currently PolyBase only supports Storage Account Keys when connecting to Azure Storage Blobs. SAS token support is on our backlog.
Today, Storage does not support AD as an authentication method.
We will update when we move this feature through the backlog.
Hive metastore integration for PolyBase / Azure SQL DW
I want to be able to seamlessly access, import, and join on tables that are already captured in my common Hive metastore. This would significantly help integrate the DW into our big data infrastructure and eliminate the huge duplicated maintenance effort of keeping external table definitions in sync.
13 votes
Thanks for your suggestion. We are looking into this scenario for a future release. 6891253
PolyBase: Allow the encoding to be specified in the file format so that PolyBase takes care of the encoding.
8 votes
PolyBase in SQL DW now supports UTF-16 and UTF-8 encoding. Are there other encodings that you require?
Enable exporting data to a single file with custom export options: file format, max size, custom null, custom escape character, encryption, compression, etc.
7 votes
Predicate pushdown to Hadoop sources in SQL Server 2017 and 2019 CTP 2.2 works for date, time, and numeric data types (as the documentation at https://docs.microsoft.com/en-us/sql/relational-databases/polybase/polybase-pushdown-computation?view=sql-server-2017 correctly notes). It would be great if we could also push down string predicates from VARCHAR or NVARCHAR data types.
Supporting LIKE filters would be nice, but even exact matches, with support limited to the operators already covered for numeric data types, would help.
7 votes
Right now, after an external table has been created, we can't add data to it. Please update the product so that this is possible.
7 votes
Thanks for the request! This item is on our backlog. We will reply to this thread when the status changes.
For building an on-prem DWH, it would be great if PolyBase could add S3 as an external data source. External tables could be used as the staging area, with S3 object storage (e.g. open source MinIO) as the Persistent Staging Area (PSA). Gzipped CSV or Parquet files could lie on S3 and be queried on demand as long-term storage.
6 votes
PolyBase (at least on SQL Server 2016) is limited to a maximum row size of 32k. Please increase this value substantially!
6 votes
The following CETAS statement fails. Apparently there's an issue with old dates and ORC files? Please fix.
CREATE EXTERNAL FILE FORMAT ORC_Snappy
WITH (
FORMAT_TYPE = ORC
, DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
);
CREATE EXTERNAL TABLE dbo.testexternaltable WITH (DATA_SOURCE = myds, LOCATION = '/testexternaltable', FILE_FORMAT = ORC_Snappy) AS
SELECT CAST('1910-08-13' AS date) AS dt;
6 votes
The Apache Knox Gateway (“Knox”) provides perimeter security so that the enterprise can confidently extend Hadoop access to more of those new users while also maintaining compliance with enterprise security policies.
More and more enterprises are using it as the only way to access cluster data.
Security is more than ever a concern, so it would be great if PolyBase could connect to a Hadoop cluster through the Knox Gateway.
6 votes
PolyBase in SQL DW does not support connecting to Hadoop clusters today. When we enable that functionality in the future, we will take this request into consideration.
Azure Synapse Analytics currently does not support creating external data sources to RDBMS databases (such as Azure SQL, SQL Server, Oracle, etc.). However, this functionality is available in MS SQL Server 2019.
5 votes