How can we improve Microsoft Azure SQL Data Warehouse ?

Polybase: allow field/row terminators within string fields

If a line end is present within a text field (qualified) importing data using Polybase will fail.

Loading this line will work fine:
1;2.1;"zzzz"

Loading this will fail:
1;2.1;"zz
zz"

Code:

CREATE EXTERNAL FILE FORMAT text_file_format_raw
WITH
(
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS
(
FIELD_TERMINATOR =';',
STRING_DELIMITER = '0x22',
USE_TYPE_DEFAULT = TRUE
)
);

CREATE EXTERNAL TABLE dbo.test_external
(
col1 INT,
col2 DECIMAL(2,1),
col3 NVARCHAR(5)
)
WITH
(
LOCATION = 'test_failing.txt',
DATA_SOURCE = azure_storage,
FILE_FORMAT = text_file_format_raw
);

CREATE TABLE dbo.test
WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = ROUND_ROBIN)
AS
SELECT * FROM dbo.test_external;

69 votes
Sign in
Check!
(thinking…)
Reset
or sign in with
  • facebook
  • google
    Password icon
    I agree to the terms of service
    Signed in as (Sign out)

    We’ll send you updates on this idea

    AdamAdam shared this idea  ·   ·  Flag idea as inappropriate…  ·  Admin →
    under review  ·  SQL DW TeamAdminSQL DW Team (Feedback, Microsoft Azure) responded  · 

    Thanks for the suggestion. We are looking into this scenario for a future release. “PolyBase should allow escaping row, field and string delimiters in string fields” – 7284026.

    7 comments

    Sign in
    Check!
    (thinking…)
    Reset
    or sign in with
    • facebook
    • google
      Password icon
      I agree to the terms of service
      Signed in as (Sign out)
      Submitting...
      • Himalayan NinjaHimalayan Ninja commented  ·   ·  Flag as inappropriate

        Because of this feature we are completely stuck and can't use polybase at all. How can a Dev team miss this basic and most important feature? Please fix this asap as it is very crucial for success of polybase

      • Ashish KumarAshish Kumar commented  ·   ·  Flag as inappropriate

        This is an absolutely desired functionality. Minimally, we need to be able to escape the field if they are used in the date.

        Example:
        CREATE EXTERNAL FILE FORMAT TextFileFormat
        WITH
        ( FORMAT_TYPE = DELIMITEDTEXT
        , FORMAT_OPTIONS ( FIELD_TERMINATOR = ','
        , STRING_DELIMITER = '0x22'
        , DATE_FORMAT = 'dd-MMM-yyyy HH:mm:ss'
        , USE_TYPE_DEFAULT = FALSE
        )
        );

        Does now allow the following string data to be loaded.
        "Allowed codes are "A", "B", and "C"."

      • Anonymous commented  ·   ·  Flag as inappropriate

        This is really important feature need, escaping characters is very basic in DB work.

      • Kapil JainKapil Jain commented  ·   ·  Flag as inappropriate

        Hi Team, Any update on this. Its really very important for us as our production jobs are failing due to this.

      • Anonymous commented  ·   ·  Flag as inappropriate

        this was raised in Nov last year... we are not heading towards the end of July on a product that is going GA soon. This is a bug, and a massive bug at that. if i were to read this very same file in Excel, or SSIS, it would work absolutely fine.

        What is the a) workaround, or b) timeline for resolution please?

        Both of these will tell me what i need to know / do in terms of getting my system working right now.

      Feedback and Knowledge Base