How can we improve Microsoft Azure Data Lake?

Enable code/U-SQL to read header row

Provide functionality to read the header row (the column names/schema) of a file dynamically.

47 votes

Hemant Chandurkar shared this idea

3 comments

  • Hodza Nassredin commented

    Even worse, it is not possible to validate column names when columns of the same type appear in the wrong order.
    For example, the script declares a Int32, b Int32, but the file has the header "b,a".

  • Mike R commented

    I assume you mean that the extractor "automatically" detects the schema if a header row is present and provides a schema at extraction time?

    E.g.,

    @data = EXTRACT * FROM "filewithhdr.csv" USING Extractors.Csv(inferschema:true);

    But how would you then continue to query the data in the rest of the script? Note that you can already build a custom extractor that returns each row as a map, something like:

    @data = EXTRACT data SqlMap<string,string> FROM "filewithhdr.csv" USING new CustomHdrExtractor();

  • Anonymous commented

    You could create a custom row processor to do this:

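    using Microsoft.Analytics.Interfaces;

    // Writes each input column's name into the output column at the same position.
    // The names come from the schema declared in the script, and the output
    // columns are assumed to be declared as strings.
    public class ReadSchema : IProcessor
    {
        public override IRow Process(IRow input, IUpdatableRow output)
        {
            ISchema schema = input.Schema;

            for (int i = 0; i < schema.Count; i++)
            {
                // Emit the column name rather than the column value.
                output.Set(i, schema[i].Name);
            }

            return output.AsReadOnly();
        }
    }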
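
The header-order problem raised in the comments (declared columns a, b against a file header "b,a") could be handled inside a custom extractor by matching columns by name rather than by position. The following is only a minimal sketch of that matching step in plain C#, not part of the U-SQL SDK: the names HeaderMapper, MapHeader and Reorder are hypothetical, and the header is split naively on the delimiter, so it assumes column names contain no quoted delimiters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a U-SQL API): maps a header row onto the
// column order that a script declares.
public static class HeaderMapper
{
    // Returns, for each expected column, its position in the header row.
    // Throws if the column count differs, a name is duplicated, or a column is missing.
    public static int[] MapHeader(string headerLine, IReadOnlyList<string> expected, char delimiter = ',')
    {
        // Naive split: assumes no quoted delimiters inside column names.
        string[] actual = headerLine.Split(delimiter).Select(h => h.Trim()).ToArray();

        if (actual.Length != expected.Count)
            throw new FormatException($"Expected {expected.Count} columns but the header has {actual.Length}.");

        // Record where each header name sits; names are compared case-insensitively.
        var positions = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < actual.Length; i++)
        {
            if (positions.ContainsKey(actual[i]))
                throw new FormatException($"Duplicate column '{actual[i]}' in header.");
            positions[actual[i]] = i;
        }

        // Look up every declared column by name.
        var map = new int[expected.Count];
        for (int i = 0; i < expected.Count; i++)
        {
            if (!positions.TryGetValue(expected[i], out int pos))
                throw new FormatException($"Column '{expected[i]}' is missing from the header.");
            map[i] = pos;
        }
        return map;
    }

    // Reorders the fields of one data row into the declared column order.
    public static string[] Reorder(string[] fields, int[] map)
    {
        return map.Select(pos => fields[pos]).ToArray();
    }
}
```

With the header "b,a" and declared columns a, b, MapHeader returns the positions 1, 0, so a data row "20,10" is reordered to "10,20" before its values are assigned to the output columns. A header that lacks a declared column raises an error instead of silently loading values into the wrong column.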
