How can we improve Microsoft Azure Data Lake?

Enable code/U-SQL to read header row

Provide functionality to read the header row (column names/schema) of a file dynamically.

44 votes

    Hemant Chandurkar shared this idea

    3 comments

      • Hodza Nassredin commented

        Even worse, it is not possible to validate column names when columns of the same type appear in the wrong order.
        For example, you have SELECT a Int32, b Int32, but the file has the header "b,a".
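
        The mismatch described above can be caught by comparing the file's header row with the expected column names before trusting the data. The following is a minimal sketch in plain C#, not part of the U-SQL API: the class name HeaderCheck and the method MatchesExpectedOrder are hypothetical, and the naive split assumes the header contains no quoted delimiters.

        ```csharp
        using System;
        using System.Collections.Generic;
        using System.Linq;

        // Hypothetical helper (not a U-SQL built-in): validates header order.
        public static class HeaderCheck
        {
            // Returns true only when the header names match the expected column
            // names in the same order (case-sensitive, surrounding whitespace trimmed).
            // Assumption: header fields contain no quoted delimiters.
            public static bool MatchesExpectedOrder(
                string headerLine,
                IReadOnlyList<string> expected,
                char delimiter = ',')
            {
                var actual = headerLine.Split(delimiter).Select(h => h.Trim()).ToArray();
                return actual.SequenceEqual(expected);
            }
        }
        ```

        With the header "b,a" and the expected columns a, b, the check returns false, so the swapped columns are detected even though both are Int32.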

      • Mike R commented

        I assume you mean that the extractor "automatically" detects the schema if a header row is present and provides a schema at extraction time?

        E.g.,

        @data = EXTRACT * FROM "filewithhdr.csv" USING Extractors.Csv(inferschema:true);

        But how would you then continue to query the data? Note that you can already build a custom extractor that returns each row as a map, something like:

        @data = EXTRACT data SqlMap<string,string> FROM "filewithhdr.csv" USING new CustomHdrExtractor();
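
        The core of such a map-producing extractor is pairing each value in a data line with the name found in the header row. Below is a rough sketch of that step in plain C#; HeaderRowMapper and ToMap are hypothetical names, the code does not use the U-SQL SDK, and it assumes simple delimited text without quoted fields. Inside a real custom extractor, the resulting dictionary would presumably be wrapped in the SqlMap<string,string> column.

        ```csharp
        using System;
        using System.Collections.Generic;

        // Hypothetical helper (not a U-SQL built-in): builds a name-to-value map per row.
        public static class HeaderRowMapper
        {
            // Pairs each field of a data line with the header name at the same position.
            // Assumption: fields contain no quoted delimiters.
            public static Dictionary<string, string> ToMap(
                string[] header,
                string line,
                char delimiter = ',')
            {
                var values = line.Split(delimiter);
                if (values.Length != header.Length)
                {
                    throw new FormatException(
                        "Row has " + values.Length + " fields but header has " + header.Length + ".");
                }

                var map = new Dictionary<string, string>(header.Length);
                for (int i = 0; i < header.Length; i++)
                {
                    map[header[i]] = values[i];
                }
                return map;
            }
        }
        ```

        Because values are looked up by name rather than position, a file whose header reads "b,a" still yields the right value for each column.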

      • Anonymous commented

        You could create a custom row processor to do this:

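        using System.Linq;                    // Count() extension method
        using Microsoft.Analytics.Interfaces; // IProcessor, IRow, IUpdatableRow, ISchema

        public class ReadSchema : IProcessor
        {
            // Writes each input column's name into the output column at the same
            // position, so every output column must be declared as string.
            public override IRow Process(IRow input, IUpdatableRow output)
            {
                // Column names come from the schema of the rowset passed to the processor.
                ISchema schema = input.Schema;

                for (int i = 0; i < schema.Count(); i++)
                {
                    var col = schema[i];
                    output.Set(i, col.Name);
                }

                return output.AsReadOnly();
            }
        }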
