Support 'dynamic' output file names in ADLA
When reading, it's easy to pull in a whole set of files by using wildcard characters and dynamic parts in the URI (see the Month variable):
@mydata =
EXTRACT Customer string,
Active bool,
Month string
FROM @"/MySolution/Energy/InputFiles/{Month:*}.csv"
USING new USQLExtractors.CustomExtractor();
I also want this behavior on the output side. I want to split the output by customer (and I don't know the customers up front):
OUTPUT @mydata
TO @"/MySolution/Energy/OutputFiles/{Customer:*}.csv"
USING Outputters.Text();
Please make this possible.
The private preview is in full swing and will soon become public preview. See https://github.com/Azure/AzureDataLake/blob/master/docs/Release_Notes/2018/2018_Spring/USQL_Release_Notes_2018_Spring.md#data-driven-output-partitioning-with-output-fileset-is-in-private-preview
30 comments
-
Mike R commented
Hi Piotrek
It has been released as GA now. I am still behind with updating the documentation/release notes due to other work.
Since there is a scale limit that we could not easily address, you will still need to specify the preview flag in the code, to acknowledge that you understand that the finalization phase can take a long time if a lot of files (in the thousands or more) are being created.
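As a rough, untested sketch of what this might look like for the original request: it assumes the flag name quoted in Victor Fedianine's comment below and the `{Customer}` placeholder form used in the linked blog post, and it reuses the rowset, paths and extractor from the original post.

```usql
// Assumption: flag name as quoted elsewhere in this thread.
// Setting it acknowledges that finalization can be slow when
// thousands of output files are created.
SET @@FeaturePreviews = "DataPartitionedOutput:on";

@mydata =
    EXTRACT Customer string,
            Active bool,
            Month string
    FROM @"/MySolution/Energy/InputFiles/{Month:*}.csv"
    USING new USQLExtractors.CustomExtractor();

// Assumption: one file per distinct Customer value, named after that value.
OUTPUT @mydata
TO @"/MySolution/Energy/OutputFiles/{Customer}.csv"
USING Outputters.Text();
```

If the number of distinct customers runs into the thousands, the scale limit mentioned above applies, so it may be worth trying this on a small subset first.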
-
Piotrek commented
Hi,
Any updates on this feature?
Will it be GA? Thanks.
-
Max Barbour commented
Hi, what's the latest on the next release for ADLA? You mentioned later this year... I'm anxious to use this and some other preview features (GZipOutput, InputFileGrouping) in production, but I'm hesitant till they have gone fully live. Thanks!
-
Mike R commented
The feature can be considered to be in public preview (although the documentation will only go online with the release note announcing GA). An example can be found on our blog at https://blogs.msdn.microsoft.com/azuredatalake/2018/06/11/process-more-files-than-ever-and-use-parquet-with-azure-data-lake-analytics/. GA is planned for later this year.
-
Max Barbour commented
Hello, any update on this? Is the feature in public preview yet, and are there any plans to release it as a production feature? Thanks.
-
Victor Fedianine commented
Hi Richard, try something similar to the following:
DECLARE EXTERNAL @summaryStorage string = "/Results/output/" + @Id + "/Summary.csv";
SET @@FeaturePreviews = "DataPartitionedOutput:on";
OUTPUT @JobSummary
TO @summaryStorage
ORDER BY Id
USING Outputters.Csv(dateTimeFormat: "yyyy-MM-dd", outputHeader: false);
Also, one of the problems I've had is with special characters, so at least in your initial test code, make sure there are no special characters in either the file name or the file path.
-
Richard commented
Hi, so I tried it, but it says data-partitioned output is not supported. I guess that means it's still not available? Or my code is incorrect...
OUTPUT @cleanseTable
TO @outputFolder + "/" + "{CleansedColumn}" + @processYear + ".csv"
USING Outputters.Text(delimiter : ';', outputHeader : true);
-
Jakub Krupa commented
May I ask what is the current status? It was supposed to get to GA a few months ago.
-
Michael Rys commented
Sorry Karthik, it does not address your question.
-
Karthik commented
Does this feature also address declaring input files from a filename column in an internal table/file, as referenced in my question (https://stackoverflow.com/a/52476803/8663707)? Thanks.
-
Kumaraguru Sambandam commented
How can I get access to this private preview?
-
Michael Rys commented
It will turn into GA later this summer or early fall, depending on the feedback we are getting during the preview phase and the time we need to address it.
-
Rodney commented
This is very useful (we are trying to split the data into folders by UserID and then put an ACL on each folder). When does the private preview turn into GA?
-
Michael Rys commented
The private preview is in full swing and will soon become public preview. See https://github.com/Azure/AzureDataLake/blob/master/docs/Release_Notes/2018/2018_Spring/USQL_Release_Notes_2018_Spring.md#data-driven-output-partitioning-with-output-fileset-is-in-private-preview
-
ivan padron commented
Please add this feature ASAP
-
Mike R commented
Hi all... sorry for the delay in answering. But I am happy to announce that we have an early version of this feature in a limited private preview. If you are interested in helping us test it, please send me a message at usql at microsoft dot com with your scenario.
-
Anonymous commented
We've been waiting on this for over two years. The workaround is too costly and doesn't work well with Data Factory. What's the ETA for the preview?
-
Mike R commented
@Arun: The script in that answer is what generates that script.
@AK: The feature is under active development. We are not giving any detailed timelines in public settings, but will update the community once the feature is in preview.
-
AK commented
This is exactly the feature I was looking for. I will have to make do with the workaround for now.
Is there a publicly available roadmap where we can track the status of this feature? Thanks.
-
Anonymous commented
Hi,
The answer at http://stackoverflow.com/questions/42636855/u-sql-output-in-azure-data-lake/42676271#42676271
does not give the full workaround; it is missing the "/output/genscript.usql" script.
If that script were included in the post, it would be very useful for us to work from that sample.
Please provide some examples.
Thank You,
Arun (user:6127385 [stackoverflow])