Michael Rys
My feedback
-
3 votes
Michael Rys supported this idea ·
-
3 votes
-
8 votes
Michael Rys supported this idea ·
-
4 votes
Michael Rys supported this idea ·
-
4 votes
Michael Rys commented
Do you mean catalog views a la usql.tables? Or some visual tool experience?
-
4 votes
Michael Rys commented
Parquet support with snappy is now in public preview. See https://github.com/Azure/AzureDataLake/blob/master/docs/Release_Notes/2018/2018_Spring/USQL_Release_Notes_2018_Spring.md#built-in-parquet-extractor-and-outputter-is-in-public-preview
ORC support is in private preview.
Are there other places you need SNAPPY support?
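Based on the release notes linked above, using the built-in Parquet extractor and outputter might look like this sketch; the paths and schema are hypothetical:

```sql
// Hedged sketch: paths and column list are made up for illustration.
@rows =
    EXTRACT Id int,
            Name string
    FROM "/data/input.parquet"
    USING Extractors.Parquet();

OUTPUT @rows
TO "/data/output.parquet"
USING Outputters.Parquet();
```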
-
5 votes
Michael Rys supported this idea ·
-
6 votes
Michael Rys supported this idea ·
-
7 votes
Michael Rys supported this idea ·
-
8 votes
Michael Rys commented
Why can't you just move the constant comparison into the WHERE clause?
-
10 votes
Michael Rys supported this idea ·
Michael Rys commented
Thanks for filing this request. Since we currently do not want to incur the cost of enforcing PK/FK constraints (which could be very costly given terabyte-sized tables), we would need to introduce some special syntax to support this. We are adding this to our backlog now.
-
12 votes
Michael Rys supported this idea ·
-
11 votes
Michael Rys supported this idea ·
-
19 votes
Michael Rys commented
One of the problems in scale-out systems is providing a scalable IDENTITY function that avoids the need to synchronize or serialize. One way to work around it is ROW_NUMBER(); another is to use GUIDs instead of an integer value, since GUID creation does not require global coordination.
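Both workarounds can be sketched in U-SQL roughly as follows; `@input` and its columns are hypothetical, and this is only a sketch of the idea, not a definitive pattern:

```sql
// Workaround 1 (hypothetical @input): ROW_NUMBER() as a surrogate key.
@withRowNumber =
    SELECT ROW_NUMBER() OVER () AS RowId, *
    FROM @input;

// Workaround 2: GUIDs, which need no global coordination across vertices.
@withGuid =
    SELECT Guid.NewGuid() AS Id, *
    FROM @input;
```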
-
43 votes
Michael Rys supported this idea ·
-
41 votes
Michael Rys supported this idea ·
Michael Rys commented
Thanks for your request. Indeed, row-level updates and deletes are useful, and they are on our long-term roadmap. Given the current processing architecture, the current workarounds are:
1. Use partitioned tables to partition the data so that you can either drop partitions or at least minimize the reprocessing costs.
2. Reprocess the data by creating a new version with the dropped data deleted.
-
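The two workarounds above can be sketched in U-SQL; the table name, partition value, and filter column here are all hypothetical:

```sql
// Workaround 1 (hypothetical table/partition): drop a whole partition
// instead of deleting individual rows.
ALTER TABLE MyDb.dbo.Events DROP PARTITION (new DateTime(2018, 1, 1));

// Workaround 2: reprocess by selecting everything except the rows
// to be "deleted" into a new version of the data.
@kept =
    SELECT *
    FROM MyDb.dbo.Events
    WHERE Id != 42;
```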
15 votes
Michael Rys supported this idea ·
-
13 votes
Michael Rys commented
I would suggest filing this in the ML Studio feedback section.
-
16 votes
Michael Rys commented
We have had R support as part of the U-SQL Extension libraries for a while now. I would be interested in hearing what you need in addition to this. Note that we recently increased the memory a vertex can use to 5 GB.
-
48 votes
Michael Rys commented
Note that there is a community contributed Excel extractor available at https://github.com/Azure/AzureDataLake/tree/master/Samples/ExcelExtractor
Michael Rys commented
Thanks, Jayme.
Note that U-SQL can read most CSV and TSV files generated by Excel (without headers and without CR/LF in the content). XLSX files are harder to support: they are a compressed archive of XML files, which makes it rather difficult to provide well-performing processing.
We will look into it, though, if there are enough votes.
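Reading an Excel-exported CSV with the built-in extractor might look like this sketch; the file path and schema are hypothetical:

```sql
// Hedged sketch: path and columns are made up for illustration.
// Assumes the CSV was exported from Excel without a header row.
@rows =
    EXTRACT Name string,
            Amount decimal
    FROM "/data/export.csv"
    USING Extractors.Csv();
```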
We just added CASE support: https://github.com/Azure/AzureDataLake/blob/master/docs/Release_Notes/2018/2018_Spring/USQL_Release_Notes_2018_Spring.md#u-sql-supports-ansi-sql-case-expression
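Per the release note linked above, a CASE expression in U-SQL might be sketched like this; `@input` and its `Amount` column are hypothetical:

```sql
// Hedged sketch: @input and Amount are illustrative only.
@result =
    SELECT CASE
               WHEN Amount > 100 THEN "large"
               WHEN Amount > 10 THEN "medium"
               ELSE "small"
           END AS Bucket
    FROM @input;
```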