Gen2 not all it was touted to be
Our organization was assured that Gen2 would do wonders for our big data performance. It didn't. We were told that Gen2 adds data caching that would make things run faster. It doesn't. To make use of data caching, we added clustered columnstore indexes to our tables with more than 100 million records, and our data analytics now run a few minutes longer, not shorter; our result sets are too large to cache and reuse. According to the data warehouse usage stats, compute used doesn't exceed 40 DWUs, yet the only way to get faster runtimes is to scale the data warehouse up to 1000 DWUs or higher. Memory is the bottleneck for really big data queries, not compute power, but there is no way to increase memory without paying premium dollars for additional compute power that isn't needed or utilized. Lastly, I can't help but notice that the additional memory provided in Gen2 comes at the expense of parallel data processing potential due to the drop in compute nodes: where Gen1 provided 5 compute nodes at 500 DWUs and 10 compute nodes at 1000 DWUs, Gen2 provides 1 and 2 compute nodes respectively.
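To put the drop in compute nodes in numbers, here is a rough sketch. It assumes the fixed 60 distributions per data warehouse described in Microsoft's architecture documentation (that figure is not from our own measurements), and the names `NODE_COUNTS` and `distributions_per_node` are made up for illustration. The node counts are the ones quoted above.

```python
# Assumption: every data warehouse spreads its data over 60 distributions,
# regardless of service level (per Microsoft's architecture documentation).
DISTRIBUTIONS = 60

# Compute node counts as quoted in this post.
NODE_COUNTS = {
    ("Gen1", 500): 5,
    ("Gen1", 1000): 10,
    ("Gen2", 500): 1,
    ("Gen2", 1000): 2,
}


def distributions_per_node(generation: str, dwu: int) -> int:
    """Return how many distributions each compute node has to work through."""
    nodes = NODE_COUNTS[(generation, dwu)]
    return DISTRIBUTIONS // nodes


if __name__ == "__main__":
    for (generation, dwu), nodes in sorted(NODE_COUNTS.items()):
        per_node = distributions_per_node(generation, dwu)
        print(f"{generation} at {dwu} DWUs: {nodes} node(s), "
              f"{per_node} distributions per node")
```

Under that assumption, each Gen2 node at these service levels handles five times as many distributions as a Gen1 node at the same DWU setting.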
In short, for really big data analytics, the only way we'll be able to harness the full potential of Azure SQL Data Warehouse is if Microsoft provides a way to significantly scale up the memory available per data warehouse without also having to scale up compute power.