Raise the number of concurrent tasks per node core
I want to use Azure Batch to export a >100 TB Azure Table much faster than Azure Data Factory can manage.
Azure Tables is reported to handle tens of thousands of simultaneous requests. To maximize my development velocity, I would create millions of lightweight tasks and let the Azure Batch runtime handle all the scheduling details.
But there is a limit of 4 concurrent tasks per core, so lightweight tasks like these would leave nodes badly underutilized and make for long runtimes.
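To put rough numbers on that, here is a back-of-the-envelope sketch. The target of 10,000 simultaneous requests, the single-core nodes, and the higher cap of 20 are illustrative assumptions, not measurements, and `nodes_needed` is a made-up helper, not part of any Azure SDK.

```python
import math


def nodes_needed(target_concurrency: int, cores_per_node: int, tasks_per_core: int) -> int:
    """Nodes required to keep `target_concurrency` tasks running at once."""
    slots_per_node = cores_per_node * tasks_per_core
    return math.ceil(target_concurrency / slots_per_node)


# Assumed workload: 10,000 simultaneous requests on single-core nodes.
with_current_cap = nodes_needed(10_000, cores_per_node=1, tasks_per_core=4)
with_higher_cap = nodes_needed(10_000, cores_per_node=1, tasks_per_core=20)  # hypothetical cap

print(with_current_cap, with_higher_cap)
```

Under these assumptions, the 4-per-core cap needs five times as many nodes to reach the same concurrency.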
That's a good suggestion. We currently have a custom solution that runs up to 20 tasks per A1 VM (1 core) to get the most out of it, and we will never look at Azure Batch while it allows only 4.
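For illustration only, and not the custom solution mentioned above: a minimal sketch of why one core can keep many more than 4 tasks in flight when each task mostly waits on the network. The `asyncio.sleep` call stands in for a storage request, and `run_bounded`, the job count, and the delay are all hypothetical.

```python
import asyncio


async def run_bounded(job_count: int, limit: int, delay: float = 0.01) -> int:
    """Run simulated I/O-bound jobs, at most `limit` at a time.

    Returns the peak number of jobs that were in flight together.
    """
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def job() -> None:
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(delay)  # stand-in for a network request
            running -= 1

    await asyncio.gather(*(job() for _ in range(job_count)))
    return peak


if __name__ == "__main__":
    # 100 simulated requests, capped at 20 in flight on a single thread.
    print(asyncio.run(run_bounded(job_count=100, limit=20)))
```

While the jobs are waiting, the single thread is idle, which is why a cap tied to core count undersells what a node can do for this kind of work.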
Yes, if your application is highly optimized, then you definitely don't need that degree of parallelism per node, because it would just kill your performance. But that's not how the industry works when it needs to go fast.