Azure Table data sink does not provide a simple way to build re-runnable data slices
I attempted, unsuccessfully, to build a pipeline copying monthly sales data from an on-premises database to an Azure Table.
The idea was to:
- use the data slice start date as the partition key
- delete all previously loaded data with that partition key when re-running a data slice
The current functionality only provides merge or replace semantics; it does not allow a purge-and-reload pattern as one would use with a database table, creating a high risk of keeping orphan records from old loads.
It is feasible to populate the PartitionKey field with the slice start date using a macro; however, as far as I know, it is not possible to pre-delete a partition before (re)loading.
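Today the pre-delete would have to be scripted outside ADF before the slice runs. A minimal sketch of that purge step, assuming a client object that exposes `query_entities` and `delete_entity` in the shape of the Python `azure-data-tables` `TableClient` (the wiring to a real table and credentials is omitted):

```python
def purge_partition(client, partition_key):
    """Delete every entity whose PartitionKey equals partition_key.

    Sketch only: `client` is assumed to behave like
    azure.data.tables.TableClient (query_entities / delete_entity).
    Returns the number of entities deleted.
    """
    # Escape single quotes for the OData filter expression.
    escaped = partition_key.replace("'", "''")
    entities = list(client.query_entities(f"PartitionKey eq '{escaped}'"))
    for e in entities:
        client.delete_entity(e["PartitionKey"], e["RowKey"])
    return len(entities)
```

Running this for the slice's partition key before (re)loading would give the purge-and-reload behaviour requested below.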
A second difficulty arises if we wish to use the merge or replace functionality: using the default auto-generated GUID for the RowKey will, as far as I can tell, duplicate all records on each reload, bypassing the merge or replace condition.
Unless the source data already has a single field with a stable unique ID, there is no way to populate the RowKey field with a value combining all the fields that make up the primary key, since there appears to be no way to write expressions in ADF field mappings.
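The kind of derivation meant is straightforward; a sketch in Python, where the separator and the replacement character are my own arbitrary choices (Azure Table keys disallow the characters `/`, `\`, `#` and `?`):

```python
def composite_row_key(*fields, sep="|"):
    """Build a deterministic RowKey from the fields of the natural key.

    Azure Table PartitionKey/RowKey values may not contain '/', '\\',
    '#' or '?', so those characters are replaced with '_' (the
    replacement character is an arbitrary choice for this sketch).
    """
    raw = sep.join(str(f) for f in fields)
    for forbidden in "/\\#?":
        raw = raw.replace(forbidden, "_")
    return raw
```

With a stable RowKey like this, re-running a slice would overwrite existing rows through merge or replace instead of duplicating them.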
The combination of the above issues rules out using ADF and Azure Tables together for me, which is very unfortunate.
Can you please consider:
1. an option to purge all records in an Azure Table matching the current PartitionKey value;
2. the ability to compute the RowKey value from a combination of fields.