Distribute nulls evenly to avoid skew when using hash distribution
The best candidate for my distribution key is a column that is 40% null.
At present, hash-distributing on this column results in one overloaded distribution, where all the nulls land.
As nulls will never join to any other table, it would suit this use case to spread the null rows evenly across all distributions, so that the non-null foreign keys can still benefit from being distribution aligned.
I understand this would be a change in behaviour, so perhaps a new distribution option could be created, e.g. HASH_AND_NULL(my_foreign_key).
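To illustrate the difference, here is a rough Python simulation. It is only a sketch under assumptions: the count of 60 distributions, the CRC32 stand-in hash, the choice of distribution 0 for nulls, and the round-robin handling of nulls are all mine for illustration, not the engine's actual behaviour. HASH_AND_NULL is just the hypothetical option name suggested above.

```python
import zlib
from collections import Counter
from itertools import count

NUM_DISTRIBUTIONS = 60  # assumption: number of distributions in the pool


def hash_bucket(key):
    """Stand-in hash for illustration; NOT the engine's real hash function."""
    return zlib.crc32(str(key).encode()) % NUM_DISTRIBUTIONS


def distribute_hash(keys):
    """Behaviour described above: every NULL lands on a single distribution."""
    counts = Counter()
    for key in keys:
        # Distribution 0 is an arbitrary choice for where NULLs pile up.
        bucket = 0 if key is None else hash_bucket(key)
        counts[bucket] += 1
    return counts


def distribute_hash_and_null(keys):
    """Hypothetical HASH_AND_NULL: hash non-nulls, deal NULLs out round-robin."""
    counts = Counter()
    next_null = count()
    for key in keys:
        if key is None:
            bucket = next(next_null) % NUM_DISTRIBUTIONS
        else:
            bucket = hash_bucket(key)  # non-null keys stay distribution aligned
        counts[bucket] += 1
    return counts


# 100,000 rows, of which 40% have a NULL foreign key.
keys = [None if i % 10 < 4 else i for i in range(100_000)]

current = distribute_hash(keys)
proposed = distribute_hash_and_null(keys)

print("largest distribution, current: ", max(current.values()))
print("largest distribution, proposed:", max(proposed.values()))
```

In this sketch the non-null keys hash to the same distribution under both schemes, so joins on them remain aligned. Only the placement of the null rows changes.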
Hi there. Depending on the size of the table and your workload, if your best candidate key has this many nulls, you may be better off using a round-robin distribution instead.