Cardinality estimator expects rows to be filtered in OUTER JOIN when dimension table has higher cardinality than fact
Executing a query with an OUTER JOIN can cause the cardinality estimator to expect rows being filtered out. This should never be the case if we're doing an OUTER JOIN and can result in performance issues as the row set and memory needed for downstream operators is underestimated.
Attached is a reproduction script and some sample data. A few additional observations:
- This seems to happen when the dimension has a higher cardinality than the fact table. Perhaps not the main use case, but we have financials where codes may be added/removed over time, while the dimension table will contain all the historical codes.
- When running with trace flags 2363 and 3604, the selectivity computation output showed lines where we were scaling the join selectivity up - "Scaling join selectivity up by X to compensate for more distinct values in dimension than in fact". This make sense for an INNER JOIN, but seems to be a bug for OUTER JOIN.
- When reverting to the legacy cardinality estimator using, this behavior is fixed.