Batch mode window aggregate arithmetic overflow bug
The following query performs a windowed SUM over a columnstore table with 1500 total rows, each of which has the value 0 or 1, and it overflows the INT data type. This appears to be a bug in the batch mode window aggregate operator.
SELECT a, p, s, v, m, n,
SUM(CASE WHEN n IS NULL THEN 0 ELSE 1 END)
OVER (PARTITION BY s, v, a ORDER BY p) AS lastNonNullPartition
FROM (
SELECT a, p, s, v, m, n,
RANK() OVER (PARTITION BY v, s, a, p ORDER BY m) AS rank
FROM #t /* A columnstore table with 1,500 rows */
) x
WHERE x.rank = 1
--Msg 8115, Level 16, State 2, Line 1521
--Arithmetic overflow error converting expression to data type int.
If any of the following modifications are made, the error does not occur:
- Use trace flag 8649 to prefer a parallel plan regardless of the cost threshold for parallelism
- Use trace flag 9453 to disable batch mode
- Use the COUNT aggregation function instead of the SUM function
- Remove the WHERE x.rank = 1 predicate
See https://dba.stackexchange.com/questions/218634 for many more details and comments.
See https://gist.github.com/pattertall/2875082fe045c90b253c24a954d4f6fc (or the attached file) for a full reproduction script. This bug reproduces in SQL Server 2016, 2017, and 2019 CTP.


Thank you for reporting. This is fixed in SQL Server 2019 CTP 2.1 and will also be fixed soon in Azure SQL Database.
2 comments
-
Daniel Hutmacher commented
OP reports that this bug also reproduces in 2016 and 2017 - have those been patched as well?
-
Geoff Patterson commented
Thank you for the fix!