Support DISTINCT for STRING_AGG
Currently STRING_AGG aggregates all strings passed as an input. It would be very useful to support DISTINCT, so it would concatenate unique strings only.
CREATE TABLE dbo.Test (a VARCHAR(255))
INSERT INTO dbo.Test (a)
SELECT STRING_AGG(a, ',') FROM dbo.Test t
It returns: 'Str1,Str2,Str1'
SELECT STRING_AGG(DISTINCT a, ',') FROM dbo.Test t
It would return: 'Str1,Str2'
please add this ...
Marcus Koch commented
I'm puzzled why this needs a vote and has not been implemented in the first place. Why is Microsoft so often just selling half-baked bread?
Nima Veze commented
Would be nice, thumbs up!
same time,hope string_agg become windows function,support over clause.
Agreed. Postgresql has had support for the DISTINCT keyword in STRING_AGG clauses for a while now, and the need for a subquery to handle the deduplicating makes the query a lot messier than it would otherwise need to be.
Solomon Rutzky commented
While I did up-vote this suggestion, there's no telling if/when it will be implemented. So, just to put this out there: I have included a SQLCLR aggregate function in my SQL# SQLCLR library ( https://SQLsharp.com/ ). It's called "Agg_JoinPlus", and is only available in the Full (i.e. paid) version, not in the Free version (which does include "Agg_Join" and "Agg_JoinDelim", but neither one handles ordering or duplicate removal). It works as follows:
tab.[col], -- Value NVARCHAR(4000)
N',', -- Delimiter NVARCHAR(4000)
tab.[col], -- OrderBy NVARCHAR(4000)
1, -- Ordering TINYINT: 0 = none, 1 = ASC, 2 = DESC
NULL, -- InitialDelimiter NVARCHAR(4000)
NULL, -- MinRecordsNeededForInitialDelimiter INT
NULL, -- FinalDelimiter NVARCHAR(4000)
NULL, -- MinRecordsNeededForFinalDelimiter INT
NULL, -- NullReplacement NVARCHAR(4000)
1, -- RemoveEmptyEntries BIT
1, -- DuplicateHandling TINYINT: 0 = keep, 1 = remove (i.e. DISTINCT)
1 -- Use Compression (to save on memory) BIT
FROM (VALUES ('a'), ('c'), ('b'), ('d'), (''), ('c')) tab(col);
Although it's not free, it does handle removing duplicates, and works with SQL Server as far back as 2008. The only platform is does not run on is Azure SQL Database since that no longer supports SQLCLR.
Stavros Macrakis commented
I came across this discussion as I was trying to figure out how to do this.
Yes, I can build an additional level of query to solve this, but that's true for select distinct and
count distinct, too.
In fact, it might be useful to do this as a generic part of the OVER clause.
Joshua Wilder commented
Vladimir Moldovanenko commented
Alexander Sharovarov, you've got my vote! Спасибо, что зарегистрировал эту проблему :)
Alexander Sharovarov commented
Vladimir Moldovanenko, the workaround works if you just want to aggregate one value. But it doesn't work that well when you need to aggregate a string as part of a larger query/report.
JL Sands commented
It's great that SQL Server finally added it but please finish the task.
Without DISTINCT support, it greatly reduces the benefit of the function. Most of the time STRING_AGG is used in conjunction with GROUP BY in the main query thus requiring a second sub-query just to provide distinct values to the STRING_AGG function is defeating its usefulness. Other ancient RDMS have supported DISTINCT this for decades.
Its a simple workaround when the query is simple ....which is rarely the case
Vladimir Moldovanenko commented
Workaround is fairly simple
SELECT STRING_AGG(a, ',') FROM (SELECT DISTINCT a FROM dbo.Test) t
However, why not support it? It makes T-SQL more powerful so this gets my vote