Do you have a comment or suggestion to improve SQL Server? We’d love to hear it!

Support DISTINCT for STRING_AGG

Currently STRING_AGG aggregates all strings passed as an input. It would be very useful to support DISTINCT, so it would concatenate unique strings only.

Example:
CREATE TABLE dbo.Test (a VARCHAR(255))
INSERT INTO dbo.Test (a)
VALUES('Str1')
,('Str2')
,('Str1')

SELECT STRING_AGG(a, ',') FROM dbo.Test t

It returns: 'Str1,Str2,Str1'

SELECT STRING_AGG(DISTINCT a, ',') FROM dbo.Test t

It would return: 'Str1,Str2'

150 votes
Sign in
(thinking…)
Sign in with: Microsoft
Signed in as (Sign out)

We’ll send you updates on this idea

Alexander Sharovarov shared this idea  ·   ·  Flag idea as inappropriate…  ·  Admin →

9 comments

Sign in
(thinking…)
Sign in with: Microsoft
Signed in as (Sign out)
Submitting...
  • Solomon Rutzky commented  ·   ·  Flag as inappropriate

    While I did up-vote this suggestion, there's no telling if/when it will be implemented. So, just to put this out there: I have included a SQLCLR aggregate function in my SQL# SQLCLR library ( https://SQLsharp.com/ ). It's called "Agg_JoinPlus", and is only available in the Full (i.e. paid) version, not in the Free version (which does include "Agg_Join" and "Agg_JoinDelim", but neither one handles ordering or duplicate removal). It works as follows:

    SELECT SQL#.Agg_JoinPlus(
    tab.[col], -- Value NVARCHAR(4000)
    N',', -- Delimiter NVARCHAR(4000)
    tab.[col], -- OrderBy NVARCHAR(4000)
    1, -- Ordering TINYINT: 0 = none, 1 = ASC, 2 = DESC
    NULL, -- InitialDelimiter NVARCHAR(4000)
    NULL, -- MinRecordsNeededForInitialDelimiter INT
    NULL, -- FinalDelimiter NVARCHAR(4000)
    NULL, -- MinRecordsNeededForFinalDelimiter INT
    NULL, -- NullReplacement NVARCHAR(4000)
    1, -- RemoveEmptyEntries BIT
    1, -- DuplicateHandling TINYINT: 0 = keep, 1 = remove (i.e. DISTINCT)
    1 -- Use Compression (to save on memory) BIT
    )
    FROM (VALUES ('a'), ('c'), ('b'), ('d'), (''), ('c')) tab(col);
    -- a,c,b,d

    Although it's not free, it does handle removing duplicates, and works with SQL Server as far back as 2008. The only platform is does not run on is Azure SQL Database since that no longer supports SQLCLR.

  • Stavros Macrakis commented  ·   ·  Flag as inappropriate

    I agree.

    I came across this discussion as I was trying to figure out how to do this.

    Yes, I can build an additional level of query to solve this, but that's true for select distinct and
    count distinct, too.

    In fact, it might be useful to do this as a generic part of the OVER clause.

  • Alexander Sharovarov commented  ·   ·  Flag as inappropriate

    Vladimir Moldovanenko, the workaround works if you just want to aggregate one value. But it doesn't work that well when you need to aggregate a string as part of a larger query/report.

  • JL Sands commented  ·   ·  Flag as inappropriate

    It's great that SQL Server finally added it but please finish the task.

    Without DISTINCT support, it greatly reduces the benefit of the function. Most of the time STRING_AGG is used in conjunction with GROUP BY in the main query thus requiring a second sub-query just to provide distinct values to the STRING_AGG function is defeating its usefulness. Other ancient RDMS have supported DISTINCT this for decades.
    https://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html#function_group-concat

  • Vladimir Moldovanenko commented  ·   ·  Flag as inappropriate

    Workaround is fairly simple

    SELECT STRING_AGG(a, ',') FROM (SELECT DISTINCT a FROM dbo.Test) t

    However, why not support it? It makes T-SQL more powerful so this gets my vote

Feedback and Knowledge Base