Do you have a comment or suggestion to improve SQL Server? We’d love to hear it!

Unicode compression for nvarchar(max)

Please enable Unicode compression for nvarchar(max), for both in-row and off-row data.

https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/unicode-compression-implementation

5 votes

Keith Lawrence shared this idea

3 comments

  • Solomon Rutzky commented

    Aaron, based on the two links in the FAQ that you linked to, it is unclear what constitutes a "long string". Nothing that I have read there indicates any "recommended" upper limit on the effectiveness of SCSU as compared to a general-purpose method. Still, I can say that:

    1) I don't think the actual algorithm matters, as long as the compression is transparent and comes with little, if any, overall performance hit.

    2) One strength of SCSU, and a related downside of many other algorithms, is that SCSU is designed to _also_ work well with short strings, since it does not have the fixed overhead that some other algorithms have (such as GZIP, which is what COMPRESS() uses). For example:

    SELECT COMPRESS('a');
    -- 0x1F8B08000000000004004B040043BEB7E801000000

    3) Even if SCSU (which is already implemented in the code, since it works for non-MAX NVARCHAR) were extended only to in-row NVARCHAR(MAX) data, leaving off-row data for a separate project, we would still be better off than we are today (with no compression for NVARCHAR(MAX)), and better off than using UTF-8 for compression (which goes against Unicode recommendations).
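
To see the short-string overhead from point 2 of the comment above on actual NVARCHAR values, a query along these lines could be run on SQL Server 2016 or later (where COMPRESS() is available). This is only a sketch: the variable names and the 10,000-character test string are made up for illustration, and the exact compressed sizes will depend on the data.

```sql
-- Sketch only: compare raw and GZIP-compressed sizes for a short and a long value.
-- @short and @long are hypothetical test values, not taken from the original post.
DECLARE @short NVARCHAR(MAX) = N'a';

-- Convert to NVARCHAR(MAX) first so REPLICATE is not truncated at 8,000 bytes.
DECLARE @long NVARCHAR(MAX) = REPLICATE(CONVERT(NVARCHAR(MAX), N'a'), 10000);

SELECT DATALENGTH(@short)           AS ShortRawBytes,
       DATALENGTH(COMPRESS(@short)) AS ShortCompressedBytes,
       DATALENGTH(@long)            AS LongRawBytes,
       DATALENGTH(COMPRESS(@long))  AS LongCompressedBytes;
```

For the one-character value, the raw size is 2 bytes (UTF-16), while the GZIP output carries a 10-byte header and an 8-byte trailer before any payload, so it cannot come out smaller than the input. The long, repetitive value is where a general-purpose algorithm should pay off.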
