Solomon Rutzky

My feedback

  1. 7 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    This issue has nothing to do with UTF-8. It is a general Windows Collation issue. Initially these two characters did equate when compared as case-insensitive. But then at some point, they were deliberately specified as being separate characters that could not equate even if using both case-insensitive and accent-insensitive. The Windows Server 2008 sort weights (if nothing else, at least very close to the definitions used for the version 100 Collations) define the two characters as initially being the same except for the difference in case:

    -----------------------------------------------
    DEFAULT 58093 ; Characters in this table are sorted by: SM, AW, DW, CW, Codepoint #
    ;CP SCRIPT ALPHA DIACRITIC CASING COMMENT
    ...
    0x044c 16 150 2 2 ;Cyrillic Small Soft Sign
    0x042c 16 150 2 18 ;Cyrillic Capital Soft Sign
    ...
    -----------------------------------------------

    The first value is the Code Point (CP). The "16" is the script, the "150" is the character within the script (i.e. the ALPHAbetic weight), the "2" after the "150" is the diacritic weight, and the "2" or "18" is the case weight. As we can see in the chart above, the two characters only differ in the "case weight" column (i.e. 2 vs 18). BUT, that is just the default sort weight. Culture/Locale-specific modifications are made depending on what culture / locale is being used. Looking at the "Ukrainian"-specific modifications, they are:

    -----------------------------------------------------------
    LCID 0x00000422 2 ;Ukranian (Cyrillic Script)

    0x044c 16 168 2 2 ;Cyrillic Small Soft Sign
    0x042c 16 188 2 18 ;Cyrillic Capital Soft Sign
    -----------------------------------------------------------

    As you can see, the values in the "Alphabetic Weight" column are now different (i.e. 168 and 188). This means two things:

    1) This behavior, even if incorrect, is intentional
    2) It is only the Ukrainian locale that redefines these two characters, so all other locales/cultures (i.e. Collations) will still equate them, assuming compared as case-insensitive. The following test proves this:

    -----------------------------------------------------------------------
    -- Windows Collation that came with SQL Server 2000:
    SELECT 1 WHERE N'ь' = N'Ь' COLLATE Ukrainian_CI_AS; -- 1

    -- Windows Collation that came with SQL Server 2008:
    SELECT 2 WHERE N'ь' = N'Ь' COLLATE Ukrainian_100_CI_AS; -- nothing
    SELECT 3 WHERE N'ь' = N'Ь' COLLATE Ukrainian_100_CI_AS_SC_UTF8; -- nothing

    -- Try again, but also with being Accent Insensitive:
    SELECT 4 WHERE N'ь' = N'Ь' COLLATE Ukrainian_100_CI_AI; -- nothing
    SELECT 5 WHERE N'ь' = N'Ь' COLLATE Ukrainian_100_CI_AI_SC_UTF8; -- nothing

    -- Try again (with accent-sensitive), but use another culture / locale:
    SELECT 6 WHERE N'ь' = N'Ь' COLLATE Hebrew_100_CI_AS; -- 6
    SELECT 7 WHERE N'ь' = N'Ь' COLLATE Hebrew_100_CI_AS_SC_UTF8; -- 7

    SELECT 8 WHERE N'ь' = N'Ь' COLLATE Latin1_General_100_CI_AS; -- 8
    SELECT 9 WHERE N'ь' = N'Ь' COLLATE Latin1_General_100_CI_AS_SC_UTF8; -- 9
    -----------------------------------------------------------------------

    Now, it should be noted that the Windows 10 sort weights (the newest definitions) DO go back to the previous behavior of keeping these two characters as the same except for case, even when using a Ukrainian Collation:

    -----------------------------------------------------------
    SORTGUID 00000040-57EE-1E5C-00B4-D0000BB1E11E 10
    LOCALENAME uk-UA ;Ukranian (Cyrillic Script)
    LOCALENAME uk ;Ukrainian - uk

    ...
    0x044c 16 137 2 2 ;Cyrillic Small Soft Sign
    0x042c 16 137 2 18 ;Cyrillic Capital Soft Sign
    -----------------------------------------------------------

    As you can see, the last two characters are both "137" for the Alphabetic Weight, and only differ by the "Case Weight".

    What does this Windows 10 info mean? It means that the only way to get this behavior in SQL Server is for the Collations team to issue a new Collation, based on these new definitions. If it was to be done in SQL Server 2019, it would be a version 150 series of Collations, such as:

    Ukrainian_150_CI_AI
    Ukrainian_150_CI_AI_UTF8
    Ukrainian_150_CI_AS
    Ukrainian_150_CI_AS_UTF8
    Ukrainian_150_CS_AS
    Ukrainian_150_CS_AS_UTF8
    ...and so on through the other _KS, _WS variations...

    Notice that there is no "_SC" option on those. That is because all new Collations inherently support Supplementary Characters (for details, please see: https://sqlquantumleap.com/2017/10/16/all-new-collations-in-sql-server-2017-implicitly-support-supplementary-characters/ ).

  2. 2 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    2 comments  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →
    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    As far as I can tell, this problem appears to be fixed (more details in item linked at the top of this item). This item can be closed. Thanks!

    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    Yes, it seems that having a NULL is a prerequisite for getting any of several variations of this error, but that doesn't mean that any NULL will cause any error. Sometimes NULL (.in a UTF-8 VARCHAR column) is fine.

    In the end, it seems that the problem is that older client drivers (ODBC, .NET SqlClient, etc) do not handle these NULLs (sometimes). I have posted comments in the other ticket (the one that you linked to) with more details and links to test scripts.

    In my testing (using SSMS 17.9) I found that when the query would not return, I could click on the "x" of the query tab up top, and then answer "Y"es when it asks if I really want to end that query. Then give it several more seconds until it finally disconnects and pops up a dialog asking if I want to save the changes to a .sql file. I hit "esc" (or click "cancel") to dismiss the dialog, make a change, and hit "F5" to start the cycle of despair all over again ;-).

  3. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    2 comments  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →

    Thank you for providing the examples of handling invalid characters. While we are trying to detect invalid sequences as we encounter them, at the same time we need to be cautious about performance. As such the current behavior occurs by design, under the general principle that if a partially invalid character sequence is input but certain characters were valid, the valid part of the character sequence may be output without error. If a fully invalid sequence is input, an error is then generated. This doesn’t mean that the logic of handing damaged sequences cannot be improved, so we encourage you to share your suggestions on how to make it better.

    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    I have run these tests on CTP 2.1 and this issue _might_ be resolved. Things are definitely improved. The following test now fails (as it should):

    SELECT CONVERT(VARCHAR(4), 0x90);

    And, according to the Unicode Standard (v11), in Chapter 3 ( http://www.unicode.org/versions/Unicode11.0.0/ch03.pdf ): Conformance, section 3.9 "Encoding Forms" (page 56 of the PDF -- page 127 in the PDF):

    * the sequence <C2 41 42> must return as U+FFFD, U+0041, U+0042 and it does:
    SELECT CONVERT(VARCHAR(14), 0xC24142); -- �AB

    * (on the next page) the sequence <F0 80 80 41> is only required to return the 41 but can handle the previous, erroneous sequence in a variety of ways. SQL Server returns:

    SELECT CONVERT(VARCHAR(14), 0xF0808041); -- ��A

    Not sure if that is how it should be. Nor are the following entirely conclusive:

    SELECT CONVERT(VARCHAR(14), 0xF08080); -- {nothing}
    SELECT CONVERT(VARCHAR(14), 0xF0808080); -- ���
    SELECT CONVERT(VARCHAR(14), 0xF080808080); -- ERROR: Invalid data for UTF8-encoded characters

    Solomon Rutzky shared this idea  · 
  4. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →
    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    I forgot to mention, I tested this in:

    * SQL Server 2012 (LocalDB; SP4-GDR)
    * SQL Server 2014 (LocalDB; SP2-CU11)
    * SQL Server 2016 (LocalDB; SP2)
    * SQL Server 2017 (Developer; CU10)
    * SQL Server 2019 (Enterprise; CTP 2.0)

    Solomon Rutzky shared this idea  · 
  5. 7 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    2 comments  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 5

    <=-=Aug 5 2014 9:56AM=-=>

    Hello,
    After carefully evaluating all of the suggestion items in our pipeline, we are closing items that we will not implement in the near future due to current higher priority items. We will re-evaluate the closed suggestions again in the future based on the product roadmap.
    Thanks again for providing the product suggestion and continued support for our product.

    Jos de Bruijn – SQL Server PM

    <=-=Sep 2 2016 1:23PM=-=>

    We’ve worked to bring WordPress to SQL Server through our port known as Project Nami. But the number one issue we are running into with making plugins compatible with SQL Server is the lack of support for GROUP_CONCAT.

    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    @Eilert : Agreed, this functionality has been implemented, so this item should be closed.

  6. 6 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    2 comments  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 10

    <=-=Mar 8 2011 9:34AM=-=>

    Thanks for contacting SQL Server team with the feedback. We will consider this in a future release. At this time, you can compress Out-of-row BLOBs by storing them as FileStream on a compressed volume. But I do understand that this is not your scenario as the data types you mentioned are not suitable for Filestream

    Thanks
    Sunil

    <=-=Mar 8 2011 1:30PM=-=>

    Sunil, thanks for the feedback. I had considered this option, but we’re using database mirroring as a resiliency solution which, unfortunately, doesn’t support Filestream.

    Thanks
    Chris

    <=-=Jan 15 2012 11:47AM=-=>

    Chris, I updated the title of your request. You are not really saying PAGE compression for out of row or LOB data. What you are saying is that we need the ability to compress this data as well. Also, I added a variance of UNICODE compresion as well.

    thanks
    Sunil

    <=-=Dec 1 2014…
    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    This absolutely needs to happen. Please, please do this. And it should include the VARBINARY(MAX) and XML datatypes as well. I assume that doing this for one LOB type would extend to all LOB types, but just in case it doesn't I figured I would add those to the request.

    Please note that in-row MAX data _can_ be PAGE compressed, but only if it is under a certain size. I tested with VARCHAR(MAX) and saw that 6000 character/byte rows would not compress, but 4000 character/byte rows did.

    On a related note, please see "Native UTF-8 Support in SQL Server 2019: Savior or False Prophet?" ( https://sqlquantumleap.com/2018/09/28/native-utf-8-support-in-sql-server-2019-savior-false-prophet-or-both/ ) for a detailed analysis of the new UTF-8 feature.

    Solomon Rutzky supported this idea  · 
  7. 3 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    This is now available in SQL Server 2019 CTP 2. This issue can be closed.

  8. 47 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 57

    <=-=Feb 8 2017 2:13PM=-=>

    We disallow ALL use of the PRIMARY filegroup on the premise that any use of PRIMARY is unauthorized, and PRIMARY is set to a small size with no autogrow. Since the Query Store on very active servers with non-parameterized workloads grows very, very fast, setting it to a filegroup whose data file is on a specific LUN (based on speed and cost, i.e. a specific storage tier) is vital.

    <=-=Apr 2 2017 7:56AM=-=>

    Thank you for taking time to post this issue! We understand that this could be an important issue for you.

    We get a lot of feedback regarding PRIMARY filegroup from the field and many MVPs. �
    This item is high on our priory list, but unfortunately, we do not plan to include a fix for this issue in the upcoming release. Although, we might include it as an improvement in future…

    Solomon Rutzky supported this idea  · 
  9. 59 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    under review  ·  2 comments  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
  10. 2 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    0 comments  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
  11. 8 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  12. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 3

    <=-=Nov 28 2017 8:28AM=-=>

    This request is very similar to mine — Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer ( https://connect.microsoft.com/SQLServer/feedback/details/3134395/ ) — and so they should probably be considered / discussed at the same time.

    Also, the request for the “SortedAs” attribute is unnecessary and should be removed from this suggestion as it was added via the ORDER clause of the CREATE FUNCTION statement in the release of SQL Server 2008. Please see the CREATE FUNCTION documentation ( https://docs.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql ) for details (search for “ORDER (<order_clause>” minus the double-quotes).

    <=-=Dec 11 2017 3:17AM=-=>

    I missed the ORDER option, and inappropriately tried to specify sorting via index in return table declaration first.
    I removed “SortedAs” from request.

    <=-=Jan 3 2018 8:22AM=-=>

    Thanks for submitting this feedback. We’ll keep this one open for future consideration.

    An error occurred while saving the comment
    Solomon Rutzky commented  · 

    Please see my related suggestion here:

    Allow TVFs (T-SQL and SQLCLR) to provide user-defined row estimates to query optimizer ( https://feedback.azure.com/forums/908035-sql-server/suggestions/32898139-allow-tvfs-t-sql-and-sqlclr-to-provide-user-defi )

  13. 3 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  14. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  15. 4 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 3

    <=-=Jun 9 2017 9:45AM=-=>

    Hi Solomon,

    Thanks for the feedback.

    It is more likely that we will follow and evolve the approach introduced with interleaved execution for multi-statement table valued functions (starting in SQL Server 2017). But we will keep this item open for future consideration.

    Best Regards,

    Joe Sack, Principal PM, Microsoft

    <=-=Nov 28 2017 7:44AM=-=>

    Hi Joe. Thanks for the reply. I have updated my request so that it fits more inline with the fact that interleaved execution exists but does not apply to all situations, and might never be able to apply to SQLCLR TVFs.

    <=-=Nov 28 2017 7:48AM=-=>

    The following suggestion appears to be related:

    Allow SQLCLR TVFs to provide information about sorting and distinctness to query optimizer ( https://connect.microsoft.com/SQLServer/feedback/details/3144175/ )

    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  16. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 1

    <=-=Feb 21 2017 1:58PM=-=>

    Thank you for your suggestion. While I can see that this would be helpful, it realistically won’t happen quickly, as it means changing on-disk layout in Master to hold this information. We will consider this when we have need to make such a change, and will re-evaluate.
    In the meantime, you can filter out system databases by looking for databases with owner_sid <> 0×00.

    <=-=Feb 22 2017 7:23AM=-=>

    @Kevin, thanks for that reply. Regarding the suggestion of using “WHERE [owner_sid] <> 0×00”: I am not sure how this is supposed to help given that the owner_sid on all of my various instances is 0×01 for the system DBs as well as many others. This is one of the reasons that I made this request: I cannot find any values in sys.databases that reliably indicate a “system” DB. Am I missing something?

    <=-=Mar 17…
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  17. 2 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  18. 3 votes
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    0 comments  ·  SQL Server » Suggestions  ·  Flag idea as inappropriate…  ·  Admin →

    Upvotes: 3

    <=-=Mar 12 2016 12:29PM=-=>

    I would like to suggest an alternative:

    Add a few more result columns, one of which can easily be filtered on to remove empty-input entries.

    1. Ordinal position in input
    2. String position in input
    3. Length of item in input

    The length can easily be filtered on with a where clause and opens other applications.
    I prefer this over complicating the function arguments to fit every conceivable scenario.

    The listed columns are no overhead for any splitter function as they are part of its normal operation, just no exposed to T-SQL at this moment.
    Whenever the optimized detects certain result columns are not used, it is normal operation to not include them in intermediate results and thus are no overhead at all.

    <=-=Mar 12 2016 10:24PM=-=>

    Peter, while I like the idea of those additional columns, their existence is unrelated to the…

    Solomon Rutzky supported this idea  · 
  19. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment
  20. 1 vote
    Sign in
    (thinking…)
    Sign in with: Microsoft
    Signed in as (Sign out)

    We’ll send you updates on this idea

    1 comment  ·  SQL Server » Bugs  ·  Flag idea as inappropriate…  ·  Admin →
    Solomon Rutzky supported this idea  · 
    An error occurred while saving the comment

Feedback and Knowledge Base