sp_validname incorrectly returns 1 / true if name being checked contains code point U+FFFF which is invalid, even for delimited identifiers
The "sp_validname" system stored procedure relies on the QUOTENAME built-in function in order to validate the name being passed in. Unfortunately, there's a minor bug in QUOTENAME in that it doesn't recognize code point U+FFFF as being invalid (please see: "QUOTENAME considers 0xFFFF to be valid, but needs to return NULL if that character exists in input value" ( https://feedback.azure.com/forums/908035-sql-server/suggestions/37048876-quotename-considers-0xffff-to-be-valid-but-needs )).
There is a comment in sp_validname (obtained using: "EXEC sp_helptext N'sp_validname';" ) stating:
-- Check for valid SQL-Server identifiers: All non-binary-zero
-- characters are valid. String must not be null or empty.
So, it's possible that nobody knew that code point U+FFFF was invalid, making the current behavior intended, even if incorrect. This is proven via the following T-SQL:
DECLARE @BadName sysname = QUOTENAME(N'a' + NCHAR(0xFFFF));
SELECT @BadName AS [TheValue], DATALENGTH(@BadName) AS [ByteCount];
-- [a<U+FFFF>] 8
EXEC sp_validname @BadName; -- no error!!!
EXEC (N'USE [tempdb]; CREATE TABLE [dbo].' + @BadName + N' (col1 INT);');
Msg 1055, Level 16, State 1, Line XXXXX
'..' is an invalid name because it contains a NULL character or an invalid unicode character.
I filed this bug report separately from QUOTENAME since it is probably quicker and easier to fix this stored procedure by specifically checking for code point U+FFFF. This way you don't need to wait for someone to fix QUOTENAME, and once QUOTENAME is fixed, the check performed in this proc will become redundant, but will not cause a error or change the overall behavior in any way.
The fix here is simple, just change the following line (towards the end of the proc):
IF (quotename(@name) is not null AND datalength(@name) > 0)
to instead be:
IF (quotename(@name) is not null AND datalength(@name) > 0 and CHARINDEX(NCHAR(0xFFFF) COLLATE Latin1_General_100_BIN2, @name) = 0)
Please note: the "COLLATE Latin1_General_100_BIN2" is required! This will not work without a binary collation.
For details / research on valid/invalid characters, please see:
"The Uni-Code: The Search for the True List of Valid Characters for T-SQL Identifiers, Part 3 of 2 (Delimited Identifiers)" ( https://sqlquantumleap.com/2018/04/16/the-uni-code-the-search-for-the-true-list-of-valid-characters-for-t-sql-identifiers-part-3-of-2-delimited-identifiers/ )