STRING_SPLIT needs "RemoveEmptyEntries" option, like String.Split in .NET
The new STRING_SPLIT built-in function is missing a feature that is available in its .NET counter-part: the option to "RemoveEmptyEntries" (as found in the StringSplitOptions enum). While it is possible to remove empty string entries after-the-fact, if the Connect Suggestion to add the Item Number is ever implemented ( https://connect.microsoft.com/SQLServer/Feedback/Details/2433171 ), then this suggestion is even more pertinent because wanting to remove empty entries would then still require a CTE, removing the benefit of having the built-in Item Number field:
;WITH split AS
SELECT vals.value, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS [ItemNum]
FROM STRING_SPLIT(@InputString, ',') vals
WHERE vals.value <> ''
String.Split ( https://msdn.microsoft.com/en-us/library/tabh47cf.aspx )
StringSplitOptions ( https://msdn.microsoft.com/en-us/library/system.stringsplitoptions.aspx )
Upvotes: 3<=-=Mar 12 2016 12:29PM=-=>
I would like to suggest an alternative:
Add a few more result columns, one of which can easily be filtered on to remove empty-input entries.
1. Ordinal position in input
2. String position in input
3. Length of item in input
The length can easily be filtered on with a where clause and opens other applications.
I prefer this over complicating the function arguments to fit every conceivable scenario.
The listed columns are no overhead for any splitter function as they are part of its normal operation, just no exposed to T-SQL at this moment.
Whenever the optimized detects certain result columns are not used, it is normal operation to not include them in intermediate results and thus are no overhead at all.
Peter, while I like the idea of those additional columns, their existence is unrelated to the issue presented here:
1) There is no need to filter on length as one can already filter on "WHERE value <> ’’; " (as shown in my example in the Description above).
2) The issue here is not that it is difficult to remove the empty entries after the fact, but that if the Ordinal Position field is ever added, then removing the empty entries after the fact will leave gaps in those ordinal positions.
3) The request is for an optional parameter, not a required parameter. This would allow anyone to simply pass in just the two current input parameters and never even know that there was another option. This is currently done in many built-in T-SQL functions (e.g. the “style” parameter in the CONVERT function, the “database_id” parameter in the OBJECT_NAME function, the “start_location” parameter in the CHARINDEX function, etc).
4) On a very minor level, it keeps the API consistent with .NET, similar to how the FORMAT function was handled.