The ordering of the data

Question

The ordering of the data

Steve Jones - SSC Editor

SSC Guru

Points: 738327
More actions
April 7, 2020 at 12:00 am

#3734329

Comments posted to this topic are about the item The ordering of the data

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

emiddlebrooks Hall of Fame Points: 3431 More actions · Answer 1

Interesting and the explanation makes sense. Played with it a bit and found this returns the second answer and not exactly sure why. I am thinking there is an implicit conversion changing the collation but not sure how it got what it did.

SELECT testString
 FROM @Test
 WHERE testString LIKE N'[a-B]';

Solomon Rutzky SSCoach Points: 16430 More actions · Answer 2

emiddlebrooks wrote:

Played with it a bit and found this returns the second answer and not exactly sure why. I am thinking there is an implicit conversion changing the collation but not sure how it got what it did.
SELECT testString
 FROM @Test
 WHERE testString LIKE N'[a-B]';

The underlying reason for the behavior in either case is that when sorting using a case-sensitive collation, there will be consistent ordering of each case of each letter, but there are two cases, hence two options for the ordering. In terms of the given dataset for this question, that would be reflected as:

Upper-case first = A, a, B, b, C, c
Lower-case first = a, A, b, B, c, C

Most of the SQL Server collations (names starting with "SQL_") use "upper-case first" for VARCHAR data only, but use "lower-case first" for NVARCHAR data. Windows collations (names not starting with "SQL_") uses "lower-case first" for both VARCHAR and NVARCHAR data. Binary collations are not relevant here because they have no concept of case.

Now, the range option of the [...] wildcard uses sorting to determine the actual range. Looking at the two lists shown above and applying them to the pattern of [a-B] results in:

Upper-case first = _, a, B, _, _, _
Lower-case first = a, A, b, B, _, _

The reason why you got the "lower-case first" result from adding the upper-case "N" prefix to the string literal (i.e. N'[a-B]') is that doing so creates a Unicode / NVARCHAR string literal, and datatype precedence causes the VARCHAR data in the column to be implicitly converted into NVARCHAR which uses "lower-case first" ordering.

For more info on collations / Unicode, please visit: Collations Info

Take care,

Solomon...

SQL# — https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
Sql Quantum Lift — https://SqlQuantumLift.com/ ( company )
Sql Quantum Leap — https://SqlQuantumLeap.com/ ( blog )
Info sites — Collations • Module Signing • SQLCLR

webrunner SSC-Dedicated Points: 32004 More actions · Answer 3

Thanks, that was a good one.

-------------------
A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html

Stewart "Arturius" Campbell SSC Guru Points: 72737 More actions · Answer 4

Really good question, thanks Steve

And Solomon for the detailed explanation.

____________________________________________
Space, the final frontier? not any more...
All limits henceforth are self-imposed.
“libera tute vulgaris ex”

The ordering of the data

Cookies on SQLServerCentral