The ordering of the data

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 719130

    Comments posted to this topic are about the item The ordering of the data

  • emiddlebrooks

    Hall of Fame

    Points: 3085

    Interesting and the explanation makes sense. Played with it a bit and found this returns the second answer and not exactly sure why. I am thinking there is an implicit conversion changing the collation but not sure how it got what it did.

    SELECT testString
    FROM @Test
    WHERE testString LIKE N'[a-B]';

     

  • Solomon Rutzky

    SSCoach

    Points: 16250

    emiddlebrooks wrote:

    Played with it a bit and found this returns the second answer and not exactly sure why. I am thinking there is an implicit conversion changing the collation but not sure how it got what it did.

    SELECT testString
    FROM @Test
    WHERE testString LIKE N'[a-B]';

     

    The underlying reason for the behavior in either case is that when sorting using a case-sensitive collation, there will be consistent ordering of each case of each letter, but there are two cases, hence two options for the ordering. In terms of the given dataset for this question, that would be reflected as:

    1. Upper-case first = A, a, B, b, C, c
    2. Lower-case first = a, A, b, B, c, C

    Most of the SQL Server collations (names starting with "SQL_") use "upper-case first" for VARCHAR data only, but use "lower-case first" for NVARCHAR data. Windows collations (names not starting with "SQL_") uses "lower-case first" for both VARCHAR and NVARCHAR data. Binary collations are not relevant here because they have no concept of case.

    Now,  the range option of the [...] wildcard uses sorting to determine the actual range. Looking at the two lists shown above and applying them to the pattern of [a-B] results in:

    1. Upper-case first = _, a, B, _, _, _
    2. Lower-case first = a, A, b, B, _, _

     

    The reason why you got the "lower-case first" result from adding the upper-case "N" prefix to the string literal (i.e. N'[a-B]') is that doing so creates a Unicode / NVARCHAR string literal, and datatype precedence causes the VARCHAR data in the column to be implicitly converted into NVARCHAR which uses "lower-case first" ordering.

     

    For more info on collations / Unicode, please visit: Collations Info

    Take care,

    Solomon...

     

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • webrunner

    SSC-Dedicated

    Points: 30219

    Thanks, that was a good one.

    -------------------
    A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
    Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html

  • Stewart "Arturius" Campbell

    SSC Guru

    Points: 71767

    Really good question, thanks Steve

    And Solomon for the detailed explanation.

    ____________________________________________
    Space, the final frontier? not any more...
    All limits henceforth are self-imposed.
    “libera tute vulgaris ex”

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply