Home Forums SQL Server 2008 SQL Server 2008 - General SQL Server Collation (ASCII Table sort and Case Insensitive/Accent Insensitive) RE: SQL Server Collation (ASCII Table sort and Case Insensitive/Accent Insensitive)

  • I realize that this question was posted nearly 6 years ago, but I just came across it and have the answer.

    Part of the confusion in answering this is that the O.P. was accidentally misleading in a few places:

    • Request was for "ASCII ordering, except for grouping letters together as if it were case-INsensitive and accent-INsensitive", yet that is not an accurate description of the desired sort order
    • ALL of the exceptions to the stated rules were not shown in the initial post
    • ALL of the exceptions to the stated rules were not shown in the sample data shown in the 3rd post

    The actual desired sort order (at least the stated desired order) is shown in the 3rd image ( 3.jpg ), four posts above this one (the only post with attachments). That shows a sort order, with the numbers highlighted, but also shows some punctuation characters that come between the numbers and the letters (which is true to the request for "ASCII" value ordering), BUT also some punctuation coming just before the letters that come after the letters. So, the desired sort order is farther away from the base values than was originally presented.

    Given the sort order shown in 3.jpg, I can state that:

    1) There is no Collation within SQL Server, outside of a binary Collation, that will sort punctuation between the numbers and letters as is being requested here,

    2) A binary Collation by itself will not sort the punctuation that comes after the standard US English letters between the numbers and letters, and it also will not group upper-case, lower-case, and accented letters together, as is being requested here,

    3) A binary Collation used with UPPER(myField) will group upper-case and lower-case together, but it won't put the upper-case letters first (they will be intermixed), and it won't group the accented characters along with the related upper-case and lower-case characters. And, just like the case of using the binary Collation by itself, it won't sort the punctuation that comes after the standard US English letters between the numbers and letters.

    4) A case-INsensitive Collation will not guarantee that the upper-case letters get sorted before the lower-case letters, as is being requested here. It will group them together, but it will allow the order of the upper-case and lower-case of the same letter to be intermixed.

    What is being requested here is a custom sort order. No existing SQL Server Collation will do what is being requested.

    What is needed is this:

    1) Use a SQL Server case-sensitive Collation. These sort upper-case before lower-case, per each letter (not all upper-case before all lower-case). A Windows case-sensitive Collation will sort the lower-case letters first, per each letter. The SQL Server Collations do not offer a combination that is both case-sensitive and accent-INsensitive, so use one that is both case-sensitive and accent-sensitive. This is just for the sorting, not for comparisons. There is also the option of using one of the few "Pref" Collations, which are case-INsensitive yet sort the upper-case letters before the lower-case, per each letter. The two best options are: SQL_Latin1_General_CP1_CS_AS and SQL_Latin1_General_Pref_CP1_CI_AS.
    2) Use a CASE statement to create the custom sort rules. Test the value of the first character, and if it is within a certain range, then prefix the sort value with a "grouping" value.

    Starting with Paul's example code from a prior post in this thread, I made some modifications to:
    1) add in the missing sample data (items 14 and 15)
    2) create a CASE statement in the ORDER BY clause to force custom groupings. Please note that the ASCII function only returns the value for the first letter of any string passed into it, so no need to use SUBSTRING or LEFT to get just the first letter.

    DECLARE @apples TABLE
    (
      [sequence]  INT NOT NULL,
      [apple]   VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL
    );

    INSERT INTO @apples ([sequence], [apple])
    VALUES
      (1, '!apple'),
      (2, '-apple'),
      (3, '0apple'),
      (4, '1apple'),
      (5, 'APPLE'),
      (6, 'Apple'),
      (7, 'apple'),
      (8, 'àpple'),
      (9, 'Bpple'),
      (10, 'bpple'),
      (11, 'èpple'),
      (12, 'Zpple'),
      (13, 'zpple'),
      (14, '~zpple'),
      (15, '=zpple');

        
    SELECT *, ASCII([apple]) AS [CodePoint]
    FROM @Apples
    ORDER BY [apple];

    SELECT *, ASCII([apple]) AS [CodePoint]
    FROM @Apples
    ORDER BY CASE
        WHEN ASCII([apple]) < 58 THEN '1'
        WHEN ASCII([apple]) BETWEEN 58 AND 64 THEN '2'
        WHEN ASCII([apple]) BETWEEN 91 AND 95 THEN '3'
        WHEN ASCII([apple]) BETWEEN 123 AND 126 THEN '4'
        ELSE '5'
       END + [apple] COLLATE SQL_Latin1_General_Pref_CP1_CI_AS; -- SQL_Latin1_General_CP1_CS_AS

    The first query (no custom sort) returns:

    1    !apple    33
    2    -apple    45
    15    =zpple    61
    14    ~zpple    126
    3    0apple    48
    4    1apple    49
    5    APPLE    65
    6    Apple    65
    7    apple    97
    8    àpple    224
    9    Bpple    66
    10    bpple    98
    11    èpple    232
    12    Zpple    90
    13    zpple    122

    While the second query (with the custom sort) returns:

    1    !apple    33
    2    -apple    45
    3    0apple    48
    4    1apple    49
    15    =zpple    61
    14    ~zpple    126
    5    APPLE    65
    6    Apple    65
    7    apple    97
    8    àpple    224
    9    Bpple    66
    10    bpple    98
    11    èpple    232
    12    Zpple    90
    13    zpple    122

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR