SQL Server Collation (ASCII Table sort and Case Insensitive/Accent Insensitive)

  • diallonina

    Mr or Mrs. 500

    Points: 524

    Hi everybody :-),

    I have an issue about SQL Server Collations.

    Our owned programmed database use Case-Insensitive/Accent Insensitive data ordered like ASCII characters table.

    Here is this sort order :

    32 (space)

    ! 33

    " 34

    # 35

    $ 36

    %37

    & 38

    ’ 39

    ( 40

    ) 41

    *42

    +43

    , 44

    - 45

    . 46

    / 47

    0 48

    ...(Others numbers from 49 to 57)

    : 58

    ; 59

    <60

    =61

    >62

    ?63

    @64

    A 65

    ...(others UPPER LETTERS From 66 to 90)

    [ 91

    \ 92

    ...

    a 97;

    ...(others LOWER LETTERS FROM 98 TO 122)

    ...(others symbols)

    à 224;

    ...OTHERS LETTERS WITH ACCENTS

    ...(others symbols)

    You can look at http://www.asciitable.com/

    My issue: I have to find a way to do the same in SQL Server (a collation that respects the begin of ASCII table sort with CASE INSENSITIVE/ACCENT INSENSITIVE).

    Then, the result will be like:

    ! , ", #, $, %, &, ’, ..., 0,1,2, 3, .... A, a, à, O, o, ô, E, e, é, è, ê, .....

    I have made many searches and I found that binaries collations respect ASCII table sort order but they do not support CASE INSENSITIVE/ACCENT INSENSITIVE options. However, non binaries collations support CASE INSENSITIVE/ACCENT INSENSITIVE but do not respect ASCII table sort order:hehe:

    I tried without success :

    DataBase Collation = CollationXXX_BIN with SELECT column with COLLATE CollationXXX_CI_AI

    DataBase Collation = CollationXXX_CI_AI with SELECT column with COLLATE CollationXXX_BIN

    Is there a way that could solve my problem ? A way to create my own collation in SQL Server ?

    Thank you very much 🙂

    PS: I don't speak english very well but i Hope you'll understand 😉

  • Lowell

    SSC Guru

    Points: 323463

    remember what case-Insensitive collations are used for: if i search for "apple' or "Apple" or "APPLE", i'd get results. that's perfect. now if i want to ORDER those results, i can order by a binary sensitive collation; i think that's the thing you are missing, so there's no need to come up with your own collation.

    so the thing to remember is you can order your data differently than the collation it is stored in.

    SELECT

    YourColumn,

    OtherData

    FROM YourTable

    WHERE YourColumn = 'Apple'

    ORDER BY YourColumn Collate SQL_Latin1_General_CP850_BIN

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • diallonina

    Mr or Mrs. 500

    Points: 524

    Thank you Lowell for your answer :-).

    I tried what you said:

    Database Collation CI_AI with a select column collate with a collation BIN.

    More explanations:

    With my select ordered by column with a binay collation as you suggest, it wll return

    1- !apple

    2- -apple

    3- 0apple

    4- 1apple

    5- APPLE

    6- Apple

    7- Bpple

    8- Zpple

    9- apple

    10-bpple

    11-zpple

    12-àpple

    13-èpple

    etc...

    But I want

    1- !apple

    2- -apple

    3- 0apple

    4- 1apple

    5- APPLE

    6- Apple

    7- apple

    8-àpple

    9- Bpple

    10-bpple

    11-èpple

    12- Zpple

    13-zpple

    Thank you.

  • Lowell

    SSC Guru

    Points: 323463

    well i couldn't figure out what order you wanted via your example;

    lets switch to real code so we can figure it out better.

    does this do what you want? what's wrong witht he order in this example? you'll need to explain why certain things need to go first vs second...that's what is missing for me, i think; i can't get a handle on the order by criteria.

    ;With MySampleData AS (

    SELECT N,

    CHAR(MiniTally.N) + 'pple' As Word

    FROM (SELECT

    ROW_NUMBER() OVER (ORDER BY name) AS N

    FROM sys.columns)MiniTally

    WHERE N BETWEEN 32 AND 255

    )

    SELECT *

    from MySampleData

    ORDER BY Word,Word Collate SQL_Latin1_General_CP850_BIN

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Paul White

    SSC Guru

    Points: 150442

    One of the SQL collations should do what you need. This one works for your sample data:

    DECLARE @apples TABLE

    (

    sequence integer NOT NULL,

    apple varchar(6) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL

    )

    INSERT @apples

    (sequence, apple)

    VALUES

    (1, '!apple'),

    (2, '-apple'),

    (3, '0apple'),

    (4, '1apple'),

    (5, 'APPLE'),

    (6, 'Apple'),

    (7, 'apple'),

    (8, 'àpple'),

    (9, 'Bpple'),

    (10, 'bpple'),

    (11, 'èpple'),

    (12, 'Zpple'),

    (13, 'zpple')

    SELECT *

    FROM @apples AS a

    ORDER BY apple;

    -- SQL Server Sort Order 54 on Code Page 1252 for non-Unicode Data

    SELECT *

    FROM sys.fn_helpcollations() AS fh

    WHERE fh.name = 'SQL_Latin1_General_CP1_CI_AI';

  • diallonina

    Mr or Mrs. 500

    Points: 524

    Hi 🙂

    Thank you Lowell and SQL Kiwi for your answers.

    Our application send SQL requests one-by-one via ODBC cursor and then fetch next. With the good collation, SQL would return the good one first result and then the fetch next would return the next ones in order.

    I've tried SQL Kiwi solution and it doesn't work :hehe:

    USE [TestCollation

    GO

    /****** Object: Table [dbo].[tmp] Script Date: 01/24/2012 09:48:06 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    SET ANSI_PADDING ON

    GO

    CREATE TABLE [dbo].[tmp](

    [char_name] [varchar](1) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL ,

    [ascii_val] [smallint] NULL

    ) ON [PRIMARY]

    GO

    SET ANSI_PADDING OFF

    INSERT INTO [TestCollationFrench].[dbo].[tmp]

    ([char_name]

    ,[ascii_val])

    VALUES

    (<char_name, varchar(1),>

    ,<ascii_val, smallint,>)

    First print screen: initial data

    Second print screen :SELECT * FROM dbo.tmp ORDER BY char_name. You'll notice that the results are not ordered by the column ascii_val order. Only the 16 firt lines are correctly ordered.

    Third print screen: This is what i want to obtain. Results expected (almost ordered by ascii_val column with numbers before letters).

    Have a nice day 🙂

  • Paul White

    SSC Guru

    Points: 150442

    diallonina (1/24/2012)


    I've tried SQL Kiwi solution and it doesn't work :hehe:

    It worked for the sample data you provided :hehe: :hehe: :hehe:

    Look, images are not easy for us to work with - if you want a definitive answer, provide test data in the form I did: CREATE TABLE with INSERT statements. From what you have said, it might be worth looking trying the SQL_Latin1_General_CP437_CI_AI collation, but I am not about to transcribe images into test data for you to test that out.

  • diallonina

    Mr or Mrs. 500

    Points: 524

    Ok, I understand SQL Kiwi. I'll make insert statements and post them after.

    Thank you 😉

  • kcarlson-1040281

    SSC Enthusiast

    Points: 123

    Hello,

    I believe I had the same need/problem and I found a solution using:

    select * from myTable order by UPPER(myField) COLLATE SQL_Latin1_General_CP850_BIN

    the trick of course is to use UPPER to achieve case-insensitivity and then to use a binary collation to get everything else in an ASCII order.

    Cheers,

  • Solomon Rutzky

    SSCoach

    Points: 16267

    I realize that this question was posted nearly 6 years ago, but I just came across it and have the answer.

    Part of the confusion in answering this is that the O.P. was accidentally misleading in a few places:

    • Request was for "ASCII ordering, except for grouping letters together as if it were case-INsensitive and accent-INsensitive", yet that is not an accurate description of the desired sort order
    • ALL of the exceptions to the stated rules were not shown in the initial post
    • ALL of the exceptions to the stated rules were not shown in the sample data shown in the 3rd post

    The actual desired sort order (at least the stated desired order) is shown in the 3rd image ( 3.jpg ), four posts above this one (the only post with attachments). That shows a sort order, with the numbers highlighted, but also shows some punctuation characters that come between the numbers and the letters (which is true to the request for "ASCII" value ordering), BUT also some punctuation coming just before the letters that come after the letters. So, the desired sort order is farther away from the base values than was originally presented.

    Given the sort order shown in 3.jpg, I can state that:

    1) There is no Collation within SQL Server, outside of a binary Collation, that will sort punctuation between the numbers and letters as is being requested here,

    2) A binary Collation by itself will not sort the punctuation that comes after the standard US English letters between the numbers and letters, and it also will not group upper-case, lower-case, and accented letters together, as is being requested here,

    3) A binary Collation used with UPPER(myField) will group upper-case and lower-case together, but it won't put the upper-case letters first (they will be intermixed), and it won't group the accented characters along with the related upper-case and lower-case characters. And, just like the case of using the binary Collation by itself, it won't sort the punctuation that comes after the standard US English letters between the numbers and letters.

    4) A case-INsensitive Collation will not guarantee that the upper-case letters get sorted before the lower-case letters, as is being requested here. It will group them together, but it will allow the order of the upper-case and lower-case of the same letter to be intermixed.

    What is being requested here is a custom sort order. No existing SQL Server Collation will do what is being requested.

    What is needed is this:

    1) Use a SQL Server case-sensitive Collation. These sort upper-case before lower-case, per each letter (not all upper-case before all lower-case). A Windows case-sensitive Collation will sort the lower-case letters first, per each letter. The SQL Server Collations do not offer a combination that is both case-sensitive and accent-INsensitive, so use one that is both case-sensitive and accent-sensitive. This is just for the sorting, not for comparisons. There is also the option of using one of the few "Pref" Collations, which are case-INsensitive yet sort the upper-case letters before the lower-case, per each letter. The two best options are: SQL_Latin1_General_CP1_CS_AS and SQL_Latin1_General_Pref_CP1_CI_AS.
    2) Use a CASE statement to create the custom sort rules. Test the value of the first character, and if it is within a certain range, then prefix the sort value with a "grouping" value.

    Starting with Paul's example code from a prior post in this thread, I made some modifications to:
    1) add in the missing sample data (items 14 and 15)
    2) create a CASE statement in the ORDER BY clause to force custom groupings. Please note that the ASCII function only returns the value for the first letter of any string passed into it, so no need to use SUBSTRING or LEFT to get just the first letter.

    DECLARE @apples TABLE
    (
      [sequence]  INT NOT NULL,
      [apple]   VARCHAR(10) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL
    );

    INSERT INTO @apples ([sequence], [apple])
    VALUES
      (1, '!apple'),
      (2, '-apple'),
      (3, '0apple'),
      (4, '1apple'),
      (5, 'APPLE'),
      (6, 'Apple'),
      (7, 'apple'),
      (8, 'àpple'),
      (9, 'Bpple'),
      (10, 'bpple'),
      (11, 'èpple'),
      (12, 'Zpple'),
      (13, 'zpple'),
      (14, '~zpple'),
      (15, '=zpple');

        
    SELECT *, ASCII([apple]) AS [CodePoint]
    FROM @Apples
    ORDER BY [apple];

    SELECT *, ASCII([apple]) AS [CodePoint]
    FROM @Apples
    ORDER BY CASE
        WHEN ASCII([apple]) < 58 THEN '1'
        WHEN ASCII([apple]) BETWEEN 58 AND 64 THEN '2'
        WHEN ASCII([apple]) BETWEEN 91 AND 95 THEN '3'
        WHEN ASCII([apple]) BETWEEN 123 AND 126 THEN '4'
        ELSE '5'
       END + [apple] COLLATE SQL_Latin1_General_Pref_CP1_CI_AS; -- SQL_Latin1_General_CP1_CS_AS

    The first query (no custom sort) returns:

    1    !apple    33
    2    -apple    45
    15    =zpple    61
    14    ~zpple    126
    3    0apple    48
    4    1apple    49
    5    APPLE    65
    6    Apple    65
    7    apple    97
    8    àpple    224
    9    Bpple    66
    10    bpple    98
    11    èpple    232
    12    Zpple    90
    13    zpple    122

    While the second query (with the custom sort) returns:

    1    !apple    33
    2    -apple    45
    3    0apple    48
    4    1apple    49
    15    =zpple    61
    14    ~zpple    126
    5    APPLE    65
    6    Apple    65
    7    apple    97
    8    àpple    224
    9    Bpple    66
    10    bpple    98
    11    èpple    232
    12    Zpple    90
    13    zpple    122

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply