Different database collation

  • Hello, can anybody explain the following behavior?

    The collation of the server is (and MUST be due to a third party software) SQL_Latin1_General_CP1_CI_AS.

    use master

    go

    create database dtb1 collate SQL_Hungarian_CP1250_CI_AS

    go

    use dtb1

    go

    create table dbo.table1 (

    Id int,

    Name nvarchar(20)

    )

    go

    insert into dbo.table1 (Id, Name) values (1, N'GLYCEROL');

    insert into dbo.table1 (Id, Name) values (2, N'GLYCOL');

    insert into dbo.table1 (Id, Name) values (3, N'SUNFLOWER OIL');

    insert into dbo.table1 (Id, Name) values (4, N'WHEAT FLOUR');

    go

    select * from dbo.table1 where Name like N'WH%' -- Correct result.

    select * from dbo.table1 where Name like N'SU%' -- Correct result.

    select * from dbo.table1 where Name like N'G%' -- Correct result.

    select * from dbo.table1 where Name like N'GL%' -- INCORRECT, empty result set!

    select * from dbo.table1 where Name like N'%GL%' -- INCORRECT, empty result set!

    select * from dbo.table1 where Name like N'GLY%' -- Correct result.

    select * from dbo.table1 where Name like N'GLYC%' -- Correct result.

    select * from dbo.table1 where Name like N'GL%' collate SQL_Latin1_General_CP1_CI_AS -- Correct result.

    go

    I know it is recommended to use the server's collation for databases, but in this case I do not compare columns

    with different collations, everything is in the same database. Is did not write special characters.

    Another example, that realy terrifies me:

    declare @a nvarchar(10) = N'GLYCOL'

    declare @b-2 nvarchar(10) = N'GL%'

    if @a like @b-2

    print 'Match'

    else

    print 'Does not match'

    There are no columns only variables.

    Running in the dtb1 database: does not match. With explicit collate SQL_Latin1_General_CP1_CI_AS after 'like @b-2' it is OK.

    Running in the master database: match.

    If I change the text 'GLYCOL' to 'GLXCOL' it gives 'Match' in both databases without 'collate SQL_Latin1_General_CP1_CI_AS'.

    Does it mean, if I use hungarian collation for a database, the LIKE operator will work randomly depending on the column content (or I must write the Latin1 collation after every LIKE)?

    Will other types of string compare operations give wrong results (for example: where columnx = N'blabla'...) or only the LIKE is uncertain?

    And finally: what is the relevance to allow different database collations if the consequences are so unpredictable.

    Much appreciated

  • I guess the different results are because the [tempdb] is always using the collations as set at instance level. So as soon as the queryplan decides to use the [tempdb] (or when temporary objects are used) you will run into the difference in collation. To avoid issues when the database is set to a different cololation you always need to specify the collation you want to use in the compare.

    ** Don't mistake the ‘stupidity of the crowd’ for the ‘wisdom of the group’! **
  • If my memory serves me correctly, in Hungarian 'ly' is its own letter.

    Seems to account for those results, e.g., 'GL' is a two letter string consisting of 'G' and 'L', while 'GLY' is a two letter string consisting of the letters 'G' and 'LY'.

    Cheers!

    EDIT: Fixed a typo.

  • I am ashamed, I completely forgot that. So the server tries to "guess" double letters during parsing a string. In Hungarian there are a lot of them (cs, dz, dzs, gy, ly, ny, sz, ty, zs). So if I do not want this "feature" in LIKE I have to define the collation to non-Hungarian.

    Thank you very much.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply