October 24, 2016 at 7:49 am
Hello, can anybody explain the following behavior?
The collation of the server is (and MUST be due to a third party software) SQL_Latin1_General_CP1_CI_AS.
use master
go
create database dtb1 collate SQL_Hungarian_CP1250_CI_AS
go
use dtb1
go
create table dbo.table1 (
Id int,
Name nvarchar(20)
)
go
insert into dbo.table1 (Id, Name) values (1, N'GLYCEROL');
insert into dbo.table1 (Id, Name) values (2, N'GLYCOL');
insert into dbo.table1 (Id, Name) values (3, N'SUNFLOWER OIL');
insert into dbo.table1 (Id, Name) values (4, N'WHEAT FLOUR');
go
select * from dbo.table1 where Name like N'WH%' -- Correct result.
select * from dbo.table1 where Name like N'SU%' -- Correct result.
select * from dbo.table1 where Name like N'G%' -- Correct result.
select * from dbo.table1 where Name like N'GL%' -- INCORRECT, empty result set!
select * from dbo.table1 where Name like N'%GL%' -- INCORRECT, empty result set!
select * from dbo.table1 where Name like N'GLY%' -- Correct result.
select * from dbo.table1 where Name like N'GLYC%' -- Correct result.
select * from dbo.table1 where Name like N'GL%' collate SQL_Latin1_General_CP1_CI_AS -- Correct result.
go
I know it is recommended to use the server's collation for databases, but in this case I do not compare columns
with different collations, everything is in the same database. Is did not write special characters.
Another example, that realy terrifies me:
declare @a nvarchar(10) = N'GLYCOL'
declare @b-2 nvarchar(10) = N'GL%'
print 'Match'
else
print 'Does not match'
There are no columns only variables.
Running in the dtb1 database: does not match. With explicit collate SQL_Latin1_General_CP1_CI_AS after 'like @b-2' it is OK.
Running in the master database: match.
If I change the text 'GLYCOL' to 'GLXCOL' it gives 'Match' in both databases without 'collate SQL_Latin1_General_CP1_CI_AS'.
Does it mean, if I use hungarian collation for a database, the LIKE operator will work randomly depending on the column content (or I must write the Latin1 collation after every LIKE)?
Will other types of string compare operations give wrong results (for example: where columnx = N'blabla'...) or only the LIKE is uncertain?
And finally: what is the relevance to allow different database collations if the consequences are so unpredictable.
Much appreciated
October 24, 2016 at 8:06 am
I guess the different results are because the [tempdb] is always using the collations as set at instance level. So as soon as the queryplan decides to use the [tempdb] (or when temporary objects are used) you will run into the difference in collation. To avoid issues when the database is set to a different cololation you always need to specify the collation you want to use in the compare.
October 24, 2016 at 8:49 am
If my memory serves me correctly, in Hungarian 'ly' is its own letter.
Seems to account for those results, e.g., 'GL' is a two letter string consisting of 'G' and 'L', while 'GLY' is a two letter string consisting of the letters 'G' and 'LY'.
Cheers!
EDIT: Fixed a typo.
October 24, 2016 at 11:28 am
I am ashamed, I completely forgot that. So the server tries to "guess" double letters during parsing a string. In Hungarian there are a lot of them (cs, dz, dzs, gy, ly, ny, sz, ty, zs). So if I do not want this "feature" in LIKE I have to define the collation to non-Hungarian.
Thank you very much.
Viewing 4 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply