Collation Error 468

Question

Steve Jones - SSC Editor

SSC Guru

Points: 734444
October 10, 2007 at 11:29 pm

#63100

Comments posted to this topic are about the item Collation Error 468

James Stover Hall of Fame Points: 3363 · Answer 1

"Most DBAs don't ever deal with multiple languages or different collation and sort order settings in SQL Server"

Now there's a sweeping statement. In fact, I would say many (if not all) non-USA DBAs have had to deal with this. Convert to SQL_Latin1_General_CP1_CI_AS or die, Latin1_General_CI_AS scum!

James Stover, McDBA

DanKennedy Hall of Fame Points: 3736 · Answer 2

I've come across this one quite a bit. At one contract I had, a third-party vendor insisted that a binary collation performed much faster than a case-insensitive one, and so set the default to Latin1_General_BIN for everything.

I'm not sure if there are tests/stats to back this claim up, but I do know that the majority of queries against the data were name related, and so most of their queries were written as UPPER([NameColumn]) = UPPER(@Criteria). Brilliant!! Index seek to index scan in one easy step.

After seeing this I wasn't inclined to believe their performance enhancement claim.
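
As a hedged illustration of one alternative to wrapping the column in UPPER(): leave the binary column alone and add an indexed computed column that carries a case-insensitive collation, so the predicate needs no function call. The table dbo.Person, the column length, the computed column NameColumnCI, and the index name are all hypothetical, and whether the optimizer actually chooses a seek depends on the data and the plan.

```sql
-- Hypothetical table using the vendor's binary collation on the name column.
CREATE TABLE dbo.Person
(
    PersonId   INT IDENTITY(1,1) PRIMARY KEY,
    NameColumn VARCHAR(100) COLLATE Latin1_General_BIN NOT NULL
);
GO

-- Computed column exposing the same value under a case-insensitive collation.
ALTER TABLE dbo.Person
    ADD NameColumnCI AS (NameColumn COLLATE Latin1_General_CI_AS) PERSISTED;
GO

-- Index the computed column so case-insensitive lookups have something to seek on.
CREATE INDEX IX_Person_NameColumnCI ON dbo.Person (NameColumnCI);
GO

-- No UPPER() on either side: the comparison uses the computed column's collation.
DECLARE @Criteria VARCHAR(100);
SET @Criteria = 'smith';

SELECT PersonId, NameColumn
FROM dbo.Person
WHERE NameColumnCI = @Criteria;
```

The trade-off is extra storage and index maintenance, which may or may not be acceptable on a vendor-owned schema.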

Dan
www.firstcs.co.uk

Nebojsa Ilic SSCommitted Points: 1735 · Answer 3

Good practice is to use the same collation whenever possible. If you have databases with different collations on the same server, use COLLATE database_default in CREATE/ALTER TABLE statements to avoid confusing yourself. When importing objects from other sources, check the collation of the destination tables.
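
A minimal sketch of the database_default idea, assuming a temp table joined to a user table. The names dbo.Customer, CustomerName, and #CustomerStage are hypothetical, and the sketch assumes the user table's column uses its database's default collation.

```sql
-- Without the COLLATE clause, character columns in a temp table take
-- tempdb's default collation, which may differ from the current database.
CREATE TABLE #CustomerStage
(
    CustomerName VARCHAR(100) COLLATE database_default NOT NULL
);

-- dbo.Customer and CustomerName are hypothetical names for illustration.
-- If dbo.Customer.CustomerName uses the database default collation,
-- both sides of the join now match and no collation conflict is raised.
SELECT c.CustomerName
FROM dbo.Customer AS c
INNER JOIN #CustomerStage AS s
    ON c.CustomerName = s.CustomerName;

DROP TABLE #CustomerStage;
```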

Richard Dragossy-168541 SSC Veteran Points: 285 · Answer 4

Hello Steve!

I've had nearly the same problem with collation. I had to build a "data warehouse" from four different databases belonging to four systems, one of which is multilingual.

My solution differs only slightly:

SELECT ...............
FROM [DB1].[dbo].[TABLE1] as T1
inner join [DB2].[dbo].[TABLE2] as T2
    on T1.name COLLATE SQL_Latin1_General_CP1_CI_AS
     = T2.table_name COLLATE SQL_Latin1_General_CP1_CI_AS

In this case both sides of the join are forced to use the same collation 🙂

I haven't had time to check the theoretical fundamentals or efficiency aspects, but in practice it works.
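
For what it's worth, an explicit COLLATE on one side of the comparison should be enough, because an explicit collation takes precedence over a column's own (implicit) collation. A sketch using the same table names as the query above; the SELECT list is a placeholder, and database_default is just one possible choice of target collation.

```sql
SELECT T1.name, T2.table_name
FROM [DB1].[dbo].[TABLE1] AS T1
INNER JOIN [DB2].[dbo].[TABLE2] AS T2
    -- Explicit collation on T2 only; the comparison is carried out
    -- in that collation, so no conflict is raised.
    ON T1.name = T2.table_name COLLATE database_default;
```

Applying COLLATE to a column can stop the optimizer from using an index on that column, so it may be worth putting the clause on the side with the smaller table.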

Best regards: Richard

Carolyn Richardson SSCrazy Eights Points: 8357 · Answer 5

"Most DBAs don't ever deal with multiple languages or different collation and sort order settings in SQL Server"

The majority of UK DBAs will have come across collation issues; it's a common problem for me. I currently have a server with 9 databases that have various different collations at table and column level, and I am trying to clean up the mess. To identify whether I had issues:

USE master
GO
SET NOCOUNT ON

DECLARE @DB      VARCHAR (150),
        @Counter INT,
        @Rec     VARCHAR (150),
        @SQL     VARCHAR (1000),
        @SQL1    VARCHAR (1000),
        @SQL2    VARCHAR (1000)

--List the user databases to check
SELECT database_id, name INTO #Temp
FROM sys.databases
WHERE name NOT IN ('master', 'tempdb', 'msdb', 'model')

SET @Counter = (SELECT MIN(database_id) FROM #Temp)

/*Work out if a database has more than one collation, assumes only interested if more than one*/
CREATE TABLE #ctr
( NumRows int )

WHILE @Counter <= (SELECT MAX(database_id) FROM #Temp)
BEGIN
    SET @DB = (SELECT name
               FROM #Temp
               WHERE database_id = @Counter)

    --Alter 'Latin' below if not just comparing US/UK
    --QUOTENAME guards against database names containing spaces or brackets
    SET @SQL = 'INSERT INTO #ctr SELECT count(distinct COLLATION_NAME)
                FROM ' + QUOTENAME(@DB) + '.INFORMATION_SCHEMA.COLUMNS
                WHERE COLLATION_NAME LIKE ''%Latin%'' '
    EXEC (@SQL)

    SET @Rec = (SELECT NumRows FROM #ctr)
    DELETE FROM #ctr

    IF (@Rec > 1)
    BEGIN
        PRINT @DB

        --List every matching column in this database with its collation
        SET @SQL1 = 'SELECT TABLE_CATALOG AS [DATABASE], '
        SET @SQL1 = @SQL1 + 'TABLE_NAME, '
        SET @SQL1 = @SQL1 + 'COLLATION_NAME, '
        SET @SQL1 = @SQL1 + 'COLUMN_NAME, '
        SET @SQL1 = @SQL1 + 'DATA_TYPE '
        SET @SQL1 = @SQL1 + 'FROM ' + QUOTENAME(@DB) + '.INFORMATION_SCHEMA.COLUMNS '
        SET @SQL1 = @SQL1 + 'WHERE TABLE_NAME <> ''dtproperties'' '
        SET @SQL1 = @SQL1 + 'AND COLLATION_NAME LIKE ''%Latin%'' '
        SET @SQL1 = @SQL1 + 'ORDER BY COLUMN_NAME'
        EXEC (@SQL1)
    END

    SET @Counter = @Counter + 1
END

DROP TABLE #ctr
GO
DROP TABLE #Temp
GO

Facts are stubborn things, but statistics are more pliable - Mark Twain
Carolyn
SQLServerSpecialists

Dean Hill SSC Enthusiast Points: 155 · Answer 6

Hi, we've had similar problems with collations across different data sources, but have also found one bonus. As part of the validation for data imports, we check for personal info being in upper case by using a case-sensitive collation:

--Does the data need to be proper cased
select 'Employee- ' + d.EMP_REF + ' name- ' + d.SURNAME
from DOWNLOAD_Employee d
where substring(d.SURNAME, 2, 1)             --get second character in SURNAME
   <> lower(substring(d.SURNAME, 2, 1))      --get second character again and lowercase it
      collate SQL_Latin1_General_CP1_CS_AS   --compare the two using a case-sensitive collation

Neil Evans-Mudie SSC Journeyman Points: 81 · Answer 7

Steve, nice article and a nice pointer to Tony Rogerson. I also ran into this issue here in the UK after I restored a DB built on a UK-locale server to a US-locale server, which itself had DBs created using the default US-locale collation. Tony's article helped me fix the resultant mixed-collation issues (as I wanted to union results from several source DBs with differing collations).

Anyway, one addition I wanted to mention: watch the collation on your tempdb as well as your user DBs, as it can affect your query results in a mixed-collation DB environment. Here are a couple of URLs:

Kimberly Tripp's article titled 'Changing Database Collation and dealing with TempDB Objects' @ http://www.sqlskills.com/blogs/kimberly/PermaLink.aspx?guid=7b4c9796-66d0-4ed2-b19d-bef6bb1e3e1d#a7b4c9796-66d0-4ed2-b19d-bef6bb1e3e1d

Michael Kaplan's blog entry @ http://blogs.msdn.com/michkap/archive/2006/05/30/610889.aspx
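
A quick way to see whether tempdb is out of step with a user database is to compare the collations side by side. This sketch uses only built-in functions; run it from the user database you care about.

```sql
SELECT
    SERVERPROPERTY('Collation')                AS ServerCollation,
    DATABASEPROPERTYEX('tempdb', 'Collation')  AS TempdbCollation,
    DATABASEPROPERTYEX(DB_NAME(), 'Collation') AS CurrentDbCollation;
```

If TempdbCollation and CurrentDbCollation differ, joins between temp tables and user tables on character columns are candidates for the collation conflict error.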

Hope this helps.

Cheers, Neil (DBA in UK)

chris webster SSCommitted Points: 1974 · Answer 8

This problem is exacerbated by 'part-time' DBAs in the UK blindly installing SQL Server with defaults, without referring to the regional settings of the server, which influence the default collation of the SQL Server instance. The UK default is the Windows collation introduced with SQL 2000, which according to Microsoft gives performance benefits as it matches that of the OS. I have no idea why the US defaults to a collation that was supposed to be there only for backwards compatibility.

Steve Jones - SSC Editor SSC Guru Points: 734444 · Answer 9

Probably because we're lazy in the US. Stick with what was working, etc. 😉

I'm sure many of you in the UK and elsewhere deal with this, but I was going on the stats for this site. We're 80% US readership, so I used "most".

Note that I didn't imply "best" :hehe:

Carolyn Richardson SSCrazy Eights Points: 8357 · Answer 10

I most often come across the issue when the server install gets completed and the guys don't alter the regional settings to UK; then, when SQL Server gets installed, it takes the US settings from the server default. I think in 2000, even if you changed the collation on the install, you still had the issue, if I remember rightly. It's some time since I've looked into this.

Jack Corbett SSC Guru Points: 184393 · Answer 11

I had this issue pop up on me once and found the fix in BOL. I was under the impression that the two collations you encountered were the same. I would be interested to know what the differences are. I have not found a good resource on the differences between collations anywhere yet.

Jack Corbett
Consultant - Straight Path Solutions

Carolyn Richardson SSCrazy Eights Points: 8357 · Answer 12

The same but not the same: SQL Server doesn't recognise them as the same.

Steve Jones - SSC Editor SSC Guru Points: 734444 · Answer 13

I pinged a few language people to see if they knew, but no response.

I can't find a difference either. They should be the same, and could be, but I think SQL Server raises the error whenever there's any naming difference.

chris webster SSCommitted Points: 1974 · Answer 14

All my research pointed to them being the same; SQL Server just can't get over the name difference.
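
For anyone who wants to compare the two collations using SQL Server's own metadata, here is a sketch built on the fn_helpcollations and COLLATIONPROPERTY functions. It only reports what the server says about each collation (description, code page, locale ID, comparison style); it doesn't by itself settle whether the two sort every value identically.

```sql
SELECT
    name,
    description,
    COLLATIONPROPERTY(name, 'CodePage')        AS CodePage,
    COLLATIONPROPERTY(name, 'LCID')            AS LCID,
    COLLATIONPROPERTY(name, 'ComparisonStyle') AS ComparisonStyle
FROM sys.fn_helpcollations()
WHERE name IN ('SQL_Latin1_General_CP1_CI_AS', 'Latin1_General_CI_AS');
```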