I have a collation issue, should I change the database or all the columns?

Question

I have a collation issue, should I change the database or all the columns?

Rob Reid-246754

SSCarpal Tunnel

Points: 4183
More actions
March 18, 2020 at 12:36 am

#3735193

I had a SQL 2012 database on a dedicated server and I have taken it down to my local PC and an SQL Express version as I am trying to downsize a system that used to run multiple websites and use the Betfair API to place bets from an AI type system to a local BOT that I can run on my laptop as and when needed to place bets.
The old database had no issues and the DB and Table Column collation for all my nvarchar textual columns were both Latin1_General_CI_AS.
However as I had to quickly obtain the system using FTP and had no ability to run data imports I had to do a backup and restore to my local PC and it seems that the database now has the collation of SQL_Latin1_General_CP1_CI_AI whilst all the nvarchar table columns still use Latin1_General_CI_AS.
I have set the databases compatibility level to 110 (2012) but cannot change the databases collation to Latin1_General_CI_AS using Management Studio and just changing the property OR running a query result in this error: "The database could not be exclusively locked to perform the operation." - However I am not sure why it cannot be exclusively locked as no other process is using the database. There is no website, no BOT, nothing at the moment, apart from me trying to sort out issues with the database and testing code against it.
I read somewhere you cannot use MS Management Console to change the DB collation once the DB had been created so I tried a query but still it gave the same error e.g
USE master;
GO
ALTER DATABASE Brainiac
COLLATE Latin1_General_CI_AS ;
GO
Msg 5030, Level 16, State 5, Line 18
The database could not be exclusively locked to perform the operation.
Msg 5072, Level 16, State 1, Line 18
ALTER DATABASE failed. The default collation of database 'Brainiac' cannot be set to Latin1_General_CI_AS.
However is it better for me to change the database collation from SQL_Latin1_General_CP1_CI_AI to Latin1_General_CI_AS or is it better to run a script to alter every column and change the tables columns collation to SQL_Latin1_General_CP1_CI_AI ?
I have found some scripts on this site that might be useful e.g to change the DB collation > https://www.sqlservercentral.com/blogs/how-to-change-the-database-collation and https://www.sqlservercentral.com/scripts/change-database-collation
Also to change the columns collation to the DB default > https://www.sqlservercentral.com/scripts/change-column-collations-back-to-database-default
I am not sure what the best option is, as well as not quite sure why I cannot obtain an exclusive lock on the DB when no other process is using it to change the DB collation?
Which way should I be going?
Try to change the DB back to Latin1_General_CI_AS
OR try to change all the columns to SQL_Latin1_General_CP1_CI_AI ?
I really don't want to have to put collation statements in the thousands of places where string comparisons are being carried out.
Just in case it's useful SELECT @@VERSION returns > Microsoft SQL Server 2017 (RTM-GDR) (KB4505224) - 14.0.2027.2 (X64) Jun 15 2019 00:26:19 Copyright (C) 2017 Microsoft Corporation Express Edition (64-bit) on Windows 8.1 6.3 <X64> (Build 9600: )
Any help on this would be great.
Thanks in advance

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

Erland Sommarskog SSC-Insane Points: 24038 More actions · Answer 1

You are probably hit by the fact that SSMS opens a second connection for Intellisense. Do this:

ALTER DATABASE DB SET SINGLE_USER WITH ROLLBACK IMMEDIATE
ALTER DATABASE DB COLLATE Latin1_General_CI_AS
ALTER DATABASE DB SET MULTI_USER

By the way, get hold of the latest Culmulative Update (CU) for 2017. There is no reason not to stay updated.

[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

John Mitchell-245523 SSC Guru Points: 148800 More actions · Answer 2

Changing the collation of a database is difficult, because, as you've already seen, you have to change the collation of each character-based column as well. And it's not as simple as that script you linked to - indexes and constraints must be removed before you can change a column's collation (and put back afterwards). Worse still, some constraints and indexes are dependent on others, so you have to do it in a particular order. If it's an option for you, it would probably be easier to rebuild your server with a different collation, or put the database on a server that has the right collation.

John

Rob Reid-246754 SSCarpal Tunnel Points: 4183 More actions · Answer 3

Hi, thanks for the replies.

I did try that 3 line SQL code, to set into Single User Mode and then change the DB collation before putting it back to multi user and it did work. However when I looked at the collation of the Database columns for some reason the majority were in SQL_Latin1_General_CP1_CI_AI anyway and only 35 or so were in Latin1_General_CI_AS but they were on the columns in the 3 main tables I was working on when I ran into the collation error issue. So it seems the DB and majority of tables are in the SQL collation anyway. I just need to sort out the LATIN collation tables.

I did run into the referencing issue myself when trying to change the columns in those tables to the SQL collation, and it was due to some of them being referenced in new indexes.

However luckily - or not - depending on the point of view, for some reason none of the indexes on the tables were copied down when I ported the DB. It was done such a long time ago I really don't know what happened. It seems to have been some sort of mix of a backup and restore with some scripting as I cannot understand why most of the TABLE columns were actually the same as the DB collation of SQL_Latin1_General_CP1_CI_AI anyway - apart from those in a few tables and why there are no indexes anyway.

So as I have only just started adding indexes to the tables with Latin1_General_CI_AS as their collation I am going to remove those indexes and change them to SQL_Latin1_General_CP1_CI_AI , the same as the DB, then re-add the indexes afterwards.

One thing I am missing on an SQL Express Edition is MS Agent which is a shame as I used to have a "auto missing index" DB, that used the DMV's and stored all missing indexes or indexes that could be removed in tables so that if the service or server stopped I wouldn't lose the stats and would get a good overview of the indexes I required over time.

Is there anyway to get MS Agent to run on an SQL Express DB, and if not what do people do to replicate it - use MS Windows Task Manager? Create a Windows Service with Timed Jobs that run SQL stored procs or some other method?

As I am missing all my indexes and this is going to be a new kind of system the old indexes which I could script and port down might not be appropriate so I was hoping to replicate this auto_index database somehow, so if anyone has ideas on the best way to this that would be great.

Thanks for your help.

John Mitchell-245523 SSC Guru Points: 148800 More actions · Answer 4

Yes, I think a Windows scheduled task (if that's what it's called these days) would be your best bet. I'm sure you don't need me to tell you this, but don't just blindly implement every missing index recommendation. You'll want to consider whether the benefits each one confers outweigh the cost of maintaining it. You'll also (nearly always) want to make sure that you don't create any indexes with a set of leading columns that are the same as the columns of another index.

John

Y.B. SSChampion Points: 11652 More actions · Answer 5

John Mitchell-245523 wrote:

Yes, I think a Windows scheduled task (if that's what it's called these days) would be your best bet. I'm sure you don't need me to tell you this, but don't just blindly implement every missing index recommendation. You'll want to consider whether the benefits each one confers outweigh the cost of maintaining it. You'll also (nearly always) want to make sure that you don't create any indexes with a set of leading columns that are the same as the columns of another index.
John

+1

If you follow every missing index recommendation you will stand a good chance of ending up with overlapping indexes. Although indexes are your friend they are not "free". Be mindful of the downsides of (too many) indexes.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Erland Sommarskog SSC-Insane Points: 24038 More actions · Answer 6

ALTER DATABASE COLLATE only changes the default collation and the collation of the tables in the system catalog. I got the impression from your post that all the user-table columns had the desired collation. If that is not the case, you will need to change them manually. Preferably before you index them, since you cannot change the collation of an indexed column. (Since the collation affects how the index is organised physically.)

SQL Server Agent is not supported with Express.

And I agree with others that adding all suggested missing index is counterproductive.

[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Rob Reid-246754 SSCarpal Tunnel Points: 4183 More actions · Answer 7

I know overlapping indexes are an issue but one a production system having a permanent DB which stores the missing index DMVs so you don't lose recommendations between server/service shut downs is a good idea because you can then see the actual real cost of indexes, missing or ones not being used and it helps guide you in creating indexes - not that you should follow the recommendations blindly, but they are a guide to what queries are missing index coverage especially in production.

At least then you can see which indexes have high counts and low counts and use your brain to create indexes that can cover one or more recommendations. I have used this system since DMVs came out and it has saved a lot of time in working out which indexes to create as you may think certain queries/indexes are the most important but from production you might actually see totally different ones are being used more.

I have never said to just add multiple indexes on top of each other and especially with SQL Express I would be even more wary of making too many indexes per table as the size probably comes into play a lot more therefore I want to know the actual key / important queries that my system is using that require coverage not just add a load from my own testing and then find out that my tests were just a tiny percentage of all queries made and that actually I really needed totally different index coverage.

This is why that system is so good as it keeps the counts in-between server/service shut downs when the DMVs counters are reset so you don't create a biased count from your own query testing.

I did "think", that the table collation was the one I needed to change the DB to, but then after testing I found that it was my testing on these 2 main tables that had made me think this. For some reason they have a different collation than the DB and the rest of the tables. I think what happened was I ported down a copy of the DB (backup/restore), then realised the size constraints of SQL Express, and created a new DB with the new collation and ran scripts porting over data but just for certain tables, and certain date driven data, e.g I have records of races going back before 2000 but don't really need all that data so a few years has been ported instead.

Anyway as these indexes weren't been copied down from the existing system keeping the DBs initial collation as SQL and then changing the columns to LATIN ( rather than the other way round ) seems the way to go. I can drop my indexes on these main tables now - then change the tables collation with the query that outputs all the ALTER table statements e.g

SELECT 'ALTER TABLE ['+USER_NAME(o.uid)+'].['+o.[name]+'] ALTER COLUMN ['+c.[name]+'] '+

CASE

WHEN c.prec IS NULL THEN t.[name]

ELSE t.[name]+'('+CONVERT(varchar(5),c.prec)+')'

END+' COLLATE '+t.collation

FROM syscolumns c

JOIN sysobjects o ON (c.id = o.id)

JOIN systypes t ON (c.xusertype = t.xusertype)

WHERE c.collation IS NOT NULL

AND o.type = 'U' -- NOT IN ('P','FN','TF','IF','S','V')

AND c.collation COLLATE Latin1_General_CI_AS != t.collation

ORDER BY o.[name]

- then rebuild the queries.

Then I need to look at a way to implement an "MS Agent" type replacement system that can run scheduled SQL jobs as its not just saving DMV data I want to do but system clean up / maintenance / table trimming - even more important on SQL Express / Query re-organisation/rebuilding / and hopefully be able to run the timed jobs that build betting systems if possible rather than have them all spawned from a Windows Service I was going to use with timers and numerous "jobs" , calling the relevant stored procedures.

Thanks for your suggestions

Erland Sommarskog SSC-Insane Points: 24038 More actions · Answer 8

That query is so... SQL 2000. Here is a mordernised version:

SELECT 'ALTER TABLE ' + quotename(s.name) + '.' + quotename(o.name) + 
       ' ALTER COLUMN ' + quotename(c.name) + ' ' + type_name(c.system_type_id) + 
       '(' + CASE WHEN c.max_length = -1 THEN 'MAX'
                  WHEN type_name(c.system_type_id) LIKE 'n%' THEN convert(varchar(10), c.max_length/2)
                  ELSE convert(varchar(10), c.max_length)
             END + ') COLLATE Latin_General_CI_AS '  + 
       IIF(c.is_nullable = 1, '', 'NOT ') + 'NULL ' 
FROM  sys.objects o
JOIN  sys.schemas s ON o.schema_id = s.schema_id
JOIN  sys.columns c ON o.object_id = c.object_id
WHERE o.type = 'U' -- NOT IN ('P','FN','TF','IF','S','V')
  AND c.collation_name <> 'Latin_General_CI_AS'
ORDER BY o.[name]

[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Rob Reid-246754 SSCarpal Tunnel Points: 4183 More actions · Answer 9

LOL

I like it, that query is so SQL 2000, it was and did use old system tables but it still worked with a change to make nvarchar(max) appear rather than nvarchar(-1).

I don't really know why SQL Server has kept all those older system tables if they now use the newer ones with sys. In front like sys.objects?

Anyway it was the query I got from one of the SQL Server articles I read about changing collation and linked to so I used it and it worked. Don't really see the point of rewriting something that worked just to use modern SQL when the more important thing was to get the job done and the code I got from this site worked.

I kept the DB collation as SQL and dropped the indexes I had created manually on the tables I had been working on that were using the Latin collation. Changed all the columns over to use SQL with that "old" query LOL, and then rebuild the indexes and it was job done.

I had used the single user mode, ALTER DB, multi user 3 lined SQL to change the DB collation to Latin and that worked fine, so the intellisense connection must have been the reason I couldn't run an ALTER DB statement even though I was the only user.

Anyway once I used that other piece of "old SQL" with the Information Schema views to show how many columns I had with each collation it seemed more obvious to make the whole DB and Tables SQL rather than Latin.

It seems to be a more modern collation name from what I can tell as a few of my older databases were the Latin collation but when I create new DBs in MS Management Console it defaults to that SQL collation (I am on my phone in bed so cannot see or copy and paste the full name but they are both shown on the initial post), so I was just wondering, as I haven't had a chance to look it up yet whether there is a reason the default DB collation changed to those prefixed with SQL now and what the differences are. I will look it up when I am near my PC and it's no biggy. Just wondered why the change.

Anyway even with the "old SQL" that still works even in 2020 it did the job so thanks to SQLServerCentral for having the various articles I found when searching for Collation issues and thanks to anyone who replied for there help.

Maybe you should write an article using the more modern SQL to do the job of removing references and writing out the CREATE Index/Key code to rebuild them later, then the ALTER TABLE statements to change the columns over, then the DB, if you think the code is to dated. I am just happy it got the job done so not to worried about the code as everything works perfectly now.

Thanks

Erland Sommarskog SSC-Insane Points: 24038 More actions · Answer 10

The old system tables are now compatibility views for those that have old scripts running. There is a particular reason the old views may not work well here: in those days owner and schema was the same thing, so I don't know how these views work when you have multiple schemas owned by dbo.

As for collations, the ones starting with SQL_ are legacy collations from SQL 7 and earlier. The Windows collations are certainly more modern. The SQL collations has some funny behaviour and they can also cause performance issues of you do things like

WHERE indexedvarcharcol = @nvarchar

The varchar column will be converted to nvarchar, and because the rules for varchar and nvarchar are different in an SQL collation, the index cannot be used.

What you see as default depends on your system collation, which you set on installation. And the default at installation is derived from the system locale. The default is always a Windows collation, with one exception: your locale is English (United States).

For a decently versatile collation-change script, see this blog post from Hugo Kornelis. Don't forget to check out the comments, as several people has added improvements to his original post. https://sqlserverfast.com/blog/hugo/2019/01/collation-change-script/

[font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]