SQLServerCentral Article

Beware of Mixing Collations: Converting Collations

,

Beware of Mixing Collations - Part 2

Converting Collations

With SQL Server 2000 you are able to create databases or columns with a different collating setting than the server (discussed in Part 1: https://www.sqlservercentral.com/articles/beware-of-mixing-collations-part-1). This is especially useful when you are trying to restore from a SQL Server 7.0 database backup, where you are not sure of the collating sequence for the SQL Server 7.0 machine. If after your restore you desire to change the collation sequence, then SQL Server 2000 will allow this. Depending on which collation settings you want to change will determine the method, and how much effort it will take to pull off the collation change. This article will explain a couple of methods I have used to successfully change collation settings.

There are three distinct collation settings. The first one is the server collation setting that is initially set when you install SQL Server. There is the database collation setting, which defaults to the server collation setting, or can be specifically defined to be different than the server collation setting when creating a database. Lastly there are collation settings for columns, that by default take the database collation setting, or if you wish you can specify a collation setting when creating a column.

Changing the collation setting for a server is quite complex, so I will save that for the end. Although saying that doesn't mean that changing the collation setting for a database and all the character columns in the database is a piece of cake either. Lets first look at how you might change the collation setting for a given column.

Changing a Columns Collation Setting

To alter the collation setting for a given column you use the ALTER TABLE statement. Columns that are referenced in a foreign key, check constraint, index, and/or have

distributed statistic cannot be altered without first dropping the column

constraints.

Let's go through a couple of examples. The first example will change the collation of a varchar column from Danish_Norwegian_CS_AS to Latin1_General_CI_AS. This example uses the “ALTER COLUMN” clause of the “ALTER TABLE” statement to make the column collation conversion.

CREATE TABLE Test_tbl
  (Record_id   int PRIMARY KEY,
   Name     varchar(50) COLLATE
Danish_Norwegian_CS_AS   NOT NULL
  )
GO
select * from information_schema.columns
go
—
— Here is where the collation setting is changed
—
ALTER TABLE Test_tbl ALTER COLUMN Name
            varchar(50)
COLLATE Latin1_General_CI_AS NOT NULL
GO
select * from information_schema.columns

If you run the above code then review the output displayed from the information_schema.columns view, you can see that the collation setting for the column NAME was changed from Danish_Norwegian_CS_AS to Latin1_General_CI_AS.

Now lets modify the above example slightly, by making the NAME column in table Test_tbl the primary key. By doing this you now will not be able to alter the NAME columns collation setting without first dropping the primary key. As you can see in the example below, I altered the table to drop the primary key constraint, then altered the column collation, and finally re-created the dropped primary key constraint.

CREATE TABLE Test_tbl
  (Record_id   int ,
   Name     varchar(50) COLLATE
Danish_Norwegian_CS_AS   NOT NULL
constraint pk_name primary key
  )
GO
select * from information_schema.columns where table_name
= ‘Test_tbl'
go
—
— Drop the primary key constraint
—
alter table Test_tbl Drop constraint pk_name
—
— Here is where the collation setting is changed
—
ALTER TABLE test_tbl ALTER COLUMN Name
           
varchar(50)COLLATE Latin1_General_CI_AS NOT NULL
—
— Re-create the primary key constraint
—
alter table Test_tbl add constraint pk_name primary key
(name)
GO
select * from information_schema.columns where table_name
= ‘Test_tbl'
 
drop table test_TBL

Now imagine a database that contained a large number of tables and a fair number of constraints (foreign key, check, and index). The process of identify all the constraints, then building the script to drop the constraints, altering each character column collation and the re-creating the constraints sounds like a lot of work, right? Well it is! It can be done, but it will be a time consuming, labor-intensive process. When using this method you might be in for a few small bumps in the road, as you try to ALTER COLUMNS that are still referenced by constraints.

I've found this process to be too frustrating for most cases, except when I only have a few columns that need their collation settings changed. If you have a large number of character fields then you might want to read the section below, on migrating a database and columns to a new database with a new collation settings.

Changing the Database Collation Setting

Each database has a default collation setting. For each new character field created, that does not have a COLLATE clause, this default setting is used. If you desire to change the default collation setting for a database, you will need to use the ALTER DATABASE command. Here is an example that sets the default collations of the SAMPLE_DB database to French_CI_AS collation:

ALTER DATABASE SAMPLE_DB COLLATE French_CI_AS

Be aware that changing the default collation of a database does not change the collation settings of any columns in existing tables. The ALTER COLUMN clause of the alter table statement will need to be used to alter the collation settings for any existing columns in user-defined tables, as described above.

Changing the Server Collation Setting

Changing the collation settings isn't an easy task. There is no ALTER SERVER command to help change the collation setting for the server. In order to change the server collation you will need to rebuild the server.

This means you will need to export all the data in the existing databases. Gather up and/or generate all the scripts to recreate the user databases and objects in those user databases. Drop all the user databases and rebuild the MASTER database. When you rebuild the MASTER database you will specify your new collation setting.

Note that if you are generating table create scripts from a SQL Server 2000 database then they might contain COLLATE clauses. Prior to executing generated scripts scan your CREATE TABLE and DATABASE scripts to verify that there are no COLLATE clauses. If you don't do this then potentially your newly migrated columns might end up with their original collation settings.

Once you have rebuilt the MASTER database then you can re-create all the user databases. Each object in each user database will need to be re-created with the collation setting of the database. If your create table statements do not have any COLLATE clauses then all columns will be given the default collation setting for the database. Once all your tables are created you will need to import the data using the files exported from the database that contained the wrong collation settings.

Once again not a simple task, but it can be done. This is why choosing your collation setting up front is very important.

Alternative Method for Migrating Database and All Columns Within to a New Collation

Now if you need to change all the collation settings in a database, then you might consider migrating your data to a new database that has the desired collation setting. This typically is how I migrate from one collating sequence to another.

I accomplish this by performing the following multi-step process. This process can be done in a number of different ways, I will walk you though the steps I normally use.

Step 1: Create a new database with the desired default collation setting.

Step 2: Generate all the CREATE TABLE statements from the database that has the wrong collation settings. Only generate the create table statements. Don't generate any constraints. If your scripts where generated from a SQL Server 2000 database remember to review your scripts for COLLATE clauses (see note above).

Step 3: DTS the data from the database that has the wrong collation settings, to your newly created database.

Step 4: Generate all the constraints from the database with the wrong collation settings. Apply the generated constraints to your newly created and populated database. If you don't have clean data you might have to apply some of the constraints with the NOCHECK option.

Step 5: Generate the scripts to create all views, stored procedures, roles, user and/or any other objects you might need from the database with the wrong collation settings and execute the generated script on the new database.

Step 6: Backup your newly created database and restore it over the top of the database with the wrong collation. Of course only do this if you want to replace your original database with your new database, which has a new collation setting.

This method is not simple, and is still error prone. You need to take care when generating scripts that nothing is missed and the scripts generated build things in the proper order.

I've found that SQL Server tends to generate things in alphabetical order, which is not always an appropriate order. I've resorted to using a third party tool "SQL Compare" from Red Gate Software for generating my scripts. I also used this tool to compare both old and new databases to make sure no objects were missed when migrating objects from the database with the wrong collation settings to the new database.

Even though this method is time consuming I've found that it is a reliable method to convert from one collation to another. I'm sure there might be other methods out there for converting collation settings.

Conclusion

Converting from one collation setting to another is not easy. There is no simple tool to perform this transformation. In order to minimize collation issues it is best to require all server databases to have the same collation settings, even if it requires you to migrate your data from one collation setting to another.

Rate

4.33 (3)

You rated this post out of 5. Change rating

Share

Share

Rate

4.33 (3)

You rated this post out of 5. Change rating