change collation on a datase

  • Dear SQL Server Central,

    So far it has been a nightmare, if you got your database collation wrong.

    One of my colleagues has written a script that will do most of the hard work for you, and report where it fails.

    Would it be possible for someone to test (and/or review) this script for us?

    Best regards,

    Henrik Staun Poulsen

    http://www.sql.udstyr.dk

    --*******************************************************

    SET NOCOUNT ON

    /*

    Description: Code to change collation on a database.

    Pre_condition:

    Post_condition:

    Input:

    Output:

    Written by: Karsten Laursen, Vestas Wind Systems A/S, Denmark

    Change History: $Date: 6-07-07 9:12 $ , $Author: KLAUR $, $Revision: 1 $

    */

    DECLARE @NewCollation sysname,

    @CurrentCollation sysname,

    @DatabaseName sysname,

    @SQLString nvarchar(max),

    @TableName sysname,

    @ColumnName sysname,

    @Nullable varchar(3),

    @DataType nvarchar(128),

    @CharacterLength int,

    @DashString varchar(100),

    @ErrorMessage nvarchar(max),

    @ErrorMessageLength int,

    @ErrorCount int,

    @PrintString varchar(100),

    @i int

    SET @NewCollation = 'Latin1_General_CI_AS'

    --SET @NewCollation = 'Danish_Norwegian_CI_AS'

    SET @CurrentCollation = CAST( DATABASEPROPERTYEX ( db_name(), 'Collation' ) AS sysname)

    SET @DatabaseName = db_name()

    SET @SQLString = ''

    SET @DashString = '---------------------------------------------------------------------'

    SET @ErrorMessage = ''

    PRINT 'Current database: ' + @DatabaseName

    PRINT 'Current collation: ' + @CurrentCollation

    PRINT 'New collation: ' + @NewCollation

    PRINT ''

    IF EXISTS (SELECT 1 FROM master.dbo.sysprocesses

    WHERE dbid = db_id()

    AND spid <> @@spid

    )

    BEGIN

    PRINT 'Database ' + @DatabaseName + ' could not be locked exclusively'

    RETURN

    END

    IF @NewCollation NOT IN (SELECT name FROM master.dbo.fn_helpcollations())

    BEGIN

    PRINT @NewCollation + ' is not a valid collation'

    RETURN

    END

    IF @NewCollation <> @CurrentCollation

    BEGIN

    SELECT @SQLString = 'ALTER DATABASE ' + @DatabaseName + ' COLLATE ' + @NewCollation

    BEGIN TRY

    EXEC (@SQLString)

    PRINT 'Database collation has been changed'

    END TRY

    BEGIN CATCH

    PRINT 'Database collation could not be changed due to the follwing errors:'

    PRINT @DashString

    EXEC (@SQLString)

    RETURN

    END CATCH

    PRINT ''

    END

    DECLARE @Columns TABLE

    (

    TableName sysname,

    ColumnName sysname,

    Nullable varchar(3),

    DataType nvarchar(128),

    CharacterLength int,

    Changed bit,

    Error bit,

    ErrorMessage nvarchar(2048)

    )

    INSERT @Columns

    (

    TableName,

    ColumnName,

    DataType,

    CharacterLength,

    Nullable,

    Changed

    )

    SELECT T.TABLE_NAME,

    C.COLUMN_NAME,

    C.DATA_TYPE,

    C.CHARACTER_MAXIMUM_LENGTH,

    C.IS_NULLABLE,

    0

    FROM INFORMATION_SCHEMA.COLUMNS C

    JOIN INFORMATION_SCHEMA.TABLES T

    ON C.TABLE_NAME = T.TABLE_NAME

    WHERE C.DATA_TYPE IN ('char', 'nchar', 'varchar', 'nvarchar', 'text', 'ntext', 'sysname')

    AND C.COLLATION_NAME <> @NewCollation

    AND T.TABLE_TYPE = 'BASE TABLE'

    IF EXISTS (SELECT 1 FROM @Columns)

    BEGIN

    PRINT 'The following columns has been changed:'

    PRINT @DashString

    END

    ELSE BEGIN

    PRINT 'No changes needed'

    RETURN

    END

    WHILE EXISTS (SELECT 1 FROM @Columns WHERE Changed = 0 AND Error IS NULL)

    BEGIN

    SELECT TOP 1 @TableName = TableName,

    @ColumnName = ColumnName,

    @DataType = DataType,

    @CharacterLength = CharacterLength,

    @Nullable = Nullable

    FROM @Columns

    WHERE Changed = 0

    AND Error IS NULL

    ORDER BY TableName, ColumnName

    SELECT @SQLString = 'ALTER TABLE ' + @TableName

    + ' ALTER COLUMN ' + @ColumnName

    + ' ' + @DataType

    + CASE WHEN @CharacterLength = -1 THEN '(max)'

    WHEN @DataType IN ('text', 'ntext') THEN ''

    ELSE '(' + CAST (@CharacterLength AS nvarchar(10)) + ')' END

    + ' COLLATE ' + @NewCollation

    + ' ' + CASE @Nullable WHEN 'NO' THEN 'NOT NULL' ELSE 'NULL' END

    BEGIN TRY

    EXEC (@SQLString)

    UPDATE @Columns

    SET Changed = 1

    WHERE TableName = @TableName

    AND ColumnName = @ColumnName

    PRINT @TableName + ', ' + @ColumnName

    END TRY

    BEGIN CATCH

    UPDATE @Columns

    SET Error = 1,

    ErrorMessage = ERROR_MESSAGE()

    WHERE TableName = @TableName

    AND ColumnName = @ColumnName

    SELECT @ErrorMessage = @ErrorMessage + @TableName + ', ' + @ColumnName + ': ' + ErrorMessage + char(10)-- + char(13)

    FROM @Columns

    WHERE TableName = @TableName

    AND ColumnName = @ColumnName

    END CATCH

    END

    IF @ErrorMessage <> ''

    BEGIN

    SELECT @ErrorCount = COUNT(*)

    FROM @Columns

    WHERE Error = 1

    SELECT @PrintString = 'The following errors occured (' + CONVERT(varchar, @ErrorCount) + '):'

    PRINT ''

    PRINT @PrintString

    PRINT @DashString

    SELECT @ErrorMessageLength = LEN(@ErrorMessage)

    SET @i = 1

    WHILE @i <= @ErrorMessageLength

    BEGIN

    PRINT SUBSTRING(@ErrorMessage, @i, 4000)

    SELECT @i = @i + 4000

    END

    END

  • Check following link:

    http://msdn2.microsoft.com/en-us/library/ms174269.aspx

    --USE THIS TO CHANGE DATABASE COLLATION

    http://msdn2.microsoft.com/en-us/library/aa197950(SQL.80).aspx

    --USE THIS TO CHANGE INSTANCE COLLATION

    ---------------------------------------------------
    "Thare are only 10 types of people in the world:
    Those who understand binary, and those who don't."

  • hi,

    This only change the collation on the database, not the objects in the database.

    Best regards,

    Henrik Staun Poulsen

    http://www.sql.udstyr.com

  • If your existing data may contain extended ascii (Γ₯Àâ and such), you'd probably want to store away your data before changing collation.

    The 'old and proven' way is something along these lines...

    1) Script out the db and all objects in it. (remove all references to COLLATION that is wrong)

    2) BCP out all data from all tables. (store in a safe place, or two..)

    3) Drop the db.

    4) Create db with the desired collation

    5) Create all objects from the script(s) (tables, procs, views, triggers, users etc..)

    ... make sure that the script(s) has no references to the 'old' collation

    6) BCP all data back in again.

    7) Done, go have a beer. =;o)

    /Kenneth

  • Hi Kenneth,

    My colleague has written the script in order to invalidate the "old proven" way.

    We have tested it on our database, but have not had the currage to go live with it yet.

    The problems we have found, have been fixed, or the script has been enhanced with a warning message, on the stuff it cannot handle.

    Hence this post, in the hope that somebody has the time to test it, or review it.

    It might also help other people without the skill or time to develop such a script.

    I also forgot to tell that it is an in-place fix; you do not need an additional copy of your data.

    That is one big advantage over your method!

    Thank you for posting,

    Henrik

  • When you do COLLATE it DOES *NOT* change the data that is *already* there. Keneth is right, BCP-OUT->Rebuild->BCP-IN is the way to go!

    Cheers,


    * Noel

  • Noel,

    Are you saying that "ALTER TABLE xx ALTER COLUMN yy COLLATE zz" does not change the data in the table, that it only changes how the table appears to the world?

    So could we get away with a "SELECT a,b,c INTO temp_xx" and then a "DROP TABLE xx", rename table ?

    We do not have disk space for an additional copy of the entire database, and only maybe we have space for an extra copy of our biggest table.

    I know that USB drives are not that expensive, but they require site visits (which are very expensive).

    TIA

    Henrik Staun Poulsen

  • Hmmm... perhaps the old arcane ways are up for revision...

    Did a very tiny test in tempdb on a 2005 box, and it indeed looks like

    alter table alter column collate actually seems to change the ascii values of data already stored in the column...

    /Kenneth

  • Kenneth,

    Hurraaaa. πŸ™‚

    Thank you for testing, and sharing.

    Henrik

  • Heh. Well, you need to do your fair share of testing also.

    After all, it's your data =;o)

    Especially look out for 'weird' chars (from an ascii point of view) that may not convert as expected.

    It all depends on if you do have extended ascii, from which collation to which collation you're going etc.

    Here's a small illustration of what may be an issue.. (or it may not)

    [font="Courier New"]

    select'Ø' collate SQL_Latin1_General_Cp1_CI_AS, ascii('Ø' collate SQL_Latin1_General_Cp1_CI_AS)

    select'Ø' collate SQL_Latin1_General_Cp437_CI_AS, ascii('Ø' collate SQL_Latin1_General_Cp437_CI_AS)

    ---- -----------

    Ø 216

    (1 row(s) affected)

    ---- -----------

    O 79

    (1 row(s) affected)

    [/font]

    /Kenneth

  • Be warned that if you change the collation of the database so that it is different from the rest of the server, and your application uses temp tables, you could be in for a whole world of grief, especially if those temp tables are joined or otherwise compared to tables in the database.

    John

  • This: http://support.microsoft.com/kb/325335 is the official way... funny it says the review happened in 2007 but it dos not mentions 2005 πŸ˜‰


    * Noel

  • John,

    >>Be warned that if you change...

    Which is exactly why we want to change the collation.

    We upgraded a lot of databases from sql 2000 to sql 2005, and the tempdb no longer had the same collation as the main database.

    So we added COLLATE DATABASE_DEFAULT to a lot of our queries.

    Please note that most people would write COLLATE Latin1_General_CI_AS (or whatever), but "Collate database_default" is a more generic, and defensive coding style.

    Now we would like to get rid of these "Collate Database_Default" because they cause performance problems (non-optimal usage of indexes).

    So I hope that we arrive at a script that works.

    Best regards,

    Henrik

  • Noel,

    But it requires room for two copies of your database.

    Our main database (at our biggest site) is about 800 GB, so having a backup ready at hand, and an additional copy; it adds up to 2,4 TB.

    If there is a way, I would perfer an in-place solution.

    Best regards,

    Henrik

  • Kenneth,

    Thank you for bringing this to my attention.

    Nice way to test the solution.

    We have these collations

    Danish_Norwegian_CI_AI

    Danish_Norwegian_CI_AS

    SQL_Latin1_General_CP1_CI_AS

    SQL_Scandinavian_Pref_CP850_CI_AS (the only problematic, as far as I can see)

    Latvian_CI_AS (the next one on the list after Latin1_General_CI_AS, but this is a site in the UK, so it only contains 7bit ASCII. Big Installation error)

    and lastly:

    Latin1_General_CI_AS (the prefered end result)

    So a lot of work remains.

    Best regards,

    Henrik

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply