SQL Server Collation Choices

  • Hi there, please can I get some opinions on the subject of Server Collation.

    We are in the process of configuring a new SQL Server 2005 and we wish to choose the best Collation setting - for backwards compatibility, and to ensure that we do not hit problems in the future.

    We realise that the main consideration is to have consistency between various environments.

    The potential choices are either “SQL_Latin1_General_CP1_CI_AS” or “Latin1_General_CI_AS”.

    Microsoft documentation shows that “SQL_Latin1_General_CP1_CI_AS” is the US default, but there is also much talk about the SQL prefixed collations being superseded.

    We do not make heavy use of UNICODE characters.

    We are currently favouring the use of “SQL_Latin1_General_CP1_CI_AS” – we have a legacy SQL 2000 database that will be transferred to this server.

    Please advise whether the choice of “SQL_Latin1_General_CP1_CI_AS” is likely to cause any un-toward problems in the future.

    Many thanks,

  • I'm not really a collation expert, but since 2000 (and in 2005), databases can have separate collations, so there isn't an impact with the server collation.

    If you are moving data between databases, you'd have to really dig in and see if there are potential issues with the data you use.

  • It's true that databases can have their own collations, but you can run into problems if they don't match the tempdb collation - it depends on how your code is written. I won't make any recommendations because I'm definitely not an expert in this area and it's been too long since I had to look into it in detail, but this might be one of those cases where running multiple instances of SQL is beneficial .... one instance for legacy apps, and one for whichever collation you choose to use going forward.

  • You should be aware that SQL Collations are going to be depreciated.

    Another thing is use of TERTIARY_WEIGHTS function that could kill performance of any server if you use SQL collations.

    In other terms - use of SQL collation and ANSI defined columns (char,varchar) in some cases(queries) yields use of TERTIARY_WEIGHTS function and table scans no matter if you have index on column or not. If you have big tables it could be a very big problem.

    So, if you can, go with clean install with windows collation. Also, change database collation and column level collations to the same - windows collation.

    A good starting point to go deeply is http://blogs.msdn.com/michkap/default.aspx

    rgds

    Sinisa

  • When we set up our SQL Server 2005 server, we did some research on the collation settings and we decided on using the Windows collation (Latin1_General_CI_AI) because of the sort issues using Unicode data and non-Unicode data and also because MS is going to deprecate the SQL Collations. http://support.microsoft.com/kb/322112

    However, when we migrated some of our SQL Server 2000 databases to SQL Server 2005, we did experience some issues because of the different collation settings between the TempDB and the migrated databases. We wrote a script to go through and alter the database, tables, and columns to change the collation settings of them to the Windows collation and everything works fine.


    Wendy Schuman

  • We are supporting SQL Server installations which has databases with non English data.

    Our practice is to keep the server and database collation as SQL_Latin1_General_CP1_CI_AS

    On the object level we use nvarchar for non English language string data and the respective collation.

    This serves the purpose well.

    Regards,

    Maz

    -----------------------------------------------------------[font=Arial Black]Time Is Money[/font][font=Arial Narrow]Calculating the Number of Business Hours Passed since a Point of Time[/url][/font][font=Arial Narrow]Calculating the Number of Business Hours Passed Between Two Points of Time[/font]

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply