Collation Question - Going from Binary to SQL_Latin1_General_CP1_CI_AS

  • We have a database that is currently SQL_Latin1_General_CP850_BIN and would like to change the collation to SQL_Latin1_General_CP1_CI_AS. Would the only impact on the data be the sort order?

    The change would be made using ALTER TABLE...ALTER COLUMN.

    Thanks, Dave

  • declare @x1 varchar(20), @x2 varchar(20)

    select @x1 = 'a', @x2 = 'A'

    if @x1 collate SQL_Latin1_General_CP850_BIN = @x2

    print 'Equal 1' else print 'Not equal 1'

    if @x1 collate SQL_Latin1_General_CP1_CI_AS = @x2

    print 'Equal 2' else print 'Not equal 2'

    Results:

    Not equal 1

    Equal 2

  • That makes sense if I were comparing the two collations, but after running the ALTER TABLE...ALTER COLUMN all tables will be the default sort order/collation. Wouldn't comparisons at that point no longer be a problem or are you saying the users may see different results returned by the application following the collation changes as opposed to before the collation changes were made?

    Thanks, Dave

  • That colation change will impact sort order of the data, because:

    SQL_Latin1_General_CP850_BIN: is binary sort , on Code Page 850

    SQL_Latin1_General_CP1_CI_AS: is Dictionary sort , case-insensitive , on Code Page 1252

  • moh (8/22/2008)


    That colation change will impact sort order of the data, because:

    SQL_Latin1_General_CP850_BIN: is binary sort , on Code Page 850

    SQL_Latin1_General_CP1_CI_AS: is Dictionary sort , case-insensitive , on Code Page 1252

    To add a little flavor text to the above: a binary sort is implicitly case sensitive, because a lower case letter has a different binary representation than an upper case letter (otherwise, how would the computer know which is which?). So you not only impact sort order, but could see some modest changes in performance on sorting, given a huge enough collection of objects to sort.

    Also, you did not discuss HOW you would do this. I remember seeing knowledge base articles recently, when I was trying to change a database from CS to CI. The bottom line - it was easier, simpler, and more complete to delete the SS instance and reinstall with the correct sort order than to modify each table and field individually - the risk of error by missing an important table or field may be too great. IF I remember correctly, you back up your DATA, uninstall, reinstall with the correct collation sequence (you want), and then restore your database.

  • DBADave (8/21/2008)


    That makes sense if I were comparing the two collations, but after running the ALTER TABLE...ALTER COLUMN all tables will be the default sort order/collation. Wouldn't comparisons at that point no longer be a problem or are you saying the users may see different results returned by the application following the collation changes as opposed to before the collation changes were made?

    Thanks, Dave

    Before 'a' <> 'A', after 'a' = 'A'

  • Thanks everyone. Here's the tricky part. This is a vendor's database. We have there application's DBs stored on a shared development server. We have yet to migrate them to production. 7 of their DBs use the default SQL collation, while one uses binary. We asked if that DB could originally be installed using the default SQL collation and they said yes. Somewhere along the line they provided us with an updated version of their DB that reset the collation to binary. Now we are trying to determine the impact of going back to the SQL default. Sort order is one area I warned them about and differences in data comparisons is another. I performed these steps to make the change.

    1. Script all PK, FK and Indexes

    2. Script Views

    3. Dropped all PK, FK and Indexes

    4. Ran script to loop through all tables and execute ALTER TABLE...ALTER COLUMN, setting the collation back to the SQL default

    5. Ran script to add PK, FK and Indexes

    6. Ran script to create views

    The collation at the DB level was reset to the SQL default prior to running the above steps.

    My question is should I have created a blank database and used SSIS to import the data into the tables defined with the default collation? If the only issue they will experience is a different sort order they can live with that, however if the functionality of the application could change due to changing the sort order from binary to SQL default, then I have concerns.

    Thanks again

  • IF you are dealing with a vendor's DB, why are you attempting to reset the collation? What conflicts are you encountering which motivate the need to change what the vendor may have (sloppily) decided makes their application work? I'm not saying that you don't know better than the vendor, you probably do. But if they're a vendor, then THEY are getting paid $, but you are stuck cleaning up behind them.

    It's good to try and standardize across all components of their processing, so that all your processes can remain standardized. However, sometimes developers have 'subtle' optimization rationales, or by misunderstanding an error they encounter, offload the burden onto a database rather than fix their own code (sometimes for very good reasons! - at least as they see it :P).

    And, if you break the vendor's application by changing the collation, who has to fix it? You or the vendor?

  • We asked the vendor at the beginning of the project why binary was necessary and they initially said it would yield a 30% performance gain, but they could not explain why. They went on to say it had to be enabled at the instance level. We tried to explain that collation could be set on a DB by DB basis, but they did not appear to understand. We told them our desire is to save $$$ in the development environment and use an existing shared DB server. We could not change the collation setting at the instance level on this server. They spoke with their developers and told us binary collation is not necessary for the specific DB in question. They have since stated there would be no performance difference with using binary or

    SQL Server's default collation. Since they approved our changing the collation to the database in question we assumed they knew what they were talking about. We shouldn't have assumed. 🙁

  • steve smith (8/22/2008)


    And, if you break the vendor's application by changing the collation, who has to fix it? You or the vendor?

    why not to ask the vendor frankly to install application in the code page you want, and let them have the responsibility and risk for that change

  • If it's not custom code, having vendors implement such a change for their 'COTS' product can be a tremendous burden on them. Especially if only one customer asks for the change.

    However, there should be a 'formalized' agreement or memorandum of understanding that changing the collation sequence within the database does not nullify any obligations the vendor has to supporting the product. Given the history of correspondence, the vendor should have no problems with such a formality.

  • Having the vendor agree to the change, while continuing support of their application isn't the issue. They agreed to the change during the planning stages of the project and the initial implementation contained the correct collation. Sometime during the first couple of months of the project a database backup was provided to us by the vendor and the backup was restored to our development database. A month or two after the restore we noticed the database-level collation was set to binary so we assumed it was the result of the restored database. The vendor agreed and told us we could set it back to the default, which we did. Last week the vendor provided us with a script that dropped and recreated a table. Shortly after the table change was made a developer reported a collation error. We determined the table was recreated with the default collation, because that is how the database is defined and no specific collation statements were in the CREATE TABLE script. When problems continued we did some more digging and discovered that all tables, views and SPs were using Binary, with the exception of the recently recreated table. We informed the vendor of the problem and the steps needed to fix the issue. They looked to us to correct the problem. While we could have pushed back, because of time constraints and resource availability we offered to fix the issue. I prefer they make the changes since it is their product, however we have deadlines that must be met so our DBAs had to get involved. My point of this post was to get a complete understanding of the potential data issues resulting from a collation change from Binary to SQL default and provide both our users and the vendor with a disclaimer.

    Dave

  • Sadly, I am not longer shocked by this level of incompetence by a vendor. Actually, I have come to expect it.

    Another big problem to watch is a mismatch between the collation of tempdb and the application database. If they are different, and the code was not written to allow for that, you will encounter collation errors with temp tables.

  • I would think your best bet for minimal problems would be to set up another instance on that server using the binary collation, server resource permitting of course.

  • Good point about tempdb.

    The initial recommendation was to install a second instance with binary collation, but we pushed back on the vendor asking why a dedicated instance is needed to support one database. If they could prove the use of binary helps the application performance enough to warrant the cost of an additional instance we would have no problem installing the instance. However, after initially stating there was a 30% gain in performance they recently told us there is no performance gain with their application if the 8th database is using binary collation or the SQL default. Now all I'm trying to do is get a good picture of what problems the business users may encounter, despite the vendor being confident the collation change will not cause issues.

    Thanks, Dave

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply