Collation: Confusion

I started reading about collations after I had a recent run in with them. As I read I started to realize that for a seemingly simple subject there was a lot to it. So I thought I would go over some of what I found out.

In Europe, Canada and other locations where multiple languages are a matter of course I imagine that collation is dealt with on a regular basis. Here in the US we tend to ignore and take it for granted unless we have a specific need (case sensitivity for example). I’ve found a good analogy for collation is honey bees. Perfectly safe if you give them some respect and are careful. However if you get lazy and stop paying attention you are likely to get stung. And boy did we get stung!

I work for a large company that has a lot of databases from a number of different locations, some purchased, some created in house. Because of this we have a very mixed bag. Add to it the fact that we tend to move data around between systems and you can see how I’ve had a number of run-ins with collation over the years. Recently however I had one that was at once the most difficult to fix and the easiest to prevent.

I received a call from one of my co-workers one Saturday afternoon. A very frustrated call at that. He was working on moving some databases off of an old server and on to a new one. It should have been a 1-2 hour job at most, and that includes user testing. He was now at 10-12 hours and counting. After a bit of digging it turned out that the collation on the new server didn’t match the collation on the old server. Thus the users were seeing frequent collation errors. Over the next couple of hours I helped him change the collation of the databases to the new collation. At 16 hours (for a 2 hour job) the users called it and we reverted back to the original server.

Come Monday morning the question of the hour was “Why didn’t this show up when we moved the development and test environments?” A bit of digging and we discovered something rather disturbing. The development and test instances were one collation and production was a different one. It turns out one member of the team installed the first 2 instances and a different member of the team installed production. From there it was just a simple mix up in communication.

Now planning begins. First we had to postpone the database move while we corrected the problem. We decided to correct the production environment rather than development and test since they were the same as most of the rest of our environment. The production environment was small fortunately. Only 8 user databases none of which were large, and of course the system databases. I went through the joys of fixing the collation in the last few posts but nothing can really convey how much fun it really is.

First we backed everything up of course.
Then detached everything but the system databases.
Next a member of our operations team changed the collation on the index by rebuilding the system databases.
And last but painfully not least we reattach and “fixed” each of the user databases.

So here is the moral of the story (since my favorite stories have morals). A simple 10 second mistake/breakdown in communication cost us:

16 hours from the first attempt at moving the databases
loss of reputation from the mistake
a combined 15-20 hours to correct the mistake

I HIGHLY recommend making sure that your sdlc stack of instances (development/test/staging/production) is the same collation from top to bottom. And for that matter treat collation as what it is. Something important that can cause a great deal of difficulty if not paid attention to.

Over a number of different posts I’m going to discuss some of the surprisingly deep subject of collation.

As a start listen to Rob Farley’s hilarious talk on collation
Collation: Definition
Collation: Location
Collation: Expression
Collation: Correction -> Expression, Column, Database
Collation: Correction -> Instance
Collation: Temporary objects & variables: Demonstration

Filed under: Collation, Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, collation, language sql, microsoft sql server, T-SQL

Book Review: Big Red - Voyage of a Trident Submarine

by Andy Warren

SQLServerCentral.com

Blogs

I've grown up reading Tom Clancy and probably most of you have at least seen Red October, so this book caught my eye when browsing used books for a recent trip. It's a fairly human look at what's involved in sailing on a Trident missile submarine...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-03-10

1,439 reads

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

by Robert Davis

SQLServerCentral.com

Blogs

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-23

1,567 reads

Inserting Markup into a String with SQL

by Phil Factor

SQLServerCentral.com

T-SQL

In which Phil illustrates an old trick using STUFF to intert a number of substrings from a table into a string, and explains why the technique might speed up your code...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-18

1,631 reads

Networking - Part 4

by Andy Warren

SQLServerCentral.com

Blogs

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

by Andy Warren

SQLServerCentral.com

Blogs

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-13

360 reads

Collation: Confusion

Rate

Share

Share

Rate

Collation: Confusion

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts