Collation: Definition

I had a recent run in with collation problems and it got me started reading. As I read I started to realize that for a seemingly simple subject there was a lot to it. So I thought I would go over some of what I found out. First things first of course.

What is a collation? Collation is a property assigned to any character field that defines the sort and comparison rules for the field. The information I’m going over is mostly found in BOL under Collation and Unicode Support. It’s something like 10 pages long which just goes to show I wasn’t exaggerating about the depth of the subject.

Options

Prior to SQL Server 2012 there were four options associated with collation with a 5th added in SQL Server 2012. They are specified by appending them to the collation name along with a _. So Collation_CS_AS_KS_WS_SC is case sensitive, accent sensitive, kana sensitive, width sensitive and supplementary character sensitive.

Case sensitivity (CS/CI) – Will the sort/comparison take case (upper/lower) of the characters into account. If CS (case sensitive) it will and if CI (case insensitive) it will not.
Accent sensitivity (AS/AI) – Will the sort/comparison take into account accented characters (a/?). If AS (accent sensitive) then it will. If AI (accent insensitive) it will not.
Kana sensitivity (KS) – This option is specific to Japanese kana characters. There are two types of kana characters, Hiragana and Katakana. If KS is specified then the sort/comparison will be treat the kana characters differently. If it is not specified then they are treated the same. There is no KI.
Width sensitivity (WS) – I find this an interesting option. This determines if the storage (one byte or two) is taken into account in the sort/comparison. There is no WI. If WS is included then the collation is width sensitive. If it isn’t included then the collation is not width sensitive
UTF-16 Supplementary Characters (SC) – Thanks to Bob Beauchemin for pointing this one out to me. But basically this is new as of SQL 2012 and it supports UTF16 characters that require more than a 16bit (2 byte) word. It is supported in 90 level and up collations.

Collation sets

SQL Server supports three different collation sets. Windows collations, Binary collations, and SQL Server collations. I honestly found the differences rather confusing so I’ll let you read up on them yourself.

Code pages

In addition to the sort/comparison options, for non-Unicode data types (char/varchar/text) collation also determines the code page. Because non-Unicode types only have one byte per character they are more restricted in the number of characters available. The code page determines what character set is available. So for example Hebrew collations will use an entirely different set of characters than English ones. Unicode datatypes (nchar/nvarchar/ntext) don’t have the same problem. Because they have the additional byte per character they can hold a large enough number of possible characters that a code page isn’t really necessary.

Over the next few weeks I’m going to do a number of different posts around the surprisingly deep subject of collation.

As a start listen to Rob Farley’s hilarious talk on collation
Collation: Location
Collation: Expression
Collation: Correction -> Expression, Column, Database

Collation: Correction -> Instance

Collation: Temporary objects & variables: Demonstration

Collation: Confusion

Filed under: Collation, Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, collation, language sql, microsoft sql server, T-SQL

Book Review: Big Red - Voyage of a Trident Submarine

by Andy Warren

SQLServerCentral.com

Blogs

I've grown up reading Tom Clancy and probably most of you have at least seen Red October, so this book caught my eye when browsing used books for a recent trip. It's a fairly human look at what's involved in sailing on a Trident missile submarine...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-03-10

1,439 reads

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

by Robert Davis

SQLServerCentral.com

Blogs

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-23

1,567 reads

Inserting Markup into a String with SQL

by Phil Factor

SQLServerCentral.com

T-SQL

In which Phil illustrates an old trick using STUFF to intert a number of substrings from a table into a string, and explains why the technique might speed up your code...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-18

1,631 reads

Networking - Part 4

by Andy Warren

SQLServerCentral.com

Blogs

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

by Andy Warren

SQLServerCentral.com

Blogs

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-13

360 reads

Collation: Definition

Options

Collation sets

Code pages

Rate

Share

Share

Rate

Collation: Definition

Options

Collation sets

Code pages

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts