Collation oddities

  • Hi All, this is my first post, so some help would be appreciated.

    Here is my problem:

    I have a SQL server db which has to maintain data from multiple countries. Of of the characters we are having trouble with is the 'ę' - know as the e-with-ogonek. If I try to save to my db when using normal Latin1_General_CI_AS_WS collation, all I get is a normal 'e' character in the field. Changing the collation on the field to Polish_CI_AS_WS has no effect, we get exactly the same symptoms. However, if I change the collation for the entire DB, it works fine. This is a complete pain, as we also have to support a number of other languages/codepages at the same time - I really don't want to have to change the DB collation just for one character, and possibly break half a dozen others. This all seems very odd to me - why does setting the collation on the field have no effect?

    To test this, I did the following: I created two databases - database A with Latin1 collation and database B with Polish collation. Each db contains one table. The structure is as below.

    USE [TestUnicode]

    GO

    /****** Object: Table [dbo].[TextTable] Script Date: 01/22/2009 15:40:18 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE TABLE [dbo].[TextTable](

    [testText] [nvarchar](50) NOT NULL,

    [TextID] [int] IDENTITY(1,1) NOT NULL,

    CONSTRAINT [PK_TextTable] PRIMARY KEY CLUSTERED

    (

    [TextID] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    I then attempting all permutations of collation at database and field level. I also tried setting the field datatype of 'testText' to nvarchar(MAX), ntext and 'xml'.

    Everytime I had exactly the same result - it will only read the e-with-oganek correctly when the entire database collation is set to Polish. Why is this? Does setting collation at field level do anything at all?

  • Collation affects the way that data is compared and sorted, not how it is stored. As long as you are using a unicode datatype then you should be able to store that character.

    How are you inserting your data into the table ?

    are you including the leading N for unicode data?

  • Thanks for the reply, and yes I am using the leading 'n'.

    First off, I agree with you, collation should only affect how data is compared and sorted. However, that is not what is happening. I can only store the characters with ogoneks when the entire database is using Polish collation. This is precisely what my post is about.

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply