how to write collation agnostic queries?

  • Hi All,

    A very warm happy new year to all of you!

    I have one question. I am writing a desktop application that was targeted to default SQL_Latin1_General_CP1_CI_AS collation. System fails to work if database is installed in any other server with a case sensitive collation say french, because field casing in queries does not match with field cases in database. Moreever there are issues with joins in tables having diffrent collations.

    Please note that I am installing my tables in a third party ERP database so I cannot control database collation. I can control collation of my tables but then there are issues when there is join between my tables and target database's tables

    Any ideas how to overcome this problem. Any help and best practices would be highly apprecaited.

    warm regards,

    Abhishek Jain

    jain_abhishek_77@yahoo.co.in

  • it's very important that when you create objects you don't include collate statements unless there is a precise reason to do so, that way your objects pick up the database collation, having objects of mixed collations in the same database is a real issue. To avoid issues with multiple database collations generally I've found using " COLLATE database_default " usually does the trick.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Thanks,

    I am already doing it. Still my queries fail in because they are written in case insensitive way. I would like to know how other people are doing it. One option is to take care of cases when you write your queries. There are issues when you compare data values. Like 'JOHN' and 'john' are different. Is there any option to coerce collations at connection level?

    thanks,

    Abhishek

  • you can't set a session/connection to use a certain type of collation...it's all in the database.

    we run into this occasionally, where a client will have a default collation of case sensitive,and then they create a database...which our program assumed to be case insensitive.

    In our case, we make the client run a script to change the db to case insensitive, and then change every single column of type text/ntext/char/varchar etc. to the proper collation as well.

    In reality, we have an option in our program do it...it builds the statments and runs them. Note it's just doing TOP 5, this can be a BIG perforamnce hit if you have tens of thousnads of columns.

    here's an example: this is for sql2000, you might need to tweek this for varchar(max) :

    DECLARE @collname varchar(128)

    SET @collname='SQL_Latin1_General_CP1_CI_AS'

    --change the db

    SELECT 'ALTER DATABASE ' + db_name() + ' COLLATE ' + @collname

    --change the defined length columns

    SELECT TOP 5

    'ALTER TABLE ' + sysobjects.name + ' ALTER COLUMN ' + syscolumns.name + ' '

    + TYPE_NAME(syscolumns.xtype) + '(' + CONVERT(VARCHAR,syscolumns.length) + ') '

    + ' COLLATE ' + @collname

    + CASE isnullable WHEN 0 THEN ' NOT NULL' WHEN 1 THEN ' NULL' END AS ALTERSTMT,

    sysobjects.name AS TBLNAME,

    syscolumns.name AS COLNAME,

    TYPE_NAME(syscolumns.xtype) AS DATATYPE,

    syscolumns.length as length

    FROM sysobjects

    INNER JOIN syscolumns ON sysobjects.id=syscolumns.id

    WHERE sysobjects.xtype='U'

    AND TYPE_NAME(syscolumns.xtype) IN ('char','varchar', 'nchar','nvarchar')

    ORDER BY TBLNAME,COLNAME

    --change the text columns

    SELECT TOP 5

    'ALTER TABLE ' + sysobjects.name + ' ALTER COLUMN ' + syscolumns.name + ' '

    + TYPE_NAME(SYSCOLUMNS.XTYPE)

    + ' COLLATE ' + @collname

    + CASE isnullable WHEN 0 THEN ' NOT NULL' WHEN 1 THEN ' NULL' END AS ALTERSTMT,

    sysobjects.name AS TBLNAME,

    syscolumns.name AS COLNAME,

    TYPE_NAME(syscolumns.xtype) AS DATATYPE,

    syscolumns.length as length

    FROM sysobjects

    INNER JOIN syscolumns ON sysobjects.id=syscolumns.id

    WHERE sysobjects.xtype='U'

    AND TYPE_NAME(syscolumns.xtype) IN ('text','ntext')

    ORDER BY TBLNAME,COLNAME

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • A very warm New Year to you too (and hopefully not because of global warming 🙂

    Are you asking to do something like this?

    Select * from

    table1 join table2 on whatever

    Where LCASE(table1.LastName) = LCASE(table2.LastName)?

    This would be "connection specific".

  • Another thought here...

    you can "cast" a column to a particular collation at select time.

    For Example

    Select distinct LastName COLLATE SQL_Latin1_General_CP1_CI_AS

    From testtable

    If you had a case sensitive column, you can use the COLLATE clause to force the collation you want at run-time without schema changes.

    There may be some gotchas that I don't know about with this method.

  • ah case sensitivity as well - I covered this on a presentation I did but ti's not really something I can drop into a post and I'd need to do some more testing - in the instance you talk about with JOHN and john then you must force both sides to a case insensitive collation or force the arguments to the same case, by using upper() for instance. It's a real pain isn't it?

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Dear All,

    Thank you very much for your expert views on my problem.

    As I develop ERP products that target multiple countries I wanted a generic solution to my problem. Here is what I learned and recapitulate for your expert advice

    1. When installing your database in thrird party database or when creating a database that would interact with a third party database (say SAP Busineses One, Dynamics) keep collation of your database similar to third party database

    2. When creating temporary tables keep the collation to database default (by default temporary tables have collation of tempdb database)

    3. Write your sql queries keeping in mind that they can be case sensitive (in case sensitive database table authors is different from table Authors). Preferably use a query generator tool whereever possible.

    4. Avoid mix casing when creating tables, columns, sp, triggers etc. Use lower case or upper case but not both. Like table authors can be created as

    CREATE TABLE AUTHORS

    AS

    [AU_ID] ID NOT NULL,

    [AU_LNAME] NVARCHAR (40)....

    5. Always use unicode types NVARCHAR, NCHAR etc

    6. Avoid conversion of string to dates. For example following query

    SELECT * FROM SALES WHERE SALE_DATE>'15-May-2006' would definitely fail in a german collation. Always use parameterized queries like

    SELECT * FROM SALES WHERE SALE_DATE>@adate. Similarly do not write logic based on named date parts, for example MONTH(@adate) would return MAY in english amd MAI in german.

    7. As most business applications are witten in programming languages like vb, .NET, JAVA it is imperative that code is langauge agnostic and is not compiled against any specific locale.

    This list is by no means exhaustive and need valauable suggestions by all experts. I request all of you and moderators of this forum to come forward and create a guideline for creating globalized sofwares using sql server

    thanks,

    Abhishek Jain

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply