An nHibernate Head Scratcher

  • There are a couple of ways of fixing it. The one that springs to mind is as mentioned in Todd's blog. Change the mapping file in nHibernate from using String to using AnsiString.

  • David, this was an interesting read. Thanks for your post. We have used the script here:

    http://blogs.msdn.com/b/ikovalenko/archive/2006/12/03/alter-database-alter-collation-forced.aspx

    when changing database collations. Hopefully you can also put it to good use when the time comes.

  • Rather than changing the server default collation just to avoid the issue on the next database creation, how about changing the collation in the model database?

  • We had a similar problem in one of our apps. The solution for us was to use the correct nhibernate types in the mapping.

    Like you, our tables have Varchar. We noticed that hibernate was generating nvarchar for the data types.

    Per the hibernate documentation, string will generate nvarchar in sql. The use of AnsiString will generate varchar parameter types and Sql Server is happy.

    I'm on my smartphone at the moment, so I can't provide specifics. When I'm at a pc, I'll update with specific docs and code samples.

  • I know you aren't happy about using the NVarchar datatype on the table columns but still I have this question. If you converted the varchar in the table to be nvarchar to match the parameter coming in from the Unicode ORM framework would it have caused a seek, solving the problem, and negating the need to change coallation settings on the DB/Server.

    It's so much fun to watch someone else assume the issue to be the lowest level, index/data corruption type issue, and finally work their way back up to 30,000 feet and find the issue right outside the window. I wonder if the Guru DBAs would have started with coallation? Gosh, I hope not.

    Great Article!

  • Thomas Keller (4/11/2011)


    Rather than changing the server default collation just to avoid the issue on the next database creation, how about changing the collation in the model database?

    Nice thought, but you can't change the collation sequence on a system DB, short of rebuilding the master

  • Nice article, David. Good problems solving skills.

  • From the article (towards the end):

    The answer lies in a bug in the cluster installer for SQL2008. As I live in the UK the collation sequence chosen was Latin1_General_CI_AS which is for UK English but the installer ignores it and defaults back to SQL_Latin1_General_CP1_CI_AS which is for US English. Unless you think to check the collation sequence post build it is a very easy thing to miss.

    Hi David. I'm not sure that this is a bug in the installer. More likely just a bad choice on the part of Microsoft to continue using a bad / obsolete default value. All Latin1_General and SQL_Latin1_General Collations, with the exception of 2 of the SQL_Latin1_General Collations, are for US English (those other 2 are for Turkish). There are no Collations (of either type) that are for GB English.

    Also, to be clear, this issue, while it relates to VARCHAR vs NVARCHAR, is more so an issue between SQL Server Collations and Windows Collations. The problem is that VARCHAR data uses the old, simplistic sort orders when using SQL Server Collations, but uses Unicode sorting rules when using Windows Collations. The problem you experienced is that the VARCHAR column was using a SQL Server Collation and the rows were ordered in a particular way according to that sort order. But, due to datatype precedence, when compared with an NVARCHAR variable or string literal, the column values (sourced from the index) were converted to NVARCHAR. And, since the NVARCHAR sorting rules produce a different ordering than the static sort order of the SQL Server Collation, the physical order of the rows in the index is no longer correct.

    Because VARCHAR columns using Windows Collations use the same Unicode sorting rules as NVARCHAR data for that same Collation, implicitly converting to NVARCHAR is a non-issue as the physical order of the rows in the index won't change, even if the data changes from an 8-bit encoding to UTF-16 LE. This is why you will still see a CONVERT_IMPLICIT() when comparing a VARCHAR column (using a Windows Collation) to an NVARCHAR variable or string literal, because the conversion still has to happen. But, being in the same / correct physical order upon being converted means that the index is still completely valid, hence you still get a SEEK instead of a SCAN :-). I have a post specifically about this behavior:
    Impact on Indexes When Mixing VARCHAR and NVARCHAR Types
    Also, as it relates to changing the Collation of the Instance / system DBs, etc. I recently posted about the quick-and-mostly-easy (though undocumented / unsupported) method here:

    Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?

    Take care,
    Solomon..

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky - Tuesday, June 19, 2018 1:50 PM

    From the article (towards the end):

    The answer lies in a bug in the cluster installer for SQL2008. As I live in the UK the collation sequence chosen was Latin1_General_CI_AS which is for UK English but the installer ignores it and defaults back to SQL_Latin1_General_CP1_CI_AS which is for US English. Unless you think to check the collation sequence post build it is a very easy thing to miss.

    Hi David. I'm not sure that this is a bug in the installer. More likely just a bad choice on the part of Microsoft to continue using a bad / obsolete default value. All Latin1_General and SQL_Latin1_General Collations, with the exception of 2 of the SQL_Latin1_General Collations, are for US English (those other 2 are for Turkish). There are no Collations (of either type) that are for GB English.

    Also, to be clear, this issue, while it relates to VARCHAR vs NVARCHAR, is more so an issue between SQL Server Collations and Windows Collations. The problem is that VARCHAR data uses the old, simplistic sort orders when using SQL Server Collations, but uses Unicode sorting rules when using Windows Collations. The problem you experienced is that the VARCHAR column was using a SQL Server Collation and the rows were ordered in a particular way according to that sort order. But, due to datatype precedence, when compared with an NVARCHAR variable or string literal, the column values (sourced from the index) were converted to NVARCHAR. And, since the NVARCHAR sorting rules produce a different ordering than the static sort order of the SQL Server Collation, the physical order of the rows in the index is no longer correct.

    Because VARCHAR columns using Windows Collations use the same Unicode sorting rules as NVARCHAR data for that same Collation, implicitly converting to NVARCHAR is a non-issue as the physical order of the rows in the index won't change, even if the data changes from an 8-bit encoding to UTF-16 LE. This is why you will still see a CONVERT_IMPLICIT() when comparing a VARCHAR column (using a Windows Collation) to an NVARCHAR variable or string literal, because the conversion still has to happen. But, being in the same / correct physical order upon being converted means that the index is still completely valid, hence you still get a SEEK instead of a SCAN :-). I have a post specifically about this behavior:
    Impact on Indexes When Mixing VARCHAR and NVARCHAR Types
    Also, as it relates to changing the Collation of the Instance / system DBs, etc. I recently posted about the quick-and-mostly-easy (though undocumented / unsupported) method here:

    Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?

    Take care,
    Solomon..

    Now that is interesting.  When I have some time I will have to look at this as it has been issue at times where I work and seeing it in users questions here on SSC.  This could be a very good reason to be sure new installations of SQL Server use Windows collations instead of SQL collations.

    No matter how long I work with SQL Server the more I keep learning that I don't know.

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply