Question about explicit casting varchar and nvarchar columns

  • Hi,

    I know that if I have an nvarchar column I can use an equality like = N'supersqlstring' so it doesn't implicit cast as a varchar, like if I were to do ='supersqlstring'. And then I'll be a big SQL hero and all my stored procedures will run before a millisecond can whisper.

    But if I'm comparing an nvarchar column to a varchar column, is it better to cast the varchar 'up' to an nvarchar or cast the nvarchar 'down' to a varchar?

    For instance:

    cast(a.varchar as nvarchar(100)) = an.nvarchar

    or

    cast(an.nvarchar as varchar(100)) = a.varchar

    Leaving aside non-matching, like (at least I don't think) that SQL considers the varchar n to be equal to the nvarchar ń, what's the best way to handle this?

    Pretend for a moment that each column contains a mixed letter and number ID with no accented or wiggly-squiggly Unicode characters; it's just designs clashing.

    Is there a performance hitch doing it one way or another? Should I use COLLATE? Should one of the columns be altered?

    This is a general question, and not a specific situation.

    I'm fine with ~it depends~ as an answer, but would appreciate any thoughts or anecdotes.

    Thanks

  • It depends 😀

    If your varchar columns' length could be over 4000, then cast as varchar.

    If not, I would consider casting them as nvarchar (this would be the implicit conversion).

    I would take this path to prevent data loss, but maybe someone else can have a different advice.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (2/26/2014)


    If your varchar columns' length could be over 4000

    Who would ever do something crazy like that? 😉

    Next thing I know you'll be making dodecahedron joins to a substring.

  • Maybe you're trying to compare two chapters of a book :hehe:

    I just tried to include all possibilities.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • SELECT *

    FROM bible

    WHERE verse LIKE '%begat%'

  • Just remember that any of the conversions can prevent the use of statistics which makes the possibilities of scans on the data much higher.

    Best practice is, like to like comparisons at all times. You should avoid having to compare disparate data types. Granted, it's not always possible, but that's the goal.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (2/27/2014)


    Just remember that any of the conversions can prevent the use of statistics which makes the possibilities of scans on the data much higher.

    Best practice is, like to like comparisons at all times. You should avoid having to compare disparate data types. Granted, it's not always possible, but that's the goal.

    So, depending on the situation, alter one of the columns (does alter table rebuild indexes and update statistics? does it need to?), or use a temporary means to store the column as a like datatype, but likely in ad hoc/one off query situations you're okay just casting the nvarchar column as a varchar?

  • sqldriver (2/27/2014)


    Grant Fritchey (2/27/2014)


    Just remember that any of the conversions can prevent the use of statistics which makes the possibilities of scans on the data much higher.

    Best practice is, like to like comparisons at all times. You should avoid having to compare disparate data types. Granted, it's not always possible, but that's the goal.

    So, depending on the situation, alter one of the columns (does alter table rebuild indexes and update statistics? does it need to?), or use a temporary means to store the column as a like datatype, but likely in ad hoc/one off query situations you're okay just casting the nvarchar column as a varchar?

    You have to define "okay". If you can't get to the goal, you'll have to deal with the consequences. If that means scans, it means scans. That's a limitation, if you want to call it that, within SQL Server that can't be denied. It may or may not cause issues, but if you can't get the necessary structural changes in place, that's the trade-off.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (2/27/2014)


    sqldriver (2/27/2014)


    Grant Fritchey (2/27/2014)


    Just remember that any of the conversions can prevent the use of statistics which makes the possibilities of scans on the data much higher.

    Best practice is, like to like comparisons at all times. You should avoid having to compare disparate data types. Granted, it's not always possible, but that's the goal.

    So, depending on the situation, alter one of the columns (does alter table rebuild indexes and update statistics? does it need to?), or use a temporary means to store the column as a like datatype, but likely in ad hoc/one off query situations you're okay just casting the nvarchar column as a varchar?

    You have to define "okay". If you can't get to the goal, you'll have to deal with the consequences. If that means scans, it means scans. That's a limitation, if you want to call it that, within SQL Server that can't be denied. It may or may not cause issues, but if you can't get the necessary structural changes in place, that's the trade-off.

    Yeah, I get you. There's no such thing as a free data conversion.

    I've just read so much about matching data types in query search terms/variable types and keeping things sargable, I got curious about comparing differing column data type best practices.

    I get tasked a lot with resolving issues between staging and live data, and often column data types don't match because the people creating the live tables are just running a script through a third party product that doesn't give a can of beans about data types, for the most part. So when I go to match on my nice PK identity integer column and get an error because the live table has XXXXXX_ANDY_TEST_ID in it I start straightening paper clips and staring at the ceiling.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply