varchar versus nvarchar

  • I just discovered a problem in the database design for an application that was developed years ago. There is an audit table with OldValue and NewValue columns defined as nvarchar(255). However, in the mean time, new tables have been defined with columns longer than nvarchar(255).

    I want to propose that the audit table use a much larger value.

    However, nvarchar takes up twice as much space as varchar, AND not too long ago, there was a forum post about trailing blanks in nvarchar, and how that can be a problem in comparisons, etc. unless you use RTRIM().

    So I was wondering what the general consensus is on using varchar versus nvarchar, especially for longer strings. Is it considered best practice to always use nvarchar, or are there some other considerations?

    Thanks.

  • It all depends on whether you think that there is any chance that that column will need to store Unicode characters that are not covered by Varchar().

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I tend to default to using NVARCHAR, unless i can absolutely be certain that there will not be any UniCode data in there. I find the increase in space is easier to deal with than potentialy needing to chnage the data types in the future.

  • Carla Wilson (12/2/2008)


    So I was wondering what the general consensus is on using varchar versus nvarchar, especially for longer strings. Is it considered best practice to always use nvarchar, or are there some other considerations?

    Best practice is to always use the appropriate data type. so if you're going to be storing unicode, make the column nvarchar. If you're not storing unicode (just ascii strings) then make the column varchar.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Practically speaking, I expect only "American-keyboard" ascii characters, but I suspect the original designer wanted to cover international users.

    But I can think of a number of other columns in this database that would never need unicode.

  • Whether or not to use varchar or nvarchar really depends on the type of data you're storing. If you are not now or ever going to store different character sets for foreign language use, you can use varchar. However, if you need to store unicode, you need to use nvarchar.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • On the specific question of audit tables. If your requirement is to capture actual data then you should create audit table columns that match the types of the data being audited. If you have more than one type then use multiple columns. Casting every column to a single string type (whether VARCHAR or NVARCHAR) is poor practice in my opinion.

    In SQL Server 2008 use Change Data Capture rather than roll-your-own audit.

  • RE: audit table - it's not like a standard audit table. I have several of those, and the columns do match the datatypes of the tables that they audit. I was going to put the word in double-quotes when I first posted; this table is used in an application to keep track of change requests from users, that may or may not be approved. The application has been around a LONG time, and I'm not even going to propose messing with it. However I do need to make sure it keeps working.

    Thanks to everyone for your input. It was interesting to see that the use of nvarchar is not as common as I was expecting to see.

  • Carla Wilson (12/3/2008)


    RE: audit table - it's not like a standard audit table. I have several of those, and the columns do match the datatypes of the tables that they audit. I was going to put the word in double-quotes when I first posted; this table is used in an application to keep track of change requests from users, that may or may not be approved. The application has been around a LONG time, and I'm not even going to propose messing with it. However I do need to make sure it keeps working.

    Thanks to everyone for your input. It was interesting to see that the use of nvarchar is not as common as I was expecting to see.

    It is probably not used as much in America, but here in Europe with all the different languages etc.. , it is pretty much needed for any column that will hold user input string data.

  • Note that even in America, extended ASCII is very common. Just think of all of peoples names that have accents in them. Or the word resumé. Let alone, there is this province in Canada that has French as the standard language.

    So if you are designing a new system, use NCHAR and NVARCHAR. Trust me and you'll thank me later.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • declare @test-2 VARCHAR(20)

    SET @test-2 = 'resumé'

    select @test-2

    There's a difference between extended ASCII and unicode.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/4/2008)[hr

    There's a difference between extended ASCII and unicode.

    True, but the support of extended (8-bit) ASCII is also governed by the collation of the database and the calling application, among other things.

    Best bet is to to be global and use Unicode. If the application is .Net, it is a no-brainer as .Net is Unicode.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • JohnG (12/4/2008)


    Note that even in America, extended ASCII is very common. Just think of all of peoples names that have accents in them. Or the word resumé. Let alone, there is this province in Canada that has French as the standard language.

    So if you are designing a new system, use NCHAR and NVARCHAR. Trust me and you'll thank me later.

    I'm from Quebec... and varchar is more than enough to cover all our words!!!

  • Ninja's_RGR'us (12/4/2008)


    JohnG (12/4/2008)


    Note that even in America, extended ASCII is very common. Just think of all of peoples names that have accents in them. Or the word resumé. Let alone, there is this province in Canada that has French as the standard language.

    So if you are designing a new system, use NCHAR and NVARCHAR. Trust me and you'll thank me later.

    I'm from Quebec... and varchar is more than enough to cover all our words!!!

    Varchar will handle any Latin/Romance language easily so I can't see any French words causing an issue, but you must have names of people/business that are from different aplhabets.

  • I'll qualify my responses as follows:

    If you are designing and building a system that will be used within a single language culture then by all means choose the proper database collation and use VARCHAR. If, however, your system will be used by different cultures, then you should use NVARCHAR (Unicode).

    If your system is a web-application (e.g., a SaaS offering) it should be Unicode. Also note that a lot of companies are global. Those that are not, may be in the future. If not from expansion they could be as a result of a merger or acquisition (M&A).

    My company has a SaaS offering that is used globally. Some of the customers are multi-lingual -- i.e., they have global offices. Consequently, the data placed into the system is from all languages and character sets.

    BOL links supporting the above:

    http://technet.microsoft.com/en-us/library/ms187828(SQL.90).aspx Note the reference to (North America) Mexico, Quebec, Canada and US.

    http://technet.microsoft.com/en-us/library/ms188325(SQL.90).aspx

    Edit: Added BOL links


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply