nvarchar(max) or nvarchar(1000)?

  • taigovinda (8/2/2013)


    Hi,

    I have a concatenated field that is usually about 80 characters but has sometimes been up to 600 characters, and it could conceivably be longer than that. How should I define that field in my table: nvarchar(max) or nvarchar(1000)? If I define it as 1000, will that make performance better or worse than max? And if I then try to insert something longer than 1000 characters into it, what will happen? Would the attempt simply error out?

    I have another table with the same situation except the field is usually about 10 characters and I've observed it as long as 9,033 characters. Is nvarchar(max) basically the only way to address that one?

    Thanks!

    Tai

  • You asked:

    If I define it as 1000 will that make performance better or worse than max?

    The performance of what: inserts, page reads, parsing column values? As for page reads, if most of your queries don't include this multi-valued column, then it might make sense to keep the column off-row. Also, as mentioned earlier, varchar will suffice if you don't need Unicode support.
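
    As for the overflow question: with the default ANSI_WARNINGS setting, SQL Server rejects an insert or update whose value exceeds the declared length; it raises a truncation error rather than silently truncating. A minimal sketch, assuming a hypothetical table dbo.Demo:

    -- Hypothetical table for the overflow question.
    CREATE TABLE dbo.Demo (Val nvarchar(1000) NOT NULL);

    -- 1,001 characters into nvarchar(1000): fails with a truncation
    -- error ("String or binary data would be truncated") instead of
    -- silently shortening the value.
    INSERT INTO dbo.Demo (Val) VALUES (REPLICATE(N'x', 1001));

    -- Side note on varchar vs nvarchar: nvarchar stores 2 bytes per
    -- character, varchar 1 (in single-byte collations), so the same
    -- text takes twice the space.
    SELECT DATALENGTH(N'abc') AS nvarchar_bytes,  -- 6
           DATALENGTH('abc')  AS varchar_bytes;   -- 3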

    If a row's value-set can be updated after insertion, then I'd definitely recommend keeping it in a separate table, which is the proper normalized way to handle this. If you increase the length of a varchar/nvarchar value in a row, that causes all sorts of problems, like page splits and index fragmentation, which can mean more I/O for both writes and reads.

    Personally, the only time I use multi-valued columns is for integer-based bit flags, which are compact (1 bit per value, 64 bits per bigint). The column can be "parsed" using bitwise operators, individual flags can be made indexable, and values can be updated without resizing the row.
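
    A minimal sketch of that approach (table name and flag values are hypothetical):

    -- One bigint packs up to 64 boolean flags.
    CREATE TABLE dbo.UserFlags
    (
        UserId int    NOT NULL PRIMARY KEY,
        Flags  bigint NOT NULL DEFAULT 0
    );

    -- Set bit 2 (value 4) without changing the row's size.
    UPDATE dbo.UserFlags SET Flags = Flags | 4 WHERE UserId = 1;

    -- "Parse" the column with a bitwise AND.
    SELECT UserId FROM dbo.UserFlags WHERE Flags & 4 = 4;

    -- One way to make a single flag indexable: expose it as a
    -- deterministic computed column and index that.
    ALTER TABLE dbo.UserFlags ADD IsFlag4 AS CONVERT(bit, Flags & 4);
    CREATE INDEX IX_UserFlags_IsFlag4 ON dbo.UserFlags (IsFlag4);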

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Most of the queries will include the field in question, but none will join on it. It will almost always be 80 or so characters but has been quite a bit more... I think 1000 would cover it. I'm concerned about the speed of querying the database, not so much about writes and inserts. My guess is it won't make much difference one way or the other, but I was curious what best practice is, which is why I started the thread instead of just going with max length. For now I have switched from nvarchar(max) to varchar(1000).

    Thanks.

    Tai

  • I would not recommend using nvarchar(1000) when you only need nvarchar(10); right-sizing the column helps keep bad data out of the database.

    As for nvarchar(max), it's a bad idea to use it at all unless you expect more than 3000 characters. Just look at how indexing works with varchar(max) and you will see why.
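
    For reference, the limitation is easy to demonstrate (names hypothetical): a MAX column can't be an index key column, though it is allowed as an INCLUDEd column.

    -- MAX types are rejected as index key columns.
    CREATE TABLE dbo.IndexDemo (Id int NOT NULL, Notes nvarchar(max));

    CREATE INDEX IX_IndexDemo_Notes ON dbo.IndexDemo (Notes);
    -- Fails: Msg 1919, "...invalid for use as a key column in an index."

    CREATE INDEX IX_IndexDemo_Id ON dbo.IndexDemo (Id) INCLUDE (Notes);
    -- Works: MAX columns may appear as INCLUDEd (nonkey) columns.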

  • CharlesWagner wrote:

    As for nvarchar(max), it's a bad idea to use it at all unless you expect more than 3000 characters. Just look at how indexing works with varchar(max) and you will see why.

    As I found out 6 years after my previous post on this thread, I'll have to say "It Depends".  Having a bunch of non-indexable gobbledygook resident in the Clustered Index (CI) makes unavoidable scans, and even coveted seeks and range scans, a whole lot slower on the CI.  Changing such columns to a MAX datatype "CAN" help a whole lot without hurting the queries that need to return the gook... but only if you do it right.  You have to set the table option that forces MAX datatypes out-of-row for it to work as stated, and if you have existing rows, you then need to do an in-place update to move them out-of-row after you've set the option.

    It'll also seriously help avoid "ExpAnsive" updates (a major cause of page splits and the resulting massive logical and physical fragmentation on the CI), especially if you default such columns to a single space so that the LOB pointer materializes in the CI.
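
    A minimal sketch of those steps (table and column names are hypothetical):

    -- 1) Force MAX datatypes out-of-row for this table.
    EXEC sp_tableoption N'dbo.MyTable', 'large value types out of row', 1;

    -- 2) The option only affects values written afterwards, so do an
    --    in-place update to move existing values out-of-row.
    UPDATE dbo.MyTable SET BigCol = BigCol WHERE BigCol IS NOT NULL;

    -- 3) Default the column to a single space so the LOB pointer
    --    materializes in the CI for new rows.
    ALTER TABLE dbo.MyTable
        ADD CONSTRAINT DF_MyTable_BigCol DEFAULT (N' ') FOR BigCol;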

    A side benefit is that index rebuilds will also occur MUCH faster.

    I totally agree with you on the "Right Sizing" of columns, though.  In fact, people may want to consider using CHAR or NCHAR instead of the variable-width equivalents for narrow columns that suffer frequent "ExpAnsive" updates, to help avoid page splits and all that goes with them.  Of course, if there are no updates, then VARCHAR or NVARCHAR is fine and will save space, up to a point.
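
    A quick illustration of that trade-off (hypothetical table):

    -- char(10) always occupies 10 bytes in-row, so updates never
    -- change the row's size; varchar(10) grows with the value and
    -- can cause page splits when an update lengthens it.
    CREATE TABLE dbo.StatusDemo
    (
        Id        int         NOT NULL PRIMARY KEY,
        StatusFxd char(10)    NOT NULL,  -- fixed width, update-friendly
        StatusVar varchar(10) NOT NULL   -- variable width, saves space
    );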

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
