VARCHAR(1000) vs. VARCHAR(MAX)

  • Someone had asked me what I thought was a simple question, "If you're going to use varchar(1000) why not use varchar(max)?"

    My immediate response was that you should size your datatype to match your business need.

    Which is fine and good; but what is the technical reason?

    Does SQL allocate the space?

    Does it hinder performance?

    Any particulars?

     

    Thanks

    From my view this was one of the not-so-good features of SQL Server 2005, as it allows for "woolly" data definitions.

    My preference would always be to allocate the size you need; I was brought up on strong data typing. I understand some of the reasons this datatype was introduced, but for a varchar(1000) I'd always keep it at 1000.

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • I have a similar question; perhaps you all can help.  I have a column declared varchar(7000) in SQL 2000, but the majority of records use only about 200 characters.  When converting to SQL Server 2005, should we change it to varchar(max)?

    Thanks.

  • There are a lot of good reasons for declaring fields to match the business need.  Allowing fields to store 2^31-1 bytes just to avoid worrying about overflow is lazy.  Are you sure every application that reads the data can handle huge values?  I doubt that changing varchar(1000) to varchar(max) would be a good idea; with varchar(7000) the answer is "it depends".

    For regular varchar fields (not text or varchar(max)), SQL Server only allocates storage for the actual data, so changing the declared column size won't make the table grow or shrink.  When processing queries, however, the server must allocate memory buffers for the largest possible row, so there is a performance impact from using large declared sizes "just in case".

    The large value types (varchar(max), nvarchar(max), varbinary(max)) are improvements over the text, ntext, and image datatypes.  Using them in place of regular varchar fields would not necessarily be an improvement.  If most of the values are only 200 bytes, the data will almost always be stored in-row and you won't notice any difference with varchar(max).  If there are a lot of large values, and the field is used in most queries, varchar(max) will probably have a negative impact because of the extra I/O to get the off-page data.  If there are a lot of large values but the field is used infrequently, and the remaining fields make a relatively small row, then using varchar(max) would probably be an improvement because most queries would use less I/O, retrieving fewer pages with more rows per page.

    Your mileage may vary; this is one of the reasons beta versions have been around for a while.  You should be busy testing alternatives to find out how your data and applications will be affected.
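    The variable-length storage point above can be sketched quickly (table and column names are invented for the illustration); DATALENGTH reports the bytes actually stored, which is the same for both columns despite the very different declared sizes:

```sql
-- Hypothetical demo table: storage is allocated for the actual data,
-- not for the declared maximum size.
CREATE TABLE dbo.SizeDemo
(
    Id       int IDENTITY(1,1) PRIMARY KEY,
    ShortCol varchar(1000),
    HugeCol  varchar(max)
);

INSERT INTO dbo.SizeDemo (ShortCol, HugeCol)
VALUES ('hello', 'hello');

-- Both columns store 5 bytes; the declared size influences the memory
-- the server sets aside for query processing, not the space used on disk.
SELECT DATALENGTH(ShortCol) AS ShortBytes,
       DATALENGTH(HugeCol)  AS HugeBytes
FROM   dbo.SizeDemo;
```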

  • Thank you so much for your detailed explanation.

  • Thanks all.

    I knew I came to the right place.

     

  • Scott Coleman explained it very clearly.

    May I add that you cannot create an index on a varchar(max) column.

    Johan

    Learn to play, play to learn !

    Don't drive faster than your guardian angel can fly ...
    but keeping both feet on the ground won't get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Varchar(max) is the replacement for text (text will probably be removed in the next version of SQL Server).  I would not recommend it as a replacement for large varchar fields.

    varchar(max) is meant to replace text definitions because it has benefits such as supporting all the string functions that work on varchar (and not on text).  With varchar(max), values that still fit on an 8 KB page are also stored in-row by default.

    Do note that you CAN include varchar(max) fields in indexes (as included columns), but you cannot create an index on a varchar(max) field itself.
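
    A minimal sketch of that distinction, with invented object names: the varchar(max) column is accepted as an included column but rejected as an index key:

```sql
CREATE TABLE dbo.Docs
(
    DocId int PRIMARY KEY,
    Title varchar(200),
    Body  varchar(max)
);

-- Works: Body is carried at the leaf level as an included column.
CREATE INDEX IX_Docs_Title ON dbo.Docs (Title) INCLUDE (Body);

-- Fails: varchar(max) is invalid as a key column in an index.
-- CREATE INDEX IX_Docs_Body ON dbo.Docs (Body);
```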

  • ...Do note that you CAN include varchar(max) fields in indexes but you cannot create an index on a varchar(max) field itself....

    Indeed, by including the varchar(max) column you would aim for an index-only access path.

    The key part of the index cannot contain varchar(max) columns.

    Keep in mind a varchar(max) can hold up to 2 GB of data. Having that in your index ...

    Johan


  • I'm quite happy for the use of varchar(max) and including them within index defs - these features will help keep me in employment < grin >


  • Just as with every feature, you have to carefully consider where it can be used.  I wouldn't recommend putting a 2 GB varchar(max) in an index either 🙂  Although I'd love to try it some time on a test machine.

    The golden rule of databases: it all depends 🙂

  • Does SQL Server actually reserve the space on the disk for the field if it is varchar(8000) and only 100 characters are used?

    If the space is not reserved, then later updating a field from 100 characters to 8000 characters would mean the data is physically stored on a different part of the hard disk, and data retrieval would be affected. This would also imply that the index would have to grow to point to the location of the page holding this additional data.

  • Colin Leversuch-Roberts (7/27/2006)


    I'm quite happy for the use of varchar(max) and including them within index defs - these features will help keep me in employment < grin >

    Colin, count me in as well. 🙂

    It helps me write queries effectively on large text data.

    Working with text and ntext was nothing but a nightmare :crazy:

  • Does SQL Server actually reserve the space on the disk for the field if it is varchar(8000) and only 100 characters are used?

    Yes... sort of. It doesn't actually hold open a column for 8000 characters, but there is overhead with every varXXXX datatype you use. The more characters you declare, the more potential overhead there is just to keep that "reservation" open.

    Think of it this way. A data page is 8 KB (8192 bytes), of which about 8096 bytes are available for data (the rest is the page header). Every datatype has a size that goes with it: ints use 4 bytes, money uses 8, smallmoney 4, floats use 4 or 8 bytes, and decimals anywhere between 5 and 17 bytes. And a varXXXX column uses an extra 2 bytes (minimum) for its overhead.

    Theoretically, you should never have a table so "wide" that you can't fit every row on a single page. The way to determine this is to add up your column datatypes (using the biggest size each could be) and see whether the total stays under the in-row limit of roughly 8060 bytes. If it's over, rows will be split between pages. Since SQL Server then has to add pointers to connect the pages, that's more overhead, more space used by the database, and more processing power / memory required to run your queries against the database.

    Now, for the bad news. Varchar(max), NVarchar(max), and VarBinary(max) values that are too large to fit in the row do NOT get stored on the data pages with the rest of your data; they are pushed to a different type of page. So every time you store a big value in one of the "max" datatypes, you're already splitting your data across pages. And again, you'll need more overhead for all the pointers, more space for the database files, etc.

    This behavior is relatively unnoticeable on a well-designed database, or even on a small one, but the bigger the database gets (or the worse the design), the more you start to notice the impact. Queries will be sluggish, CPUs will show bottlenecks where you haven't seen them before... There are lots of things that can go wrong. And if you get a page corruption... YIKES! You've just lost part of your record!

    My advice is to use the minimum size datatypes you can get away with. Plan for scalability, but don't overplan. And stay away from the "max" types unless you're converting from text and image.
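
    As a side note, if off-row storage is actually what you want for the "max" types, SQL Server 2005 lets you force it per table (the table name here is hypothetical); by default, values that fit in the row are stored in-row:

```sql
-- Force all large value types (varchar(max), nvarchar(max), varbinary(max))
-- in this table to be stored off the data pages, even when they would fit in-row.
EXEC sp_tableoption 'dbo.Docs', 'large value types out of row', 1;
```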

    Brandie Tarvin, MCITP Database Administrator
    LiveJournal Blog: http://brandietarvin.livejournal.com/
    On LinkedIn!, Google+, and Twitter.
    Freelance Writer: Shadowrun
    Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • 🙂 Thanks Brandie, this answers my question.
