Data Type and Length

  • Thanks very much! This is a question that falls into the category of "things I need to know and will probably run across", and the answer is straightforward and simple! OK, I got it wrong, but that's because I have plenty to learn.

    (The last couple of times I've commented, it was because I was frustrated with the question, so I thought I'd add some positive feedback as well :cool:)

  • This is a very good question. My initial choice was 3/38. It's a good thing I decided to run the code before answering. I learned something new 🙂 ...

  • Good question. Thank you!

  • I got it wrong too. But like many questions here, the answer depends on knowing the many inconsistencies and strange rules in SQL Server - most of which you never need to know.

    I had no idea what the default lengths of char and varchar are, for the simple reason that I've never declared one without specifying the length. IMHO the correct answer should have been 'Syntax error'!

    It would be nice to have a datatype like .NET's string which can carry strings of (almost) any length without worrying about what its contents' maximum length might happen to be. Fields in database tables may need to be specified precisely; local variables should not.
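    For anyone who hasn't run the code: a quick sketch of the two different defaults SQL Server applies when you omit the length, which is the gotcha behind this QotD. (The string literals here are just placeholders.)

    ```sql
    -- char/varchar without a length defaults to 1 in a declaration...
    DECLARE @v varchar;          -- effectively varchar(1)
    SET @v = 'Data Type and Length';
    SELECT @v;                   -- returns just 'D'

    -- ...but to 30 in CAST/CONVERT, so longer strings are truncated there.
    SELECT CAST('This string is longer than thirty characters' AS varchar);
    ```

    Two different defaults for the same omission is exactly why several posters suggest always specifying the length explicitly.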

  • David Data (8/10/2010)

    It would be nice to have a datatype like .NET's string which can carry strings of (almost) any length without worrying about what its contents' maximum length might happen to be. Fields in database tables may need to be specified precisely; local variables should not.

    Isn't that pretty much what VARCHAR(MAX) is?

  • Good question! Thanks.

  • Hugo Kornelis (8/9/2010)


    Good question! When I first saw it, I thought there would be a very high percentage of correct answers since the confusing default lengths for character strings have already been covered in a few recent QotD's. But the results say that at this time, only 53% of the respondents have given the right answer, so there is obviously still a lot of need to keep driving this point home. Good job, magasvs!

    Nah... it just shows that 47% of the people are too lazy to run the code which is also why these are only worth 1 point. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • paul.knibbs (8/11/2010)


    David Data (8/10/2010)

    It would be nice to have a datatype like .NET's string which can carry strings of (almost) any length without worrying about what its contents' maximum length might happen to be. Fields in database tables may need to be specified precisely; local variables should not.

    Isn't that pretty much what VARCHAR(MAX) is?

    There is sometimes a HUGE performance penalty (2:1) to be paid for using VARCHAR(MAX) depending, of course, on what you're doing. Correct sizing is always important. I'd dare say the same would be true for most any language... even .Net.

    --Jeff Moden


  • There is sometimes a HUGE performance penalty (2:1) to be paid for using VARCHAR(MAX) depending, of course, on what you're doing. Correct sizing is always important. I'd dare say the same would be true for most any language... even .Net.

    Well, the person I was responding to never mentioned anything about performance, he just wanted a text field that can be any length in T-SQL, and I was pointing out that already exists. 🙂

  • paul.knibbs (8/23/2010)


    There is sometimes a HUGE performance penalty (2:1) to be paid for using VARCHAR(MAX) depending, of course, on what you're doing. Correct sizing is always important. I'd dare say the same would be true for most any language... even .Net.

    Well, the person I was responding to never mentioned anything about performance, he just wanted a text field that can be any length in T-SQL, and I was pointing out that already exists. 🙂

    Heh... Well, everyone should always be concerned about performance so I was pointing out how bad it can actually be when you don't size stuff properly. 🙂

    --Jeff Moden



  • Great question, it chewed up an hour's worth of my time. 😀

  • (As someone has re-awakened this subject...)

    Jeff Moden (8/23/2010)


    There is sometimes a HUGE performance penalty (2:1) to be paid for using VARCHAR(MAX) depending, of course, on what you're doing. Correct sizing is always important. I'd dare say the same would be true for most any language... even .Net.

    I was talking about local scalar variables, not arrays or database fields where extra microseconds can get multiplied by millions of records.

    But I am curious as to why a VARCHAR(MAX) string that happens to contain, say, 100 chars is so much slower than a VARCHAR(100) that does. Its index and length values will have to be 32-bit numbers, but with 32/64-bit CPUs anything shorter tends to be less rather than more efficient anyway. As even a VARCHAR(100) is stored as a variable-length string, I would not imagine the memory management issues are much different either. Or are they?

    That said, I do always use the smallest VARCHAR I can - though I'm not sure whether it really makes much difference as SQL Server seems to allocate space for the actual content not the potential maximum content, as I tested by creating a large table of VARCHAR(4096) containing just 'x' (1 character) per record.

  • The problem with wide columns is that even though SQL Server now supports a larger row size, the maximum capacity specification is still 8,060 bytes per row. If the row becomes wider, row overflow occurs and the data is saved differently. This affects query performance (I/O performance degradation). More details are here: http://msdn.microsoft.com/en-us/library/ms186981.aspx. This works differently with the varchar(max), nvarchar(max), varbinary(max), text, image, and xml data types, but I just wanted to make the point that column widths should be reasonable.

    Another example of incorrect data type usage is using the NVARCHAR or FLOAT data types when they're not required. Just a couple of days ago I was testing a large database migration, and by changing NVARCHAR and DATETIME columns to VARCHAR and DATE/SMALLDATETIME (together with column width reduction), one of the tables shrank by 11 GB! There were columns like Year nvarchar(255), PostalCode nvarchar(255), Phone nvarchar(8000)...
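    If you want to check whether a table has actually pushed data into row-overflow pages, one way is to look at its allocation units (the table name `dbo.Wide` below is hypothetical):

    ```sql
    -- Rows wider than 8,060 bytes spill into ROW_OVERFLOW_DATA
    -- allocation units; nonzero pages there mean overflow occurred.
    SELECT au.type_desc, au.total_pages
    FROM sys.partitions AS p
    JOIN sys.allocation_units AS au
        ON au.container_id = p.partition_id
    WHERE p.[object_id] = OBJECT_ID('dbo.Wide');
    ```

    A `ROW_OVERFLOW_DATA` row with `total_pages > 0` confirms the extra I/O cost described above.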

  • magasvs (9/27/2010)


    There were columns like Year nvarchar (255), PostalCode nvarchar (255), Phone nvarchar (8000)...

    One of our developers stored a TCP port number (which is, by definition, an unsigned 16-bit integer) in a VARCHAR(50) field...sometimes I despair. 😉

  • OTOH I designed our ETL system with [Postcode] [nvarchar](10) and then had to change it when I found records with postcodes like 'If out please leave by back door or with neighbour at No. 127' 🙁

    magasvs (9/27/2010)


    The problem with wide columns is that even though SQL Server supports now larger row size the maximum capacity specification is still 8,060 bytes per row. If the row becomes wider - then row overflow occurs and data is saved differently.

    I understand the need to keep the actual data short, (especially for fields in the primary key, where I prefer to use INT or CHAR if possible), but does a record containing a couple of VARCHAR(8000)'s or even a VARCHAR(MAX) which happen to contain 50-char strings need row overflow, or does it only need 100 bytes (+overhead)? My tests suggest the latter.
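    The test described above can be reproduced along these lines (the table names are illustrative): two tables that differ only in declared width, loaded with identical 50-character strings, then compared with sp_spaceused.

    ```sql
    -- Same data, different declared maximums.
    CREATE TABLE dbo.Narrow (s varchar(100));
    CREATE TABLE dbo.Wide   (s varchar(8000));

    INSERT INTO dbo.Narrow (s)
    SELECT REPLICATE('x', 50) FROM sys.all_objects;  -- a few thousand rows
    INSERT INTO dbo.Wide (s)
    SELECT REPLICATE('x', 50) FROM sys.all_objects;

    EXEC sp_spaceused 'dbo.Narrow';
    EXEC sp_spaceused 'dbo.Wide';
    ```

    If the 'data' figures come out (near) identical, storage follows the actual content rather than the declared maximum, which is what the tests mentioned above suggest.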

