char(??) vs varchar(??)

  • Hi all.

    I have a database with over 200 tables. They all use char(xx) instead of varchar(xx). I heard that varchar is better because it saves space.

    I tried to change a table from char(xx) to varchar(xx). It seems that it doubled the amount of space used. What gives?

    Does it really matter?

    Thanks,

    Mike

  • This was removed by the editor as SPAM

  • pls check below code...

    declare @var_chr char(50)='abc'

    declare @var_varchr varchar(50)='abc'

    select DATALENGTH(@var_chr) chr,DATALENGTH(@var_varchr) var

    O/P:chrvar

    503

  • Or you could construct a query against your table as follows to check total actual string lengths for both scenarios.

    Assume the column (Your_Column) in your table (YourTable) is VARCHAR(100):

    SELECT VARCHAR=SUM(2 + CAST(DATALENGTH(ISNULL(Your_Column,'')) AS BIGINT))

    ,CHAR=SUM(CAST(100 AS BIGINT))

    FROM YourTable

    There's probably also information on this in the sys tables but I don't recall the details.

    Edit: Note that this is only an estimate as I'm not 100% sure that NULL values take up the 2 byte length specifier.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Important to remember:

    When someone says to use a certain technique because it is better/faster/more economical, make it a point to understand WHY. Blind faith doesn't take you very far in SQL

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • There are a couple of factors that can make varchar take more space than char of the same length. First is very short strings in a small column. For example, storing "AA" in char(2) takes 2 bytes, but storing it in varchar(2) takes 4 bytes (because of the length data added to variable-length columns).

    The other is ANSI padding. It can do weird things to trailing spaces in varchar() columns. Check out this for details: http://msdn.microsoft.com/en-us/library/ms187403.aspx

    For fixed-length data, or for very small columns (up to about 10 characters), I use char() instead of varchar(). Unless the data will much more often be very short than very long. Partially, this is due to fixed-length columns not resulting in page splits when updated to longer values. Page splits often result in extra disk allocation being needed, even if only a small percentage of the pages are actually "full", plus they can result in performance issues in busy databases.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • mike 57299 (10/29/2012)


    I tried to change a table from char(xx) to varchar(xx). It seems that it doubled the amount of space used. What gives?

    Rebuild the clustered index, the table should drop back down (providing those char columns you changed weren't something like char(2))

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • CELKO (10/31/2012)


    or better:

    zip_code CHAR(5) NOT NULL

    CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]')

    As long as whatever that is is intended for US only, now and always. I just love finding web sites and forms that insist I put a 5-digit post code (mine is 4 digits).

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (10/31/2012)


    CELKO (10/31/2012)


    or better:

    zip_code CHAR(5) NOT NULL

    CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]')

    As long as whatever that is is intended for US only, now and always. I just love finding web sites and forms that insist I put a 5-digit post code (mine is 4 digits).

    Then you have UK post codes which are 5-8 characters long, depending on whether you include a space or not.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • For Zip codes (which are actually a USPS trademark), yeah, 5-digits is fine. Well, unless you need Zip+4. Of course, you can put the +4 in a separate, Char(4), numeric-only, nullable column, to enforce the rules on that.

    For international postal codes, you need to get a bit more flexible. Canadian, for example, have numbers and letters, in a particular pattern. Same for UK if I'm not mistaken. As Gail mentioned, SA uses a 4-digit code (all numbers if I'm not mistaken). Other countries, other rules.

    Nullable columns, one for each format, can be done, with enforcement of format in each column. ZipCode char(5) numeric-only, ZipPlus4 char(4), numeric-only, PostalCanadian char(6) with formatting enforced, and so on. But that's a little complex to manage.

    Could do a table per format (some formats are multinational), with limits on the ISO3 country code in each table to map that to relevant national laws, then Union with forced-null columns for columns that some countries have and others don't (like State/Province/Region) into a single view for querying.

    Could create a CLR datatype "MailableAddress" that could format, enforce, etc., based on rules built into the DLL.

    The right XML rules could also be built to enforce a full set of international address rules.

    Or you could use nvarchar(10) for a column called PostalCode, and manage the input/update rules in a higher software layer. Preferably the UI, where an error in Zip code format can be brought to the user's attention and they have a choice of fixing it or overriding the rule.

    Lots of ways to solve these things.

    (I've been working with address list data for over 10 years. It's a bit more complex than char(5) and a Like '[0-9][0-9][0-9][0-9][0-9]' constraint, unless you're going to completely disallow non-US addresses.)

    But Joe is right that you need to know what you're planning on storing and not just blanket varchar(50) or char(25) everything.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • True, UK postal codes are a nightmare to handle, thats why most people will try and push it up into the UI layer and rely on specialised Reg-ex validation procedure before they get to the Db layer, or use a Postal code tool that uses the Royal Mail PAF files to validate the addresses.

    Though the later is often out of date especially for new developments as its only updated every quarter.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • nice discussion 🙂

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply