Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««23456»»

Varchar or Char? Expand / Collapse
Author
Message
Posted Friday, October 16, 2009 12:38 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, August 28, 2014 8:53 PM
Points: 1,388, Visits: 3,039
Wow. Amazing amount of information in this discussion, folks. I thank you all, especially those who pointed out alternate takes on the question and explanation.

Yes, when formulating the question, I was thinking only of disk space used when I said "efficient", so please accept my apologies for that bit of ambiguity. I do thank Hugo for stepping in with his disection of the various issues raised -- I've learned a lot from that discussion.

Also, as Steve alludes to in his post a couple ahead of this one, generally I've found that the QOD is focused on an idealized case to emphasize a particular point. The point I was thinking of here is that sometimes what appears to be an obvious choice may not be what was intended. In this case, if you really wanted to save every possible byte of storage, then char(10) would be better than varchar(10). Yes, in the 'real world' most of the time varchar(10) may be just as good in a practical sense, and would provide a bit more flexibity for possible future requirements changes, but this is QOD-land, not the real world.

One more thing: I wouldn't have guessed that the 57% of respondants who chose varchar(10) were only interpreting "efficient" as having a broader meaning that included CPU usage and so on, but that most of them didn't work out the actual disk space usage to see that varchar(10) requires an average of a half byte more than char(10).



Post #804446
Posted Friday, October 16, 2009 12:48 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, August 28, 2014 8:53 PM
Points: 1,388, Visits: 3,039
Tom Garth (10/16/2009)
.....I get the feeling that John derived this question from a real world application rather than just for kicks.
Actually, yes. I've been going through a database structure provided by a vendor that sells their software globally. We use it only in the USA with English, and we've found that we would be able to cut the size of our databases by a substantial amount by changing many of the nchar() columns to varchar(). There are a few of those columns that vary in length, but only slightly, and use most of the vendor-allocated column length, so they will be changed from nchar() to char(), and not varchar().
Post #804454
Posted Friday, October 16, 2009 1:41 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, November 19, 2014 12:30 PM
Points: 554, Visits: 672
john.arnott (10/16/2009)
Tom Garth (10/16/2009)
.....I get the feeling that John derived this question from a real world application rather than just for kicks.
Actually, yes. I've been going through a database structure provided by a vendor that sells their software globally. We use it only in the USA with English, and we've found that we would be able to cut the size of our databases by a substantial amount by changing many of the nchar() columns to varchar(). There are a few of those columns that vary in length, but only slightly, and use most of the vendor-allocated column length, so they will be changed from nchar() to char(), and not varchar().


This is largely off topic but I found this post interesting.

Just out of curiosity, do you need to cut the size of your database substantially or does it just offend you that so much space is theoretically wasted? I can't help but wonder if the cost of changing the data types can be recovered by the savings gleaned from shrinking your databases? (Since the cost of storage is significantly less than a dollar a GB you'll need some pretty serious space savings to make it worth your while.) While it's true you might be able to fit a few more rows on each page, quantifying performance gains gleaned from shrinking the database is a little harder to do but is almost certainly measured in microseconds and the loss of scalability into foreign markets is a fairly significant trade off.

I don't know whether you've done this or not but storing your data in unicode (as English) and then displaying it in your apps using another language is really pretty easy but if your data isn't in unicode you're seriously screwed should that requirement suddenly surface. The world is flat and unicode is a fairly insignificant cost of doing business. Clearly, your vendor understands this.


"Beliefs" get in the way of learning.
Post #804497
Posted Friday, October 16, 2009 2:03 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, August 28, 2014 8:53 PM
Points: 1,388, Visits: 3,039
Yes, hard drives now cost in the range of a dollar a GB, but the line of business I support has to pay the enterprise infrastructure division quite a bit more than than on an ongoing basis as a pro-rated share of support services. I only wish I could say "go down to the nearest Best-Buy and pick up a couple of 2TB USB drives". Nope. In this (and I suppose most large corporations), managed storage is priced to the user based on the full estimated cost of ownership.
Post #804521
Posted Friday, October 16, 2009 2:55 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, August 5, 2011 12:53 AM
Points: 8, Visits: 23
See I got it wrong for two reasons:

I was a knuckle head and missed the "English only" so yes varchar or char. The choice between the two though, to me, is subjective. Yes VARCHAR's do require an additional 2 bytes as length identifiers (remember VSAM? lol). I question is "Their length will vary from seven to ten characters" ... "with an even distribution of lengths" "Since the average size of the data is 8.5". We can't store data in half bytes (unless you are working with assembler. remember 3270 ASM?). In fact we can't store data in perfect little bytes. There are 32 or 64 bit WORD boundaries. For me 8, 12, or 16 would have been more efficient, but that's just the hardware nut in me.
Post #804566
Posted Friday, October 16, 2009 3:09 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:21 AM
Points: 6,098, Visits: 8,367
hawaiianrebel (10/16/2009)
See I got it wrong for two reasons:

I was a knuckle head and missed the "English only" so yes varchar or char. The choice between the two though, to me, is subjective. Yes VARCHAR's do require an additional 2 bytes as length identifiers (remember VSAM? lol). I question is "Their length will vary from seven to ten characters" ... "with an even distribution of lengths" "Since the average size of the data is 8.5". We can't store data in half bytes (unless you are working with assembler. remember 3270 ASM?). In fact we can't store data in perfect little bytes. There are 32 or 64 bit WORD boundaries. For me 8, 12, or 16 would have been more efficient, but that's just the hardware nut in me.


No, we can't store data in half bytes. But when we multiple units of data, and not all data uses the same number of bytes, we can get fractional bytes for the average unit. The simples example would be two rows, one with 10 bytes and the other with 11 bytes. That means 21 bytes for 2 rows, or an average of 10.5 bytes per row.

The 32 or 64 bit word boundaries are completely irrelevant in SQL Server. If a row takes 1 byte less, 1 byte less is used. No 16-bit, 32-bit, or 64-bit aligning is done.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #804583
Posted Friday, October 16, 2009 4:15 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, January 24, 2013 9:59 PM
Points: 1,354, Visits: 1,299
john.arnott (10/16/2009)
Yes, hard drives now cost in the range of a dollar a GB, but the line of business I support has to pay the enterprise infrastructure division quite a bit more than than on an ongoing basis as a pro-rated share of support services. I only wish I could say "go down to the nearest Best-Buy and pick up a couple of 2TB USB drives". Nope. In this (and I suppose most large corporations), managed storage is priced to the user based on the full estimated cost of ownership.


The hard drive storage is quite expensive when you're looking at Enterprise systems. We use SANS drives with fiber channel drives with redundant drives. Also, they are backed up on tape and sent off-site. The cost of these systems are extremely expensive. When we ask for space, they generally give us 50 Gigs max. If we want more, it requires an act of congress. Every byte makes a difference.

The log files that are generated by MSSQL is more annoying than the space used by the database itself. Those need to be backed up and then a shrink performed to recover that space. This needs to be done several times throughout the day since we have massive data loads and that increases the log files significantly.
Post #804607
Posted Monday, October 19, 2009 6:40 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, June 3, 2014 7:31 AM
Points: 155, Visits: 386
How efficient will char be if you have throw an rtrim() in the where clause?
Post #804990
Posted Monday, October 19, 2009 7:14 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Friday, February 4, 2011 7:20 AM
Points: 977, Visits: 1,499
Fatal Exception Error (10/19/2009)
How efficient will char be if you have throw an rtrim() in the where clause?


Not as efficient, but why would you want to use RTRIM in a WHERE clause? Even if you are restricting by length, SQL Server won't be counting the built in padding.


Tom Garth
Vertical Solutions

"There are three kinds of men. The one that learns by reading. The few who learn by observation. The rest of them have to pee on the electric fence for themselves." -- Will Rogers
Post #805007
Posted Wednesday, October 21, 2009 9:22 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, May 27, 2011 1:29 AM
Points: 54, Visits: 141
Yeah thought so...I got it wrong because like a lot of you i chose varchar

Its variable length data so and will need trimming before it can be used if you use char. The question "which is more efficient..." extends beyond the size of the type. without a bit more background on usage i find the question ambiguous

If you said to me a billion rows worked with one at a time, no question, storage is an issue and single rows won't take much trimming so char would be best. But if you said 10,000 rows used widely in joins for many operations then storage is less of an issue and repeatedly trimming spaces is more of an issue.
Post #806592
« Prev Topic | Next Topic »

Add to briefcase «««23456»»

Permissions Expand / Collapse