Blog Post

Data Types Matter

,

I was watching Michelle Ufford’s (blog, Twitter) presentation from the 2009 PASS Summit recently on DVD. I had purchased them, and this was one session I had attended, but had to leave early because of another commitment.

However I thought the session was interesting, I like Michelle’s writing, and this was the first one I popped in when I got my Summit DVDs. It talks about the overload on their database servers from the first GoDaddy Super Bowl commercial, which referred people to their website. The week after the Super Bowl, they started redesigning their database and infrastructure to handle the extreme loads they experienced.

One of the things that Michelle mentioned is that data types matter. I know lots of designers don’t think about this, and perhaps it isn’t important for most databases. After all, worrying about squeezing things down takes time, and you could easily “guess” wrong when you’re building a database. If you have a small system, and you use a tinyint only to find out that you needed a smallint, it’s a lot of work to change things sometimes. And it isn’t necessarily worth it.

However in a large database, things might be different. This is an image that Michelle used in her presentation.

datatype_sqlfool

Two tables, partially represented here. If you calculate the differences here, you’ll see that if you use the smallest data type possible, as in Table B, you end up with

(449-323) / 323 * 100 = about 40% more rows. That could translate into significant IO savings for this table in an index scan, and lots of reports use Index scans.

Is it worth limiting yourself in all systems? No, but you ought to take a few seconds when you’re building a database and if there are obvious places where you don’t need a larger data type, pick a smaller one.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating