We have a guest editorial today as Steve is away on sabbatical.
When I started out in the world of SQL Server, disk space was less of a concern than raw IO, and the only way to get the raw IO was to increase the spindle count, which in turn led to a certain amount of ‘wasted’ space. That extra space helped the DBAs of the world get a reputation as space wasters – the IO side of the conversation never seemed to get heard.
Now most of us work with databases on a SAN and rarely have any control over the provisioning. Conversations about IO rates are more common than they used to be (though not common enough), but the conversation is still mostly about how much space we need. It feels like the pendulum has swung all over the map, and we’ve wound up in a place where we – the DBAs – are responsible for using up the world’s most precious commodity!
For the SAN admins reading this today, I’d mention this – DBAs don’t decide on the space requirements. Those are driven by users and applications, retention policies, SLAs for recovery times, and other operational requirements. DBAs are almost always fighting the good fight to normalize (which saves space), to archive (to slower and cheaper storage), and to purge (which saves space and can help maintain performance levels). We review designs and recommend the smallest data types, sparse columns, and every other trick we know to make the application and database work long term. We compress tables when we can, and backups almost always. Sometimes we’re listened to and sometimes not.
The running joke at the office is that we always need a terabyte added, and then we bargain down to the minimum we can get by with. Do we always need a terabyte? No, what we need is slack. Sometimes we need to sit on a backup for an extra couple of days, or keep a spare restored copy around for coding or testing or experimenting. It makes us a little crazy to have to spend time asking for 100 gigabytes! We (the royal DBA we) get that enterprise storage is considerably more expensive than consumer storage, but it’s not that expensive.
We’re living in the age of thin provisioning (at both the VM and the SAN level) and data deduplication, all done easily and mostly efficiently. Do we really need to argue about a terabyte? Or is this a conspiracy, where the SAN team gets a bonus for minimizing the space presented for use? What do they do with all that unused space? Could it be that the SAN team has a hoarding problem?