Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Disk Is Cheap! ORLY?


Disk Is Cheap! ORLY?

Author
Message
Solomon Rutzky
Solomon Rutzky
Say Hey Kid
Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)

Group: General Forum Members
Points: 664 Visits: 2937
Todd M. Owens (4/27/2011)
...if a table only allows inserts in a version type approach, there is no reason to add columns that capture UpdateDate and/or UpdateUserID. If a table's use case(s) calls for a row to be updated, only then are the Update fields necessary.

And please think twice before setting the CreateDate and the UpdateDate to the same exact value on insert...If 90% or more of the columns are never updated, then there will be an extraneous 8 bytes on each row from the UpdateDate...


Hey there. I definitely agree with both of these and practice them myself.

There might be a circumstance, however, to have both CreateDate and UpdateDate be set upon initial INSERT (and hence UpdateDate will be the same value as CreateDate if the row is never updated). I have seen cases where an ETL process looks at the UpdateDate field to know if it should grab that row and will have the UpdateDate field indexed for that purpose. In this case UpdateDate serves dual-duty by indicating both new and updated rows without having to scan both CreateDate and UpdateDate fields. And to only have a single field (assuming someone might suggest not having a CreateDate field in this case), it is still helpful to see the CreateDate since that information will be gone once the row is updated if you only have the single UpdateDate field. But outside of this specific case I tend to agree that the UpdateDate field should be NULL.

Take care,
Solomon...

SQL# - http://www.SQLsharp.com/
Solomon Rutzky
Solomon Rutzky
Say Hey Kid
Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)

Group: General Forum Members
Points: 664 Visits: 2937
Todd M. Owens (4/27/2011)
And please think twice before setting the CreateDate and the UpdateDate to the same exact value on insert: this a clear violation of the limited redundancy principle. If 90% or more of the columns are never updated, then there will be an extraneous 8 bytes on each row from the UpdateDate; multiply that by the number of rows where it should be null and in every table where it is misused, then the numbers get quite large quite quickly.


Hey Todd. I did not notice until just a moment ago when I was working on something else that this statement is not entirely true. While I still do agree that using NULL for UpdateDate when a record has not yet been updated is a better practice, that is merely just a logical consideration (outside of the potential indexing issue I mentioned before). But a fixed-width column (numeric types, date/time types, and char/nchar/binary types) will always take up the same amount of space regardless if it is set to NULL or a value. Meaning, if UpdateDate is a DATETIME field, it will always take up 8 bytes even if it is NULL. And to make sure I did just test this out.

Take care,
Solomon...

SQL# - http://www.SQLsharp.com/
Todd M. Owens
Todd M. Owens
Valued Member
Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)

Group: General Forum Members
Points: 58 Visits: 471
But a fixed-width column (numeric types, date/time types, and char/nchar/binary types) will always take up the same amount of space regardless if it is set to NULL or a value. Meaning, if UpdateDate is a DATETIME field, it will always take up 8 bytes even if it is NULL. And to make sure I did just test this out.


Hmm I wonder if that is true for Oracle, DB2, Teradata, et.al....
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45185 Visits: 39925
Solomon,

I've recently had the need to refer to this article again and I thank you from the bottom of my heart. This "right sizing on steroids" article should be required reading for anyone and everyone who has or will ever write the words CREATE TABLE. My hat is off to you, good Sir. Well done!

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Solomon Rutzky
Solomon Rutzky
Say Hey Kid
Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)Say Hey Kid (664 reputation)

Group: General Forum Members
Points: 664 Visits: 2937
Jeff Moden (10/5/2012)
Solomon,

I've recently had the need to refer to this article again and I thank you from the bottom of my heart. This "right sizing on steroids" article should be required reading for anyone and everyone who has or will ever write the words CREATE TABLE. My hat is off to you, good Sir. Well done!


Hi Jeff and thank you very much for such positive feedback. I apologize for the delayed response but new baby came along less than 2 weeks before your comment and things have been, well, C-R-A-Z-Y!! Hehe Also, I had thought of a few minor additions to make and wanted to get them in and published before replying.

I have added:

  • A reference to decreased Page Life Expectancy in the paragraph about memory usage

  • Info about several other datatypes at the end of the main recommendations list, things like: SMALLMONEY, REAL, TIME, etc.

  • Info about some newer features that can help save space in addition to (not in place of) a good model. Namely: SPARSE columns and Row / Page compression



Hopefully the updates make for an even stronger article :-).

Take care,
Solomon...

SQL# - http://www.SQLsharp.com/
dhubbard
dhubbard
Grasshopper
Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)

Group: General Forum Members
Points: 10 Visits: 42
Excellent article. I think the network bandwidth issue is really important and often overlooked. It is mentioned in the article, but I'd add that a WAN circuit of even just moderate bandwidth can cost as much per month to operate as some disk drives cost to buy. And can easily cost significantly more. So if you are syncing to a DR site, getting the backup image over there, and as mentioned, log shipping or replication, can take a big byte out of it. Bandwidth is also not something you can quickly address if you run out of it! Disks are usually a two week turnaround to buy more and rack them up. WAN circuits are running 120 days or more!



Eric M Russell
Eric M Russell
SSCarpal Tunnel
SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)

Group: General Forum Members
Points: 4623 Visits: 9574
Excellent article. It's funny (in a bad way) that many people say "disk is cheap" until the day that tempdb or transaction log starts filling up and you actually put in a request for more disk storage. In a corporate enterprise envrionment, it's not as if the sysadmin can just drop what they're doing, run down to the local electronics store, pull a $200 drive off the shelf, and pop it in the server. You can blow through thousands of dollars worth of billable time and broken service level agreements just discussing the issue and waiting for it to happen.


"The universe is complicated and for the most part beyond your control, but your life is only as complicated as you choose it to be."
christopher reeve
christopher reeve
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 151
Great article and information (glad it was reposted on SQLServerCentral today)!

In regard to item #10 - while not denormalizing tables definitely applies to transaction systems (OLTP) - so order entry and other application transaction systems retain their integrity and update efficiencies - the presentation tables in a data warehouse (OLAP) are usually denormalized (as a star schema) to improve performance (and to a lesser extent understandability for the report creation users). This storage of redundant data in data warehouses is deemed worthy to reduce report query response time by limiting the depth of the table joins.

A properly designed star schema for the data warehouse presentation fact and dimension tables makes sure the large fact tables are using foreign keys (and the corresponding primary key in the dimension table) of appropriate size for both space saving and performance reasons and also espected growth as you specified in your article for transaction tables. Also numeric and string fields in both the fact (very important because fact tables get very large) and dimension tables would benefit from the space saving techniques you specified.

Thanks again,
Chris Reeve
Miles Neale
Miles Neale
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2672 Visits: 1694
Eric M Russell (10/25/2013)
Excellent article. It's funny ... You can blow through thousands of dollars worth of billable time and broken service level agreements just discussing the issue and waiting ...


We had an issue like this recently but it was a cert not disk. The time spent was more then the cert. The bloated cost of buying some things is amazing.

Appreciate your work here!

M.

Not all gray hairs are Dinosaurs!
Eric M Russell
Eric M Russell
SSCarpal Tunnel
SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)SSCarpal Tunnel (4.6K reputation)

Group: General Forum Members
Points: 4623 Visits: 9574
Miles Neale (10/25/2013)
Eric M Russell (10/25/2013)
Excellent article. It's funny ... You can blow through thousands of dollars worth of billable time and broken service level agreements just discussing the issue and waiting ...


We had an issue like this recently but it was a cert not disk. The time spent was more then the cert. The bloated cost of buying some things is amazing.

Appreciate your work here!

M.

What did you mean by a "cert"; some new hardware or software was install, but you had to wait for it to be certified?


"The universe is complicated and for the most part beyond your control, but your life is only as complicated as you choose it to be."
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search