Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Best way to implement partitioning Expand / Collapse
Author
Message
Posted Monday, July 9, 2012 2:01 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 1:50 PM
Points: 2,691, Visits: 3,374
Gullimeel (7/9/2012)
I completely disagree. If I know I want rows where the createdDate was between '2012-05-01 13:32:00.000' AND '2012-05-01 13:35:00.000' why on earth would I want to convert those to integers first? I don't believe there is any evidence that it would be faster to scan or seek based on an integer or bigint versus a datetime. Logically, it makes no sense to me and theoretically it makes no sense to me. Based on the way that SQL Server stores this information, I would prefer not to do the calculation first.


I am not trying to say that int is faster than date or like that. It is simply to save the space both on table and any index. This doesnt matter on small table but when you are talking about the 500GB tale then 4 extra bytes in a table also makes diff to the size of the table as well as index..

There will be just two calculations if at all needed. based on start and end..
Well, I see where your thought process is, but I disagree. I prefer to store my data as useful data. If I want to use that column, I have to convert back to datetime which is a waste of resources to me. It is also more work for my developers and my system. I suppose if space is a large issue, you may have no choice. However, I would certainly not jump to that without knowing that there was a severe constraint on space. Especially with the various forms of compression available.


Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1327119
Posted Monday, July 9, 2012 2:03 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Sunday, June 23, 2013 10:39 PM
Points: 231, Visits: 506
Yes if he needs the column's data as datetime.Then i wouldnt suggest it.As I mentioned earlier.It purely depend on your workload against these tables like type of queries and frequency of these queries..

GulliMeel

Finding top n Worst Performing queries
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw
Post #1327123
Posted Monday, July 9, 2012 2:07 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 1,967, Visits: 2,905
OmegaZero (7/6/2012)
I'm not 100% sure of how to handle the AccountSales table though - since if I did it by id the dates would not match up correctly if joined with the Account table, however I don't see a way to fix this.

Any ideas? What is the best way to handle this? If any more information is needed please let me know. Thanks if advance!



I know why the date was left out of the Sales table -- it can be derived.

But in this case I think denormalization is a good idea. I, too, would store the date/datetime in the Sales table, even though it's extra bytes, just to allowed it to be clustered upon.

I can also see the advantage of using minutes diff rather than datetime, as suggested, since it's shorter (4 bytes vs 8). On such a large table, it could indeed make a genuine size difference.

However, for ease of use, I would probably just use the actual datetime. Incorrect results from wrongly-coded queries could more than offset the gain of a few bytes per row.

Whether the clustered keys should be a composite of date and AccountId is a much more complex q, and answering properly would require analysis of daily loads and expected query patterns and types.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1327125
Posted Monday, July 9, 2012 2:12 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 1:50 PM
Points: 2,691, Visits: 3,374
ScottPletcher (7/9/2012)
I know why the date was left out of the Sales table -- it can be derived.
What am I missing... How can it be derived?


Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1327127
Posted Monday, July 9, 2012 2:14 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 1,967, Visits: 2,905
Gullimeel (7/9/2012)
Just a small thing.Why do you have a fill factor of 80 for increasing identity field?Shouldnt it be 100%?


I agree that the fillfactor should be carefully reviewed, but not that it's a "small thing". It could have a big impact on performance if/when you rebuild the table.

Unless you do significant increases to the row length after it is inserted, you need to increase the fillfactor considerably. [Even if that is true, I would instead consider "pre-padding" the row rather than using such a low fill factor.]

For a table this large, I usually aim for ~98%, since you typically have to allow for varchar columns to be lengthened on occasion.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1327129
Posted Monday, July 9, 2012 2:17 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 1,967, Visits: 2,905
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
I know why the date was left out of the Sales table -- it can be derived.
What am I missing... How can it be derived?


Join back to the Account table using AccountId.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1327132
Posted Monday, July 9, 2012 2:21 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 1:50 PM
Points: 2,691, Visits: 3,374
ScottPletcher (7/9/2012)
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
I know why the date was left out of the Sales table -- it can be derived.
What am I missing... How can it be derived?


Join back to the Account table using AccountId.
Maybe I am missing something... Or just lack of information... However, I believe that rows are inserted into the accountsales table regularly, while only 1 row for each account is inserted into the accounts table. So I could have a createddate for account 10000 today, and then rows added to accountsales with this account today, 3 days from now, 3 years from now... So how do you know when the actual sale happened versus when the account was created. Mind you, I am thinking that these tables are representative of a Customer and Orders table.


Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1327135
Posted Monday, July 9, 2012 2:39 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 1,967, Visits: 2,905
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
I know why the date was left out of the Sales table -- it can be derived.
What am I missing... How can it be derived?


Join back to the Account table using AccountId.
Maybe I am missing something... Or just lack of information... However, I believe that rows are inserted into the accountsales table regularly, while only 1 row for each account is inserted into the accounts table. So I could have a createddate for account 10000 today, and then rows added to accountsales with this account today, 3 days from now, 3 years from now... So how do you know when the actual sale happened versus when the account was created. Mind you, I am thinking that these tables are representative of a Customer and Orders table.


Maybe it's just me. I took Account to be more of a header and Sales the details for a sale.

Based on the names, you're probably right though.

Even then, I would expect there to be a "order header" table that stores the date/time of the sale. And that the datetime of any individual item on that order could be derived from the header table's sale date.

Typically, I would cluster order header and detail tables by OrderId.

IF, though, it has been decided to cluster the "header" table by date, I would consider (not necessary do, but consider) denormalizing and putting the date in the sales table to facilitate joins.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1327150
Posted Monday, July 9, 2012 2:41 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 1:50 PM
Points: 2,691, Visits: 3,374
ScottPletcher (7/9/2012)
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
SQLKnowItAll (7/9/2012)
ScottPletcher (7/9/2012)
I know why the date was left out of the Sales table -- it can be derived.
What am I missing... How can it be derived?


Join back to the Account table using AccountId.
Maybe I am missing something... Or just lack of information... However, I believe that rows are inserted into the accountsales table regularly, while only 1 row for each account is inserted into the accounts table. So I could have a createddate for account 10000 today, and then rows added to accountsales with this account today, 3 days from now, 3 years from now... So how do you know when the actual sale happened versus when the account was created. Mind you, I am thinking that these tables are representative of a Customer and Orders table.


Maybe it's just me. I took Account to be more of a header and Sales the details for a sale.

Based on the names, you're probably right though.

Even then, I would expect there to be a "order header" table that stores the date/time of the sale. And that the datetime of any individual item on that order could be derived from the header table's sale date.

Typically, I would cluster order header and detail tables by OrderId.

IF, though, it has been decided to cluster the "header" table by date, I would consider (not necessary do, but consider) denormalizing and putting the date in the sales table to facilitate joins.
I see where you are coming from now. I guess until we get more info, it will be hard to tell what is really going on...


Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1327152
Posted Monday, July 9, 2012 2:42 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:50 AM
Points: 42,451, Visits: 35,506
Couple steps back... Why are you partitioning? What's the purpose and expected gain?


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1327155
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse