Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Create Clustered Index on a Very Large Table (500 GB) Expand / Collapse
Author
Message
Posted Monday, May 12, 2014 4:57 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
Jeff Moden (5/12/2014)
-A (5/7/2014)
Thanks for the replays.

--Here is the answers to the follow up questions:
1) I have to create this Index online since it is huge and I cannot afford such a long down time.

2) MAXDOP was set to 10 at first and showed no significant improvement in speed. I believe at the end the create index still happens on 1 processor anyways.

3) Also SORT_IN_TEMPDB = OFF or ON still didn't matter as SQL always seemed to use up all of the tempdb.

4) This table is already partitioned by date and currently has 570 partitions.


Wait justa minute... you have a partitioned table that has no Clustered Index???

What seems to be the problem ?



Abdel Ougnou
Post #1570090
Posted Tuesday, May 13, 2014 12:23 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:14 PM
Points: 37,107, Visits: 31,665
-A (5/12/2014)
Jeff Moden (5/12/2014)
-A (5/7/2014)
Thanks for the replays.

--Here is the answers to the follow up questions:
1) I have to create this Index online since it is huge and I cannot afford such a long down time.

2) MAXDOP was set to 10 at first and showed no significant improvement in speed. I believe at the end the create index still happens on 1 processor anyways.

3) Also SORT_IN_TEMPDB = OFF or ON still didn't matter as SQL always seemed to use up all of the tempdb.

4) This table is already partitioned by date and currently has 570 partitions.


Wait justa minute... you have a partitioned table that has no Clustered Index???

What seems to be the problem ?


Nothing if you never update variable width columns to contain more than they did at INSERT time and you never do deletes, both of which can waste huge amounts of disk space, memory space, etc. Sure, you can rebuild a heap as of 2008 but that also comes with some hefty hidden costs for the NCIs.

I suppose that for tables like audit tables, which are only inserted into, a partitioned heap might actually do faster inserts than on a partitioned clustered table so that might be an advantage. I would think that you'd need some covering indexes to get any real performance out of such a thing and those might negate the insert advantage that a heap might have.

I've never tried such a thing before and it's an interesting prospect. What's your experience on such a thing?

And thanks for raising the question.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1570132
Posted Tuesday, May 13, 2014 8:34 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 4:33 PM
Points: 4,358, Visits: 6,193
Jeff, if I recall correctly earlier versions of the Fast Track Datawarehouse appliance was spec'd for mostly HEAP tables to ensure the most efficient sequential scans. Not sure of latest models though. I concur with your notion of logging tables likely benefitting from no clustered index as well. Another scenario is tables with a non-sequential GUID as their key. The no-clustered-index scenarios all assume small-to-medium-range-scan performance increases available from CIs are not required.


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1570365
Posted Tuesday, May 13, 2014 9:02 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:14 PM
Points: 37,107, Visits: 31,665
Thanks, Kevin. Now THAT's an interesting bit of information. I can see it being a benefit to INSERT performance as well since a heap on a temporal audit table would suffer roughly the same insert order as if a CI were present but without the overhead of updating the BTREE of a CI. Since there isn't much on Yabingooglehoo on the notion of partitioning heaps, I guess I have some experimentation to do.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1570391
Posted Tuesday, May 13, 2014 5:08 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:14 PM
Points: 37,107, Visits: 31,665
Jeff Moden (5/13/2014)
-A (5/12/2014)
Jeff Moden (5/12/2014)
-A (5/7/2014)
Thanks for the replays.

--Here is the answers to the follow up questions:
1) I have to create this Index online since it is huge and I cannot afford such a long down time.

2) MAXDOP was set to 10 at first and showed no significant improvement in speed. I believe at the end the create index still happens on 1 processor anyways.

3) Also SORT_IN_TEMPDB = OFF or ON still didn't matter as SQL always seemed to use up all of the tempdb.

4) This table is already partitioned by date and currently has 570 partitions.


Wait justa minute... you have a partitioned table that has no Clustered Index???

What seems to be the problem ?


Nothing if you never update variable width columns to contain more than they did at INSERT time and you never do deletes, both of which can waste huge amounts of disk space, memory space, etc. Sure, you can rebuild a heap as of 2008 but that also comes with some hefty hidden costs for the NCIs.

I suppose that for tables like audit tables, which are only inserted into, a partitioned heap might actually do faster inserts than on a partitioned clustered table so that might be an advantage. I would think that you'd need some covering indexes to get any real performance out of such a thing and those might negate the insert advantage that a heap might have.

I've never tried such a thing before and it's an interesting prospect. What's your experience on such a thing?

And thanks for raising the question.


@Abdel,

Looking forward to your answer on the questiong of "What's your experience on such a thing?" above.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1570584
Posted Tuesday, May 27, 2014 2:10 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
@Jeff,
Sorry for the late response. This table is partitioned by week and we are mostly just inserting into that specific weekly partition and aggregating the data which updates some data. Also since we are almost always inserting a new quarter's data we are basically creating new pages at the end of the table. Because of this special business scenario the inserts are fairly fast for a 4 TB SQL2012 databases. The business operated under NCI index Heap setup for years but some of the larger reports are now suffering some performance and that's why the switch to this compressed CI set up.

This set up is still in development/testing phase and have not been deployed to production yet. So I will not have a real PROD stats on this until we finish implementation.

Thanks again,

-A



Abdel Ougnou
Post #1574979
Posted Thursday, August 7, 2014 5:36 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:01 PM
Points: 356, Visits: 871
Jeff, I wanted to get your take on something Gail posted once :
http://www.sqlservercentral.com/Forums/FindPost485499.aspx
The way I read it, a clustered index (pk or not) won't necessarily cause a huge slowdown to the insert process. I think it would lead to fragmentation if done often enough, due to new pages everywhere. Is there a balance to reach here?
Post #1600964
Posted Thursday, August 7, 2014 6:36 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:14 PM
Points: 37,107, Visits: 31,665
MMartin1 (8/7/2014)
Jeff, I wanted to get your take on something Gail posted once :
http://www.sqlservercentral.com/Forums/FindPost485499.aspx
The way I read it, a clustered index (pk or not) won't necessarily cause a huge slowdown to the insert process. I think it would lead to fragmentation if done often enough, due to new pages everywhere. Is there a balance to reach here?


Correct. Adding rows to a clustered index won't cause any slowdown to an insert process if the insert process is in the same order as the CI and it's at the logical end of the table.

If it's not in the same order or not at the logical end of the table, you get page splits and those not only cause fragmentation but they can also cause huge slowdowns because (on average) roughly have the page will need to be copied to a new page. None of that is contrary to what Gail stated in that post.

Non-clustered indexes suffer the same fate.

I have seen it where someone adding an NCI with a column of very low selectivity as the leading column will suddenly cause massive timeouts on a busy app. Heh... I know this to be true because I was that "someone" in my early days.

The same thing can happen with clustered indexes.

The "balance" to be sought is to determine whether the table will suffer more inserts than selects and to write your indexes accordingly.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1600986
Posted Thursday, August 7, 2014 8:03 PM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 4:33 PM
Points: 4,358, Visits: 6,193
I think this is slightly mis-stated Jeff. While you may not get fragmentation from end-of-index inserts, you certainly can get page LATCH contention, which can be a significant bottleneck to good insert performance in such a scenario.

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1600996
Posted Friday, August 8, 2014 12:43 PM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 6:01 PM
Points: 356, Visits: 871
Thanks Jeff for sharing your knowledge here.
Kevin, I'll have to read up on Latching. Thanks guys.
Post #1601306
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse