Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Create Clustered Index on a Very Large Table (500 GB) Expand / Collapse
Author
Message
Posted Wednesday, May 7, 2014 9:36 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
Hello,

I have a scenario where I need to create a Clustered Index (CI) on a very large SQL Server 2012 database table. This table has about approximately 10 billion rows, 500 GB in size. The job ran for about 20 hours into it and then fails with error: "Out of disk space in tempdb". My tempDB size is 1.8TB, but yet it's still not enough.

I have exhausted Google search and decided to post it here to get some extra help.

Here is my script:

CREATE CLUSTERED INDEX CI_IndexName
ON TableName(Column1,Column2)
WITH (MAXDOP= 4, ONLINE=ON, SORT_IN_TEMPDB = ON, DATA_COMPRESSION=PAGE)
ON sh_WeekDT(Day_DT)
GO

Thank you in advance for all your help.



Abdel Ougnou
Post #1568544
Posted Wednesday, May 7, 2014 9:43 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, October 23, 2014 4:49 AM
Points: 66, Visits: 488
Do you have any downtime windows to create the clustered index offline? Could be worth a go, will use less resource.
Post #1568553
Posted Wednesday, May 7, 2014 9:57 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 8:35 AM
Points: 457, Visits: 1,485
1. Change SORT_IN_TEMPDB = OFF
2. Remove MAXDOP or set it to high and run at at weekend.
3. Consider partitioning such a huge table by date.
Post #1568562
Posted Wednesday, May 7, 2014 10:31 AM This worked for the OP Answer marked as solution


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 8:00 AM
Points: 4,410, Visits: 6,281
SQL Guy 1 (5/7/2014)
1. Change SORT_IN_TEMPDB = OFF
2. Remove MAXDOP or set it to high and run at at weekend.
3. Consider partitioning such a huge table by date.


1) I believe the table is partitioned (...ON sh_WeekDT(Day_DT))

2) It could be that there are other massive tempdb consuming queries running at the same time. Were you monitoring tempdb usage during these 20 hours?? Big necessity when you know you are going to be crushing it with something like you are trying to do.

3) I also wonder if it isn't the ONLINE part that is getting you, or at least having a play in the issue. Keeping track of stuff that is currently getting modified can add up depending on what is happening.



Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1568595
Posted Wednesday, May 7, 2014 10:40 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
Thanks for the replays.

--Here is the answers to the follow up questions:
1) I have to create this Index online since it is huge and I cannot afford such a long down time.

2) MAXDOP was set to 10 at first and showed no significant improvement in speed. I believe at the end the create index still happens on 1 processor anyways.

3) Also SORT_IN_TEMPDB = OFF or ON still didn't matter as SQL always seemed to use up all of the tempdb.

4) This table is already partitioned by date and currently has 570 partitions.



Abdel Ougnou
Post #1568601
Posted Wednesday, May 7, 2014 10:46 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
TheSQLGuru (5/7/2014)

2) It could be that there are other massive tempdb consuming queries running at the same time. Were you monitoring tempdb usage during these 20 hours?? Big necessity when you know you are going to be crushing it with something like you are trying to do.

3) I also wonder if it isn't the ONLINE part that is getting you, or at least having a play in the issue. Keeping track of stuff that is currently getting modified can add up depending on what is happening.


Yes I was monitoring and very little to nothing was happening as I did this on the weekend.

Perhaps worth trying to do the index offline to see if it finishes faster. However this was done on another DB that was not being accessed by anyone.



Abdel Ougnou
Post #1568604
Posted Monday, May 12, 2014 10:03 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
TheSQLGuru (5/7/2014)

3) I also wonder if it isn't the ONLINE part that is getting you, or at least having a play in the issue. Keeping track of stuff that is currently getting modified can add up depending on what is happening.


Just wanted to loop back and close this issues as successfully resolved.

Thanks to all for the help. TheSQLGuru was right, the online index creation was trying to keep track of too many changes as it tried to create and keep the index online at the same time. It could not complete this task within the given space on tempDB and withing a reasonable time. So changing this index creation setting to offline worked. The Clustered Index creation completed successfully and under 7:45 Hours. This was done over night during the time when users were offline.

The other great thing that is worth mentioning in case someone else has this similar scenario is by setting the index COMPRESSION=PAGE I was able to compress this 500 GB table down to 400 GB and the index size is only 1 GB. This is a savings of a 100 GB.

Thank you again to all.



Abdel Ougnou
Post #1569951
Posted Monday, May 12, 2014 10:15 AM This worked for the OP Answer marked as solution
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, October 23, 2014 4:49 AM
Points: 66, Visits: 488
cunningham (5/7/2014)
Do you have any downtime windows to create the clustered index offline? Could be worth a go, will use less resource.


Where's the love!
Post #1569953
Posted Monday, May 12, 2014 10:25 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, May 28, 2014 9:06 AM
Points: 12, Visits: 125
cunningham (5/12/2014)
cunningham (5/7/2014)
Do you have any downtime windows to create the clustered index offline? Could be worth a go, will use less resource.


Where's the love!
hhh, sorry, I actually didn't see that. Thank you very much.



Abdel Ougnou
Post #1569962
Posted Monday, May 12, 2014 4:14 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:02 PM
Points: 35,397, Visits: 31,955
-A (5/7/2014)
Thanks for the replays.

--Here is the answers to the follow up questions:
1) I have to create this Index online since it is huge and I cannot afford such a long down time.

2) MAXDOP was set to 10 at first and showed no significant improvement in speed. I believe at the end the create index still happens on 1 processor anyways.

3) Also SORT_IN_TEMPDB = OFF or ON still didn't matter as SQL always seemed to use up all of the tempdb.

4) This table is already partitioned by date and currently has 570 partitions.


Wait justa minute... you have a partitioned table that has no Clustered Index???


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1570082
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse