SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Very large table - performance issues


Very large table - performance issues

Author
Message
Abu Dina
Abu Dina
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2917 Visits: 3325
Eric M Russell (5/22/2013)
Abu Dina (5/21/2013)
We have a tall table that contains 2.6 billion rows

Table structure:



The application which uses this table has been running slow for the last couple of days and it seems to have happened following the addition of about 400 million rows last weekend.

I think it's because of the index fragmentation although I'm not sure how to check if this the case without affecting the application?

So my first question is, how do I check to see if the indexes are fragmented and whether the stats need updating on such a large table?


I notice that your clustered index is on IX_dType. Why was that column chosen to cluster the table. Especially for tables with a large number of rows, you typically want to cluster on a column with unique sequential values. There could very well be fragmentation.


I don't know, I didn't create the table but now I'm looking after it! w00t

---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)SSC Guru (212K reputation)

Group: General Forum Members
Points: 212093 Visits: 41977
Abu Dina (5/21/2013)
We have a tall table that contains 2.6 billion rows

Table structure:



Gosh. A couple of people have mentioned a couple of things seriously wrong with this already but let me summarize some of the things I see wrong even if some of them are repeats.

1. 2.6 Billion rows using an INT for a PK. Most people don't have the forsight to start such a column at -2,147,483,648 and that's about the only way this table could have 2.6 Billion rows in it. If you're going to keep adding 400,000 rows on a regular basis, now would be a good time to convert the PK column (DID) to a BIGINT (heh... while the table is still "small" :-P )

2. The Clustered Index is on the DTYPE column. I suspect there are relatively few unique values that go into that column. If you added 400,000 rows, you will likely have caused some huge page splits throughout the table which means that you have to read a lot more pages than you probably need to because many of the pages might only be half full due to the page splits. To wit, I strongly agree that the clustered index is on the wrong column. I don't know if your DID column is an IDENTITY column (or similar in operation) or not or whether the Created column appears in many queries but, if they are, I'd be tempted to make a UNIQUE Clustered Index using the Created and DID columns and in that order. Every index will benefit from the uniqueness of the clustered index.

The rule of thumb for a clustered index is "Narrow, ever increasing, and unique". If you build it correctly, you should almost never have to defrag the clustered index.

It would also be highly beneficial if the Created column was based on GETDATE() just to keep the "ever increasing" thing going.

3. I don't know if the table is partitioned or not. For the sake of easier index maintenance, you should strongly consider it if it's not. It'll take some planning and maybe some code changes if the DID column is an IDENTITY column, but it will absolutely be worth it insofar as index maintenance goes. With a bit more planning, you can make also make it so you only need to backup a very small portion of the table and you'll be able to suddenly do that on a nightly basis with Point-in-Time recovery. It'll also allow you (if you use a partitioned view across 2 databases) to restore the more recent parts of the table separately incase of a DR event to get the system back online much more quickly than trying to restore a 2.6 Billion row table in one fell swoop.

Also, keep in mind that partitioning isn't done for performance. The only thing that partitioning has to do with performance is 1) it'll make index management a whole lot faster so it could be done every night if necessary and 2) properly managed indexes will help performance. Depending on which method of partioning you chose, it may also reduce backup times, DR restore times, and tape space.

4. I can't tell from your graphic, but it would also be a good thing to find out if there are any triggers on the table. Those won't hurt SELECTs but they could really put the whammy on INSERTs, UPDATEs, and DELETES.

5. Consider archiving some of that data in the table. Do you REALLY need all 2.6 Billion rows to be accessable on an instantaneous basis? Even if you do, using partioned views will allow you to move a great deal of the data out of the main database and into an archive database. Again, you have to plan for this carefully especially if you want to do INSERTs via the partitioned view.

Of course, this is all just an opinion on my part and isn't complete (they're suggestions to look into, not complete solutions). I just wanted to give you some things to look at that I've had pretty good success with in the past.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Abu Dina
Abu Dina
SSCrazy
SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)SSCrazy (2.9K reputation)

Group: General Forum Members
Points: 2917 Visits: 3325
Jeff, thanks so much for your ideas and suggestions. I've spent a few days on this problem and the table has been restructured and a new indexing strategy has been implemented.

I'm glad to report that performance has significantly improved.

Unfortunately I'm a little busy at the moment so I can't go into too much detail about what I've done but I will keep this thread alive somehow and will post back with more detail about this. Watch the space!

Once again thank you and thanks to all who have contributed to this thread. I couldn't have done this without you guys! Cool

---------------------------------------------------------


It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
David Edwards - Media lens

Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
Howard Zinn
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search