Clustered Index leaves a lot of empty space

  • I have a fairly large table, billions of rows, that takes up around 3TB of disk. I was tasked with creating a clustered index on this table. I performed the sort in tempdb and, when it completed, the results were written back to the original table, so far so good.  I know the clustered index takes the place of the original heap, and I expected the ~20% growth (120% of the original size), but now my table is ~7TB and has 3TB of free space. I was hoping that by performing the sort in tempdb I could avoid such a waste of disk. This table only gets updated quarterly; it's there for reads... is there a way to create a clustered index without all this empty space?

  • haahof wrote:

    I have a fairly large table, billions of rows, that takes up around 3TB of disk. I was tasked with creating a clustered index on this table... I expected the ~20% growth (120% of the original size), but now my table is ~7TB and has 3TB of free space... is there a way to create a clustered index without all this empty space?

    Yes there is!

    The issue is that, for anything over 128 extents (that's just 8MB), SQL Server creates or recreates the entire index and doesn't drop the original heap/clustered index until the newly created copy has been committed.

    Now, there's no way to get around the temporary doubling of the space used, even when you do the sort in tempdb, BUT you can keep all that space from becoming permanent.  It takes a little work the first time, though.

    As is the case on most systems, let's say that your original CI (Clustered Index) lives on the PRIMARY filegroup...

    1. Build a new filegroup with the same name as the table the CI is on, followed by the number "1".  For example, if your table is named "MyTable", name the new filegroup "MyTable1".
    2. On that new filegroup, create a new file with the same name as the filegroup.
    3. Presize the new file to about 90 or 95% of what you think the final size will be, based on the Fill Factor and how much space you'll recover given the current page density.  If you have a lot of "in-row" LOBs, that'll be a bit more difficult due to the nature of "Trapped Short Rows", but that's a much longer story than we should get into just now.
    4. Do a CREATE INDEX with exactly the same keys and settings you currently have, but add the DROP_EXISTING = ON option.  If you can temporarily slip into the BULK_LOGGED Recovery Model, it'll take about half the time it normally would thanks to minimal logging.  The other change to make is, instead of ON [PRIMARY], change it to ON [MyTable1] (using the name of the new filegroup you created in place of "MyTable1").  There's a sketch of steps 1 through 4 just after this list.
    5. Now would be a good time to rebuild the other indexes from smallest to largest.
    6. Then, even if you don't do #5, you'll have a shedload of free space left in the PRIMARY filegroup, just as you do now.  You have two choices... move the index back to the PRIMARY filegroup and drop the new file/filegroup or... plan for the future.  Leave the new file/filegroup as it is, and incrementally shrink the free space out of the PRIMARY filegroup.
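
    Just to put some meat on steps 1 through 4, here's a rough sketch.  The database name, path, sizes, index name, and key column are all placeholders for whatever you actually have, and it assumes the CI isn't partitioned:

    -- 1 & 2: new filegroup and file named after the table (names, path, and sizes are placeholders)
    ALTER DATABASE MyDb ADD FILEGROUP MyTable1;
    ALTER DATABASE MyDb ADD FILE
    (
        NAME       = N'MyTable1',
        FILENAME   = N'D:\Data\MyTable1.ndf',
        SIZE       = 3TB,        -- 3: presize to ~90-95% of the expected final size
        FILEGROWTH = 10GB
    ) TO FILEGROUP MyTable1;

    -- 4: recreate the CI in place, on the new filegroup (same keys and settings you have now)
    CREATE UNIQUE CLUSTERED INDEX CIX_MyTable
        ON dbo.MyTable (SomeKeyColumn)
        WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON)
        ON [MyTable1];

    Drop the UNIQUE if your CI isn't unique, and carry over whatever other options (fill factor, etc.) you already use.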

    OK... then, let's say you need to do a rebuild on that same CI.  If you moved the CI back to PRIMARY as in step 6 above, you have to go through the same thing all over again.

    ... OR...

    If you left the new filegroup with the CI in it, create a new filegroup and file but, this time, name them both MyTable2.  Do the CREATE INDEX with DROP_EXISTING = ON again, this time with ON [MyTable2].  When that's done, just drop the MyTable1 file and filegroup and you're done.

    Next time you need to do it again, just reverse the process: move from MyTable2 to a new MyTable1 file/filegroup and drop the MyTable2 file/filegroup.  The swap looks something like the sketch below.
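
    Using the same made-up names as before, one swap looks something like this:

    -- MyTable1 currently holds the CI; MyTable2 is the new target
    ALTER DATABASE MyDb ADD FILEGROUP MyTable2;
    ALTER DATABASE MyDb ADD FILE
    ( NAME = N'MyTable2', FILENAME = N'D:\Data\MyTable2.ndf', SIZE = 3TB )
    TO FILEGROUP MyTable2;

    CREATE UNIQUE CLUSTERED INDEX CIX_MyTable
        ON dbo.MyTable (SomeKeyColumn)
        WITH (DROP_EXISTING = ON)
        ON [MyTable2];

    -- once the index has moved, the old file and filegroup are empty and can simply be dropped
    ALTER DATABASE MyDb REMOVE FILE MyTable1;
    ALTER DATABASE MyDb REMOVE FILEGROUP MyTable1;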

    If the non-clustered indexes on that same table fragment as well, chances are they're pretty big too.  Include the NCIs in the move from one file/filegroup to the other.

    Because of the shrink after the first move, it's all a bit painful the first time but, oh my!  Is it ever worth it for the future!

    Another thing to consider... if the CI has an ever-increasing key, consider partitioning the CI IF the older rows no longer suffer fragmentation due to ExpAnsive Updates.  I prefer the old Partitioned Views but whatever.  IF you have such a condition, you could have the fragment-free partition(s) living in a MyTableO file/filegroup and just move the more recent, more active rows from one group to another, etc., wash, rinse, repeat.

    Another thing to do is to simply move those fragment-free legacy rows to another database so you're not constantly backing up data that will never change.  You can do something similar by setting such inactive files/filegroups for partitions to READ_ONLY, backing them up one final time, and never having to back them up again (sketched below).
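
    In sketch form (the filegroup name and backup path are placeholders, and older versions may want exclusive access to the database for the MODIFY FILEGROUP step):

    -- mark the legacy filegroup read-only, then back it up one final time
    ALTER DATABASE MyDb MODIFY FILEGROUP MyTableOld READ_ONLY;

    BACKUP DATABASE MyDb
        FILEGROUP = N'MyTableOld'
        TO DISK = N'D:\Backup\MyDb_MyTableOld_Final.bak';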

    I rather prefer moving them to a different database and using a Partitioned View to sew them back together.  You can't do that with a Partitioned Table (at least you couldn't the last time I had to do this... things may have changed).

    I call this method of bouncing between two filegroups the "Swap'n'Drop" method.  It's a little painful to set up for the first move but, then again, look at what you're going through now... same pain.  After the first move, you'll never need to feel that pain again with that table.

    And no, to your next question... DROP_EXISTING = ON does NOT mess with any FKs pointing at the CI or any NCI.  It's PFM.

    Ah... almost forgot.  Once you're done with a move, don't forget to set your database back to the FULL Recovery Model.  This doesn't break the log backup chain, but I'd take a log backup both before and after, because minimally logged operations mean you can't do a point-in-time restore to the middle of that particular log backup.  You either have to stop the restore before you get to that log backup or restore it in its entirety and do the PIT restore to some time after it.  Something like the sketch below.
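
    In sketch form (database name and backup paths are placeholders):

    BACKUP LOG MyDb TO DISK = N'D:\Backup\MyDb_BeforeMove.trn';  -- clean PIT point before the switch

    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

    -- ... the CREATE INDEX ... WITH (DROP_EXISTING = ON) ... ON [MyTable1] goes here ...

    ALTER DATABASE MyDb SET RECOVERY FULL;

    BACKUP LOG MyDb TO DISK = N'D:\Backup\MyDb_AfterMove.trn';   -- closes out the log backup holding the minimally logged work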

    I also use this process if I have to delete more than about 20% of a large table.  Deletes are fully logged... minimally logged inserts are minimally logged. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s.  The SORT_IN_TEMPDB option only takes care of the sorting aspect.  You can leave it in if you want, but consider the load it puts tempdb under for such a large table... with the "Swap'n'Drop" method, you no longer need to do that.

     

    --Jeff Moden



  • Amazing... Thank you for your detailed responses, I can't wait to try this on the next build!

  • Thanks for the feedback.  On this whole thing, remember that planning is the key.  Shoot, the first time I did it, I actually drew a picture of what I wanted to do and wrote a functional flow chart to help me "keep the vision" as I was writing code.  And, yeah... you can make it so that it'll figure out the current 1/2 naming of the File/FileGroup and auto-magically change it using a bit of dynamic SQL so that all you need to do is run a stored procedure for a given table and it does the rest of the work.
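
    Just to illustrate the idea (this isn't my actual proc, the names are made up, and it assumes the CI isn't partitioned), something like this works out which filegroup the CI is on now and which one to build next:

    DECLARE @TableName sysname = N'MyTable',
            @CurrentFG sysname,
            @NextFG    sysname;

    -- where does the clustered index (index_id = 1) live right now?
    SELECT @CurrentFG = fg.name
      FROM sys.indexes i
      JOIN sys.filegroups fg ON fg.data_space_id = i.data_space_id
     WHERE i.object_id = OBJECT_ID(N'dbo.' + @TableName)
       AND i.index_id  = 1;

    -- bounce between <table>1 and <table>2
    SELECT @NextFG = CASE WHEN @CurrentFG = @TableName + N'1'
                          THEN @TableName + N'2'
                          ELSE @TableName + N'1' END;

    SELECT @CurrentFG AS CurrentFileGroup, @NextFG AS NextFileGroup;
    -- @NextFG then gets spliced into the dynamic ALTER DATABASE / CREATE INDEX statements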

    The biggest issue is that I hard-code the indexes for now because there's no built-in method for generating the code for indexes.  I've checked on some methods and, at first glance, they look great, but when you read the discussions that follow, you find they're filled with corrections, which doesn't exactly build any confidence, so I've not used any of the SQL methods for this.  It's not like I need to do this for a thousand tables.

    I've not worked on it for a while, but I have discovered that a couple of folks have some PowerShell code that does the trick, and I'm not afraid to use xp_CmdShell to call PowerShell.  (It is NOT the security issue that most would have you believe IF it's set up correctly.  Turning it off prevents NOTHING if a hacker gets in with SysAdmin privs, and it exposes NOTHING to anyone without those privs because they can't use it when it's properly set up.)
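
    The call itself is nothing exotic.  The script name and parameter below are made up; the point is just that xp_CmdShell can hand off to a PowerShell script that does the index scripting:

    -- one-time setup (sysadmin only, and only after you've locked it down properly)
    EXEC sp_configure 'show advanced options', 1;  RECONFIGURE;
    EXEC sp_configure 'xp_cmdshell', 1;            RECONFIGURE;

    -- hand off to a (hypothetical) PowerShell script that scripts out the indexes for a table
    EXEC master..xp_cmdshell
         N'powershell.exe -NoProfile -ExecutionPolicy Bypass -File "D:\Scripts\Script-TableIndexes.ps1" -Table "dbo.MyTable"';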

    I actually have an hour-long presentation on the subject of xp_CmdShell, how to set it up correctly, and how to use it correctly.

    --Jeff Moden



  • Actually, if you have LOB off-page data, that data will not be moved simply by CREATEing the clustered index on another filegroup.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    Actually, if you have LOB off-page data, that data will not be moved simply by CREATEing the clustered index on another filegroup.

    To be sure, that's one of the many reasons why I left LOBs out of this conversation.  However, any LOBs that fit IN-ROW WILL be moved with the CI and will continue to have the "Trapped Short Rows" issue.

    --Jeff Moden



  • Are you page-compressing that table?  If not, why not??  Especially since it is read-only almost all the time.
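
    For reference (the index and table names below are placeholders), it's just an index option, so it can go on an ALTER INDEX REBUILD or on the CREATE INDEX ... DROP_EXISTING statement used for the filegroup move:

    -- compress the clustered index only
    ALTER INDEX CIX_MyTable ON dbo.MyTable
        REBUILD WITH (DATA_COMPRESSION = PAGE, SORT_IN_TEMPDB = ON);

    -- or compress everything on the table, nonclustered indexes included
    ALTER INDEX ALL ON dbo.MyTable
        REBUILD WITH (DATA_COMPRESSION = PAGE);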

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Just as a bit of a sidebar, remember that page compression causes CI rebuilds to take about 3 times longer.  I'm NOT saying that makes it not worth it but you DO have to plan for it.  And, if you do the rebuild ONLINE, that's also going to take a huge amount of time and it won't be minimally logged.

    Page compression may also be the cause of fragmentation where there was none before.  For example, if you have date columns that are NULL during the insert, they will occupy just 1 byte.  Update those to any date after 14 Sep 1900 and you end up with an expansive update of 1 byte until the 2079 issue kicks in.  The Time portion is even worse... it jumps from 1 byte for NULL times to 4 bytes for any time after 4PM for DATETIME.  Think of things like "Modified_On" columns, which are the normal culprits for such problems.
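
    If you want to see the effect for yourself, here's a small, self-contained repro of the idea (the table and numbers are made up, and the exact results will vary by version and data):

    -- a row-compressed table where a NULL datetime is filled in later
    CREATE TABLE dbo.CompressionDemo
    (
        Id          INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
        Payload     CHAR(100) NOT NULL,
        Modified_On DATETIME NULL
    ) WITH (DATA_COMPRESSION = ROW);

    INSERT dbo.CompressionDemo (Payload)
    SELECT TOP (100000) 'x'
      FROM sys.all_columns a CROSS JOIN sys.all_columns b;

    -- baseline: near-zero fragmentation right after the ordered insert
    SELECT avg_fragmentation_in_percent, page_count
      FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.CompressionDemo'), 1, NULL, 'LIMITED');

    -- the "ExpAnsive" update: NULL becomes a late-afternoon datetime and the compressed rows widen
    UPDATE dbo.CompressionDemo SET Modified_On = '20240315 17:30';

    -- fragmentation after the update (expect page splits on most systems)
    SELECT avg_fragmentation_in_percent, page_count
      FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.CompressionDemo'), 1, NULL, 'LIMITED');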

    Don't get me wrong... I LOVE Page compression.  But it's not a panacea and you can't just up and use it without understanding all of the ramifications, which can be terribly costly in more ways than just fragmentation and the page splits that go with it.  In fact, I have some large tables that do suffer from the old Insert/Expansive Update problem that I've page compressed because they're already permanently fragmented... but I didn't page compress them without finding that out first.

    "Must look eye".

    --Jeff Moden



  • Jeff Moden wrote:

    Just as a bit of a sidebar, remember that page compression causes CI rebuilds to take about 3 times longer...  Page compression may also be the cause of fragmentation where there was none before...  Don't get me wrong... I LOVE Page compression.  But it's not a panacea and you can't just up and use it without understanding all of the ramifications.

    Ok, versus what?  Not compressing them?  The original table is 3TB; the new, compressed table is, say, 1.4TB (and that's not uncommon).  To me, you seem overly worried about the occasional lengthening of a column / a few page splits.

    It's worth it unless you have an awful lot of expansions.  If you have too many, then store some old-school "this-value-means-NULL" value in the column, such as 0001-01-01, so that it won't expand when a "real" value is assigned; you can use a view to replace that with NULL so it is transparent to the apps (which it should be!).
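
    Something like this, assuming a date or datetime2 column (plain DATETIME bottoms out at 1753-01-01, so you'd pick a different sentinel there), and made-up table/column names:

    -- pass-through view that turns the sentinel back into NULL for the apps
    CREATE VIEW dbo.MyTable_ForApps
    AS
    SELECT  Id,
            SomeColumn,
            NULLIF(Modified_On, '00010101') AS Modified_On   -- '0001-01-01' sentinel reads as NULL
    FROM    dbo.MyTable;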

    As to "modified on" columns, those should be isolated outside the table using a bigint pointer as the key to the details, but of course that's a different discussion.  Still, I can't believe how many bytes people lard up their table rows with for such limited info.  Move it outside the table and (1) properly comment the change and (2) be able to see all related table changes in order in one place!  That's vastly more valuable *and* less overhead to the main tables.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

    It doesn't take much to cause a page split with compression enabled on tables that weren't suffering from page splits to begin with.  If you're going to use compression on a table that doesn't suffer page splits, or only occasionally does, you really need to check what all of your updates are doing and whether or not they're expansive, because suddenly having to rebuild a 3TB table on a regular basis when it didn't used to need it isn't your basic "Martha Stewart Moment". 😉

    As you said...

    It's worth it unless you have an awful lot of expansions

    You really need to know before you apply compression.

    And yes... I totally agree that "alternate NULL"s can work but frequently don't because too much code is relying on IS NULL, etc.

    Like I said, I LOVE Page Compression, but you can't just up and tell people to compress stuff without actually knowing what is going to happen.

    --Jeff Moden



  • Jeff Moden wrote:

    It doesn't take much to cause a page split with compression enabled on tables that weren't suffering from page splits to begin with...  You really need to know before you apply compression...  Like I said, I LOVE Page Compression, but you can't just up and tell people to compress stuff without actually knowing what is going to happen.

    I think you can, and should, tell them to compress with a table that size unless they have a good reason not to.  You shouldn't have to suddenly rebuild a table that's only updated every 3 months.  If anything, compression reduces page splits from char column expansions, since such columns are often much shorter when updated later.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • When you read this, Scott, remember that I LOVE page compression...

    You say the following, but you need to qualify, especially to a person who might not know what compression is (never mind what a "good reason not to" might be), exactly what the ramifications are.

    ScottPletcher wrote:

    I think you can, and should, tell them to compress with a table that size unless they have a good reason not to.  You shouldn't have to suddenly rebuild a table that's only updated every 3 months.

    The OP stated the following in his original post...

    haahof wrote:

    This table only gets updated quarterly, it's there for reads...

    Then you asked...

    ScottPletcher wrote:

    Are you page-compressing that table?  If not, why not??  Especially since it is read-only almost all the time.

    Most wouldn't take exception to that, but consider that the OP didn't know that the conversion from HEAP to CI would leave all that free space, nor what could be done about it in the future.  By that same token, you never asked the OP what the quarterly "updates" consist of.  Are the "updates" to the table really just Inserts at the end of the ever-increasing CI?  If so, I agree... page compress that puppy!  But you didn't ask, and you implied that it would be alright "since it is read-only almost all the time".  Without knowing what the "updates" actually are, how many there are, or whether or not they're just "hot spot" Inserts, you may have made it so it has now become necessary to rebuild the index every 3 months, and that means "downtime" for the table.  You also didn't mention that rebuilding a compressed index takes 2 to 4 times longer (usually 3 for the indexes I work with) and is very CPU intensive.

    And then you say...

    ScottPletcher wrote:

    If anything, compression reduces page splits from char column expansions, since such columns are often much shorter when updated later.

    Gosh... that's just not true.  First of all, if ANSI_PADDING is correctly enabled, then a CHAR column will NOT EVER cause "column expansions" because it's a fixed width in non-compressed indexes.  If you enable Row (or Page) compression, then all those nice CHAR columns suddenly become variable width because Row compression removes the trailing spaces that were making them fixed width in nature.  If they go from something smaller to something bigger, that won't be an ExpAnsive Update in a non-compressed table, but it WILL be ExpAnsive when at least Row Compression is present (which is also part of Page compression).

    If you meant that the column will be inherently smaller because no one usually fills a CHAR to its full width, I agree, but in a compressed table there's no extra space to "ExpAnd into" unless you add a Fill Factor, and without one, anything going from smaller to larger will cause ExpAnsion.  Having a Fill Factor is a bit counterproductive on a 3TB index because even the old data that isn't updated is going to take a lot more space.

    And you'd have to show me a link with an extensive study that proves that CHAR columns are "often much shorter when updated later".

    My point is that if you're going to recommend compression on a table that you actually know nothing about, you need to identify the caveats, and there are a huge number of things to be considered.  Compression is not only NOT a panacea but can also cause a huge number of problems (including performance issues) if not used correctly.

    --Jeff Moden



  • Jeff Moden wrote:

    When you read this, Scott, remember that I LOVE page compression...

    ... My point is that if you're going to recommend compression on a table that you actually know nothing about, you need to identify the caveats, and there are a huge number of things to be considered.  Compression is not only NOT a panacea but can also cause a huge number of problems (including performance issues) if not used correctly.

    Obviously I was referring to varchar, not a fixed-length column.

    I would presume a 99% or 98% fillfactor if columns are to be lengthened.  Roughly 25GB per TB of used space is trivial to give up if it avoids page splits.

    Hopefully the table is clustered on a date/datetime if that's critical to its processing, in which case the quarterly changes would be new rows.  But even if not, yes, I still strongly urge compression until/unless you see it's causing a problem.  It would be helpful to know what % of rows are modified every 3 months.  I'd expect a relatively low %, 10%, less?, so you wouldn't need to rebuild every time.

    If it's a big issue, partition the table so that older, static data is in its own partition(s) if you need to rebuild the current data frequently.
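
    Hypothetically, if the table were rebuilt onto a yearly partition scheme, that would look something like this (the boundary dates, names, and compression choice are illustrative):

    CREATE PARTITION FUNCTION pf_MyTable_Year (datetime)
        AS RANGE RIGHT FOR VALUES ('20220101', '20230101', '20240101');

    CREATE PARTITION SCHEME ps_MyTable_Year
        AS PARTITION pf_MyTable_Year ALL TO ([PRIMARY]);  -- or map each range to its own filegroup

    -- once the CI lives on ps_MyTable_Year, only the active slice ever needs a rebuild
    ALTER INDEX CIX_MyTable ON dbo.MyTable
        REBUILD PARTITION = 4 WITH (DATA_COMPRESSION = PAGE);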

    Again, I think you're seriously downplaying how much time you'll save using compression, both for writing and reading, not to mention the disk space itself.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • It was not "obvious" that you were speaking of VARCHAR because they are NOT affected by row compression at all.  Only CHAR is affected by row compression and you clearly stated CHAR.

    The ratio of 25 GB to 1 TB works out to 0.025, which is 2.5%.  If we use Microsoft's figure of a maximum of 8060 bytes of user-definable space per page and are generous with a Fill Factor of 97%, that leaves 241 bytes for expansion.  If you have 24 rows on the page, that allows you enough room to expand each of the rows by only 10 bytes.  11 bytes per row will cause a page split.  That's without compression.

    With row compression, you might get lucky with 2:1 compression, which would give you 48 rows per page.  If your "ExpAnsive" update was on a CHAR (which is now expansive) or VARCHAR column, your 10-byte update per row is still going to be 10 bytes per row.  10 * 48 isn't going to fit into 241 bytes, so you're guaranteed to have page splits.  The good part is that you can lower the Fill Factor more on the compressed table, which might now be only half a TB if you're lucky.  Ah... but wait a minute... the compression has made a whole lot of things that used to be fixed width become variable width.  What used to be a mere 10-byte update might be a whole lot more.  On top of all that, you've also slowed down your queries even if the data resides in memory.

    Could it all be worked out?  Sure... well... maybe.  What is the actual trade off?  How badly will those "only once every 3 months {updates}" affect the table and where?  Is it going to cause the most recent data (which is typically also the most read data) to madly fragment?

    The only time it's safe to recommend compression without doing a whole lot of work is if you can guarantee that you're only doing Inserts on an ever-increasing keyed index and you can also guarantee that there will be no updates.  I say it's "safe", but not really.  You have to use the proc that estimates the savings and then also ask yourself how often the table will be read for reports, both large and small, and try to figure out whether it's going to be worth the performance hit, especially and ironically if you're using SSDs, which don't take the hit on read-aheads that spinning rust did in the past... that takes away even one of the justifications for compression.
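
    The proc I'm talking about is sys.sp_estimate_data_compression_savings; it samples the data and reports current vs. estimated compressed sizes (the names below are placeholders):

    EXEC sys.sp_estimate_data_compression_savings
         @schema_name      = N'dbo',
         @object_name      = N'MyTable',     -- hypothetical table name
         @index_id         = NULL,           -- all indexes
         @partition_number = NULL,           -- all partitions
         @data_compression = N'PAGE';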

    Again, my point isn't to avoid page compression.  My point is that people should stop recommending it to neophytes (all forums have neophytes who are just lurkers) without mentioning the other potentially serious problems, like a crazy increase in fragmentation that basically destroys the memory savings, and the fact that a monster index may now have to be rebuilt where it wouldn't have fragmented at all without compression.

    We need to find out what the OP means by "This table only gets updated quarterly...".  We can't even recommend partitioning the data and compressing the "non-updated" portion because we don't know where the "updates" are going to be done.

    Speaking of the OP, I hope none of this scares him off.  The non-compressed solution I provided to prevent the huge size increase and tremendous waste in the form of free-space works a treat for me and it will for him, as well.  Once he tells us what the quarterly updates consist of, then we can make other suggestions and recommendations, including page compression (or not).

    --Jeff Moden


