Reclaiming freed space

  • Interesting. I took Hugo's word for it and didn't bother to test it.

  • Hi all,

    First: Apologies for the late replies. Unfortunately, Steve ran this question just during my holiday, so I was busy soaking in sun, blissfully unaware of your reactions. I am only now clearing out my backlog.

    Several people have complimented me on the question. To you, I say: Thank you! Good to get the positive feedback.

    I'm not sure if Chad's "brutal" comment was intended as a compliment, but I'll take it as such: I don't submit many questions, but when I do I want to try to ask a question that teaches many readers something they didn't know, which automatically translates to a tough question. If 95% get it right, I failed because only 5% learned from it. If 20% get it right due to a badly worded question, I failed because I try think the QotD should test SQL Server skills, not reading or interpretation skills. But if 20% get it right because the other 80% didn't know, than I succeeded in teaching those 80% something new.

    As to RBarryYoung's comment, I agree. The submitter of the question does not get to set the point total, that's up to Steve - and I too am surprised to see this being rated as a one-pointer.

    Several people have also posted constructive criticism. Thanks to them as well - as good and nice as positive feedback is, nothing is better for increasin my own knowledge than cionctructive criticism. I will address each of those messages in individual replies after this one.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • nico van niekerk (7/28/2009)


    I clicked the correct answer but it still said that I was wrong and cited an option I didn't click.

    Then, it is not "than", it is "then".

    It's a pity that the effort wasn't debugged.

    Hi Nico,

    I'm not sure what to make of your first sentence. Do you mean that you disagree with the options I marked as correct? Or do you mean that you site malfunctioned, claiming you clicked an option you didn't actually click? If the latter, you'll have to take it up with Steve, I can't help you there. And in the former case, I can only reallly comment if you tell me WHICH of the options you disagree with.

    Your comment about my spelling is correct. I could try to hide behind me not being a native speaker, but in all honesty I can only admit that this is an error I should be able to avoid. Sorry for not proofreading the question a bit better. I hope this did't cause you to misunderstand the question.

    I know Steve does look at submitted QotD entries before posting them, but he can never catch all errors. In the case of a simple then/than error, I think he can be forgiven.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Chad Crawford (7/28/2009)


    Hugo (et al.), I was running some scripts (in 2K5) based on this question to learn a little more and I noticed that the delete doesn't seem to return any space at all, even before reloading all the data back in?!? I filled your table with 1,000,000 records, copied the rows to a new table, then did a delete with no where (scary all by itself) and sp_spaceused showed the same amount of reserved space as before I ran the delete. Is it really holding on to all those pages still? I looked in the BOL under Dropping and Rebuilding Large Objects and it seems to indicate that the cleanup doesn't happen until after the transaction commits, but the result is the same whether or not I wrap it in a transaction. It's the oddest thing - sp_spaceused shows 0 rows taking up 47MB. :ermm: I understand now why the same space is used after reloading the data, but what about when there is no data at all? I never expected that.

    Thanks,

    Chad

    Hi Chad,

    This is strange. I ran some tests as well, and I found that deleting all rows does free up most space, but not all. I also found that most space is released immediately, but there's also a bit released only after a second or so. And the most surprising is that the amount of unreleased space seems to depend on various factors I could not put my finger on - at one time, I consistently was left with about 200 KB or so after deleting (with a 10,000 row test); later when I tried again (but apparently not with the exact same table and data), I was consistently left with a mere 32KB.

    However, this is a bit off-topic here, as the QotD was not about reclaiming free space after deleting all or any rows from a table, but after deleting one column without removing any row.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • mtassin (7/28/2009)


    The Garbage collection process is also a correct answer.

    ]SQL Server 2005 Books Online April 2006 Help File


    (snip - HK)

    So sp_spaceused for a large table (1 million rows seems large enough to me) will not report the space freed up until the batches of physical phases are complete. Unless sp_spaceused @objectusage='updateusage' is used.

    Calling it garbage collection might be a stretch, but it is serving that purpose behind the scenes.

    I have no objection at all to calling this garbage collection. But that does not make it a correct answer. The Books Online fragment you quote specifically states that this process kicks in "When you drop or rebuild large indexes, or drop or truncate large tables". In the question's scenario, a column was dropped and the garbage collection answer suggested that the table's pages will be automatically rebuilt to free up the associated disk space by some automatic process. This is not true. If you drop a column from a large table without doing anything else, the space taken by that column will continue to be reserved forever, unless you create a new table or create, drop, or rebuild a clustered index.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • andrew.me.ph (7/28/2009)


    As to me, this is a silly answer: since both building clustered index and copying data multiple times solve "the problem" - presenting multiple copy as a correct answer is really silly. It is like saying that doing more operations (involving both more scripting amd more database operations) is better than doing less scripting and less database operations that lead to the same result. I am really surprised with the answer. I wonder who has decided which answer is the correct one.

    Hi Andrew,

    I decided which answers are correct. 😀

    I agree that there is no need to perform all the actions in the correct answers. One is enough. Pick whichever you like best. 🙂

    I disagree that this makes the answer silly. The question specifically asked which of the presented strategies "can be used", not which one "should be used". (And in one of the later messages, you'll also see that there is no clear-cut reply to that question). So the question was designed to test if you know which actions will free up space after dropping a column and which won't, not to ask for a "best practice".

    EDIT: Fixed the smileys


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Stuart M. Garner (7/28/2009)


    Interesting that in the explanation for the "correct" answer it states:

    Creating a clustered index does involve a rebuild of all the data pages, so this will also change the metadata and free up the space previously taken by Column2.

    Stated in the referenced article at

    http://msdn.microsoft.com/en-us/library/ms177563.aspx

    it states:

    Note:

    Dropping a column does not reclaim the disk space of the column. You may have to reclaim the disk space of a dropped column when the row size of a table is near, or has exceeded, its limit. Reclaim space by creating a clustered index on the table or rebuilding an existing clustered index by using ALTER INDEX [ http://msdn.microsoft.com/en-us/library/ms188388.aspx ] .

    As observed by others, it appears there is more than one way to do this.

    🙂

    Hi Stuart,

    Yes, there is more than one way to do this. I even included three ways, and you had to tick alll of them to score a point 😛

    I have a hunch that you think the text you quote contradicts what I wrote, but I fail to see the contradiction. Or am I misunderstanding the point of your post?

    EDIT: Just saw that the quote includes using ALTER INDEX to rebuild a clustered index. Yes, that would work too - if there is an existing clustered index to rebuild. In this case, there was only a nonclustered index, so rebuilding was not an option.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • angelo.cas (7/28/2009)


    Well I would have added and removed the clustered index. It's by far the simplest process that would have less room for fault or down time on the table.

    Why you would do all the other stuff is beyond me, just asking for trouble.

    Hi Angelo,

    Adding and removing a clustered index is the simplest, that is true. That does not automatically make it the best, and it will definitely not yield the lowest amount of down time automatically.

    Remember that nonclustered indexes include either a reference to the clustered index if there is one, or a reference to the database page otherwise. Adding a clustered index forces each nonclustered index (one in the case of this example) to be rebuilt. And that goes for removing a clustered index as well.

    Let's compare two approaches: the copy/drop/rename approach and the create/drop clustered index approach.

    Create/drop clustered index

    Creating the clustered index involves the following steps:

    * Sort data (one million rows!)

    * Copy data to new data pages

    * Free previously allocated data pages

    * Rebuild one nonclustered index

    Dropping the clustered index is cheaper:

    * Drop the index (metadata only)

    * Rebuild one nonclustered index

    Copy/drop/rename table

    Copying the data to a new table:

    * Copy data to new data pages

    * Build one nonclustered index

    Drop old table:

    * Drop table and index (metadata only)

    * Free previously allocated data pages (for index as well)

    Rename table:

    * Metadata only

    As you see, adding and removing a clustered index causes the nonclustered index to be rebuilt twice, whereas the copy/drop/rename scenarion has to build the new nonclustered only once. This can add up significantly if the table has several nonclustered indexes. Other than that, the create/drop clustered index method also incurs a sort operation that is not really required.

    I think the "best" option depends. How afraid are you of data loss due to operator error? How many nonclustered indexes are involved? How long is the time window available for the operation? If you need 24x7 availability and you have enterprise edition, you can use its online index rebuilding capabilities (though I'd have to check if they work for creating and dropping a clustered index as well) Etc, etc, etc.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • sean redmond (7/28/2009)


    Wouldn't shrinking the DB also reclaim the space lost by the dropped column?

    Frankly, I have no idea. I guess I could test but I can't really be bothered. I prefer not to shrink me databases at all, because shrinking comes with a bucketload of undesired side effects.

    See http://www.karaszi.com/SQLServer/info_dont_shrink.asp for the gory details.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • stevecable (7/31/2009)


    The answer claims "Even though TRUNCATE TABLE does remove all the data pages and could theoretically be a great time to change the metadata, it doesn't work this way (as can easily be verified by testing). "

    So, I tested truncation, and it sure seems to me like all the space is reclaimed. I get zero's across the board on data and indexes. I have to admit being baffled why truncation is not a correct option for reclaiming space. Hmmm.

    Hi Steve,

    Yes, truncating a table will free up all space allocated to it. No problem. But that is not the scenario this question was about. You have to relaod the rows that were there (without the now unused Column2). If you do that and check sp_spaceused, you'll see that the 200 bytes per row previously required for Column2 are still used for each row - in other words, after truncating and reloading the data, you still have not reclaimed any space. Run the sample below (make sure to create and populate MyTable2 first) to check:

    DROP TABLE MyTable;

    go

    CREATE TABLE MyTable

    (Column1 int NOT NULL,

    Column2 char(200) NOT NULL,

    Column3 varchar(40) NOT NULL,

    PRIMARY KEY NONCLUSTERED(Column1));

    INSERT INTO MyTable

    SELECT * FROM MyTable2

    go

    EXEC sp_spaceused MyTable;

    ALTER TABLE MyTable

    DROP COLUMN Column2;

    EXEC sp_spaceused MyTable;

    TRUNCATE TABLE MyTable;

    EXEC sp_spaceused MyTable;

    INSERT INTO MyTable(Column1, Column3)

    SELECT Column1, Column3

    FROM MyTable2;

    EXEC sp_spaceused MyTable;

    CREATE CLUSTERED INDEX xx ON MyTable(Column1)

    DROP INDEX MyTable.xx

    EXEC sp_spaceused MyTable;

    go

    You'll see that after the TRUNCATE TABLE, the space used is zero - but after the INSERT INTO, it is right back at its original size, even though Column2 has already been dropped. Only after creating and dropping a clustered index (or creating, copying, and renaming a new table) will the data size of the table be reduced.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Hugo Kornelis (8/5/2009)


    You'll see that after the TRUNCATE TABLE, the space used is zero - but after the INSERT INTO, it is right back at its original size, even though Column2 has already been dropped. Only after creating and dropping a clustered index (or creating, copying, and renaming a new table) will the data size of the table be reduced.

    That's very interesting. I would not have guessed that it would work that way. Thanks for the clear and concise explanation.

  • Cliff Jones (8/5/2009)


    Hugo Kornelis (8/5/2009)


    You'll see that after the TRUNCATE TABLE, the space used is zero - but after the INSERT INTO, it is right back at its original size, even though Column2 has already been dropped. Only after creating and dropping a clustered index (or creating, copying, and renaming a new table) will the data size of the table be reduced.

    That's very interesting. I would not have guessed that it would work that way. Thanks for the clear and concise explanation.

    My pleasure! 😉


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I'm not sure if Chad's "brutal" comment was intended as a compliment, but I'll take it as such

    Whoops - I didn't even think that might be taken as a complaint, it certainly wasn't meant that way. I was using it in the same way you might call a long, hard workout "brutal". It's good for you, you're glad it's over and you feel like you have conquered something. Thanks for assuming positive intent - there was definitely nothing negative intended there.

    This is strange. I ran some tests as well, and I found that deleting all rows does free up most space, but not all. I also found that most space is released immediately, but there's also a bit released only after a second or so. And the most surprising is that the amount of unreleased space seems to depend on various factors I could not put my finger on - at one time, I consistently was left with about 200 KB or so after deleting (with a 10,000 row test); later when I tried again (but apparently not with the exact same table and data), I was consistently left with a mere 32KB.

    Wow, that is interesting. I wonder what it is underneath that makes the difference. I'm cleaning up a lot of old data (horizontal and vertical), so this topic is quite opportune.

    However, this is a bit off-topic here, as the QotD was not about reclaiming free space after deleting all or any rows from a table, but after deleting one column without removing any row.

    Quite right. I'll take it elsewhere if I decide to pursue it. Thanks Hugo, and welcome back.

    Chad

  • Chad Crawford (8/6/2009)


    Wow, that is interesting. I wonder what it is underneath that makes the difference. I'm cleaning up a lot of old data (horizontal and vertical), so this topic is quite opportune.

    However, this is a bit off-topic here, as the QotD was not about reclaiming free space after deleting all or any rows from a table, but after deleting one column without removing any row.

    Quite right. I'll take it elsewhere if I decide to pursue it.

    Well, you know about the vertical cleanup by know. 🙂

    For the details of horizontal celanup, I'd suggest posting a question to the newsgroup microsoft.public.sqlserver.server (if you have newsgroup access). Kalen Delaney regularly reads and replies there, and if anyone knows the fine details, she does!


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • This was a good question. I would have initially ticked the truncate table option but ran some testing and research first. Though accurate, I am not sure I agree with dropping the clustered index - I think I would rather drop the nonclustered index and leave a clustered index on the table.

    Thanks for the good question.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 15 posts - 16 through 30 (of 33 total)

You must be logged in to reply to this topic. Login to reply