Shrink large database

  • I have been asked to look at a SQL 2000 database that has been neglected for some time.

    It has an audit database linked to the system, and no maintenance has been done on this database in over three years.

    The database has grown to around 322GB.

    I have started doing some maintenance on the table and have deleted over 9 million entries for just one month's worth of auditing data from back in 2010, and I will need to continue doing this on a nightly basis until we only have around 6 months' worth of audit data left.

    Since I am deleting so much data from the database, I thought it might be a good idea to shrink it. I started shrinking the database and left it to run. The next day back at work it was still running and I had complaints from users of the system. I was reluctant to stop the shrink as I felt it would finish soon. It continued to run, and the next day before work started I decided to cancel the shrink.

    Any suggestions on how to get this done more efficiently?
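
    For illustration, a batched nightly purge along these lines is a common pattern on SQL 2000 (which has no DELETE TOP, so SET ROWCOUNT does the batching); the table name, column name, and date range here are assumptions, not the actual schema:

      -- Delete one historical month in 10,000-row batches so each transaction,
      -- and the locks it holds, stays small (all object names are hypothetical).
      SET ROWCOUNT 10000

      WHILE 1 = 1
      BEGIN
          DELETE FROM dbo.AuditLog
          WHERE AuditDate >= '20100301'
            AND AuditDate <  '20100401'

          IF @@ROWCOUNT = 0 BREAK
      END

      SET ROWCOUNT 0   -- always reset, or later statements stay capped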

  • I'm curious: have your full database backups taken longer after removing/deleting all that data? We deleted terabytes of data, and even though we now have less data, our backups take longer.

    I'm considering taking all the data in the current filegroup and moving it to another filegroup. If you have a big enough window, that might be something you could do?

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • Sorry, I cannot really answer that.... as the server had been neglected, the backups appear to have been failing for some time now.

    Not sure if it helps, but I noticed that as soon as I cancelled the shrink that had been running for a couple of days, the log file backup was much bigger.

    I have been executing most of my commands against this server from a remote machine with SSMS 2008 installed. Today I decided to revisit SQL 2000 Query Analyzer and fired that up on the server in question. I ran the shrink from there, but instead of shrinking to 0% free I set it to 50% first.

    The shrink ran surprisingly quickly and freed up around 10GB. I ran another and this time set it to 40%, and this ran quickly too. I have now kicked off another delete to run over the weekend.

    I feel confident that I will manage to get the space down significantly in this way.
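
    For reference, the staged approach described above boils down to something like the following (the database and file names are assumptions); the second parameter of DBCC SHRINKDATABASE is the percentage of free space to leave, so stepping down 50, 40, 30... moves far fewer pages per pass than going straight for 0:

      -- Shrink in stages, leaving progressively less free space each time.
      DBCC SHRINKDATABASE (AuditDB, 50)
      GO
      DBCC SHRINKDATABASE (AuditDB, 40)
      GO
      -- A single file can also be shrunk to an explicit target size in MB:
      -- DBCC SHRINKFILE (AuditDB_Data, 200000)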

  • terry.home (5/17/2013)


    Sorry, I cannot really answer that.... as the server had been neglected, the backups appear to have been failing for some time now.

    Might not be a bad idea to set up some backups, especially if it's important data. You could use the SQL Server maintenance plan feature as an option.

    Not sure if it helps, but I noticed that as soon as I cancelled the shrink that had been running for a couple of days, the log file backup was much bigger.

    Thanks for the info. I've noticed that there is some log growth when running a shrink; possibly the log grew further when you stopped it? Good information. I don't use shrink often, so that last part is interesting. The documentation on shrink is not great online; I wonder if the Sybase docs would be better? Most of what's on the internet just says "don't shrink" rather than explaining the internals of how it works.

    I have been executing most of my commands against this server from a remote machine with SSMS 2008 installed. Today I decided to revisit SQL 2000 Query Analyzer and fired that up on the server in question. I ran the shrink from there, but instead of shrinking to 0% free I set it to 50% first.

    The shrink ran surprisingly quickly and freed up around 10GB. I ran another and this time set it to 40%, and this ran quickly too. I have now kicked off another delete to run over the weekend.

    I feel confident that I will manage to get the space down significantly in this way.

    How did it go with shrinking from 50% to 40%... and lower? Did the shrink get progressively slower as you got closer to the actual data size?

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • First things first: you need to get a good backup of that database. Next, run DBCC CHECKDB to ensure that, after years of neglect, it is still consistent.

    After that, if you can temporarily switch the database to the SIMPLE recovery model, your deletes will be much easier to manage: they are still logged, but the log space is reused at each checkpoint instead of growing until the next log backup.
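
    A minimal sketch of that sequence, with an assumed database name and backup path:

      -- 1. Get a known-good full backup before touching anything.
      BACKUP DATABASE AuditDB TO DISK = 'E:\Backups\AuditDB_full.bak' WITH INIT
      GO
      -- 2. Verify consistency after years of neglect.
      DBCC CHECKDB ('AuditDB')
      GO
      -- 3. Optionally switch to SIMPLE for the purge window...
      ALTER DATABASE AuditDB SET RECOVERY SIMPLE
      GO
      -- ... run the nightly deletes here ...
      -- 4. ...then switch back and take a fresh full backup to restart the log chain.
      ALTER DATABASE AuditDB SET RECOVERY FULL
      GO
      BACKUP DATABASE AuditDB TO DISK = 'E:\Backups\AuditDB_full_postpurge.bak' WITH INIT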

    Joie Andrew
    "Since 1982"

  • Thank you for the responses.

    The backups are no longer failing. They were failing due to lack of disk space, and this has now been sorted, partly by allocating more disk space and partly by shrinking the file. I also found a very old, huge backup file, which I deleted.

    Both the shrink to 50% and the shrink to 40% ran quickly, and as I am deleting more data every night (tens of millions of rows) the database is progressively getting smaller.

    Since I am deleting one month's worth of data at a time, I expect it to take another couple of weeks to get to the point where I only have 6 months' worth of data left, and I will continue to shrink to 50% as I go.

  • After doing a shrink, also rebuild your indexes, as they will probably be heavily fragmented by the shrink.

    I don't do a shrink often, but in this case it was a valid choice. Only, one should indeed start with a slight decrease, and in most cases you're looking at days of decreased performance. Also, to my understanding, a shrink cannot be cancelled, but I may be wrong. The reason it finished quickly the next time round is that all the data had already been moved, so it was simply a matter of decreasing the file size.
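
    On SQL 2000, that post-shrink rebuild would look roughly like this (the table and index names are assumptions):

      -- Check fragmentation first.
      DBCC SHOWCONTIG ('dbo.AuditLog') WITH FAST
      GO
      -- Rebuild all indexes on the table; '' means every index, 0 keeps the
      -- original fill factor.
      DBCC DBREINDEX ('dbo.AuditLog', '', 0)
      GO
      -- A lighter, online alternative that only compacts the leaf level:
      -- DBCC INDEXDEFRAG (AuditDB, 'dbo.AuditLog', 'IX_AuditLog_AuditDate')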

    The backups are no longer failing. They were failing due to lack of disk space, and this has now been sorted, partly by allocating more disk space and partly by shrinking the file. I also found a very old, huge backup file, which I deleted.

    Both the shrink to 50% and the shrink to 40% ran quickly, and as I am deleting more data every night (tens of millions of rows) the database is progressively getting smaller.

    Since I am deleting one month's worth of data at a time, I expect it to take another couple of weeks to get to the point where I only have 6 months' worth of data left, and I will continue to shrink to 50% as I go.

    This might be a moot point, since you've gone so far down this path of "shrinking" and "deleting"....

    I wonder if it's too late to add a thought. Have you considered copying the "6 months" of data you need into another table/filegroup instead of deleting tens of millions of rows in place? I'm assuming the last 6 months' set is much smaller.
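
    As a rough sketch of that idea (every name and the cutoff date here are assumptions, and the audit triggers would need a short outage while the tables are swapped):

      -- Copy only the rows you intend to keep into a new table.
      SELECT *
      INTO   dbo.AuditLog_keep
      FROM   dbo.AuditLog
      WHERE  AuditDate >= '20121201'     -- most recent 6 months
      GO
      -- Swap the tables, then recreate indexes/constraints on the new one.
      EXEC sp_rename 'dbo.AuditLog', 'AuditLog_old'
      EXEC sp_rename 'dbo.AuditLog_keep', 'AuditLog'
      GO
      -- After verifying the swap:
      -- DROP TABLE dbo.AuditLog_old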

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀

  • Not a bad suggestion. I have considered this, but not done it.

    I don't know whether the last six months' worth of data is any smaller in comparison to previous months.

    My original problem was a lack of disk space which caused performance issues. Therefore deleting and shrinking helped to free up some space and allowed the database to respond.

    I could potentially now move the last 6 months to a new file, but once this is done, I assume I will still need to get the data older than 6 months deleted.

    Am I wrong to think that I will therefore be adding another step into my process but still end up with the same result at the end of the process?

    Regards

    T

  • sqlsurfing (5/26/2013)


    The backups are no longer failing. They were failing due to lack of disk space, and this has now been sorted, partly by allocating more disk space and partly by shrinking the file. I also found a very old, huge backup file, which I deleted.

    Both the shrink to 50% and the shrink to 40% ran quickly, and as I am deleting more data every night (tens of millions of rows) the database is progressively getting smaller.

    Since I am deleting one month's worth of data at a time, I expect it to take another couple of weeks to get to the point where I only have 6 months' worth of data left, and I will continue to shrink to 50% as I go.

    This might be a moot point, since you've gone so far down this path of "shrinking" and "deleting"....

    I wonder if it's too late to add a thought. Have you considered copying the "6 months" of data you need into another table/filegroup instead of deleting tens of millions of rows in place? I'm assuming the last 6 months' set is much smaller.

    +1 to that. It might also be a good time to consider partitioning the table to make future once-per-month deletes easier.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Database partitioning is not my first thought, and I don't think partitioning worked in SQL 2000 the way we're used to now. I have cleaned out a SQL 2005 database of around 350GB, and much more important is that indexes should be present to support the delete.

    Database partitioning is high maintenance to set up and keep running. I have created and updated scripts to create these partitions, and we have yet to delete partitions.

    Negative aspects:

    - your indexes need to be partition aligned (say, if you split partitions on a monthly basis, the index needs to be partitioned on that datetime column)

    - if they're not, any move or removal may leave your database blocked for extended periods of time while data is moved and indexes are updated

    - you need to understand it thoroughly before accepting the challenge

    - applying it to an existing database, where the problem you want to solve is the speed of data movement, is not recommended. You can kill a script that deletes records; you cannot kill the data movement.

    I have a 4.3TB database which is partitioned by month. That's a valid case; I feel 350GB is not a valid case and not worthwhile. Start Profiler, run the result through the tuning wizard, and apply indexes. I did that, and while before I could only just delete the daily turnover, the script I use now deletes history at a rate of one hour per day; so my script takes an hour each day to delete the same day from 3 months ago. It should be possible.
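
    For anyone following along on SQL 2005 or later, the "partition aligned" point above looks roughly like this (all names here are hypothetical); both the table and its index live on the same partition scheme, keyed on the datetime column:

      -- Monthly boundaries; RANGE RIGHT puts each boundary date in the later partition.
      CREATE PARTITION FUNCTION pfAuditMonth (DATETIME)
      AS RANGE RIGHT FOR VALUES ('20130101', '20130201', '20130301');
      GO
      CREATE PARTITION SCHEME psAuditMonth
      AS PARTITION pfAuditMonth ALL TO ([PRIMARY]);
      GO
      CREATE TABLE dbo.AuditLog (
          AuditID   INT IDENTITY(1,1) NOT NULL,
          AuditDate DATETIME NOT NULL,
          AuditData VARCHAR(8000) NULL
      ) ON psAuditMonth (AuditDate);
      GO
      -- Aligned index: same scheme, same partitioning column, so whole months can
      -- later be switched out without moving rows one by one.
      CREATE CLUSTERED INDEX IX_AuditLog_AuditDate
          ON dbo.AuditLog (AuditDate) ON psAuditMonth (AuditDate);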

  • pnauta (5/28/2013)


    Database partitioning is not my first thought, and I don't think partitioning worked in SQL 2000 the way we're used to now. I have cleaned out a SQL 2005 database of around 350GB, and much more important is that indexes should be present to support the delete.

    Database partitioning is high maintenance to set up and keep running. I have created and updated scripts to create these partitions, and we have yet to delete partitions.

    Negative aspects:

    - your indexes need to be partition aligned (say, if you split partitions on a monthly basis, the index needs to be partitioned on that datetime column)

    - if they're not, any move or removal may leave your database blocked for extended periods of time while data is moved and indexes are updated

    - you need to understand it thoroughly before accepting the challenge

    - applying it to an existing database, where the problem you want to solve is the speed of data movement, is not recommended. You can kill a script that deletes records; you cannot kill the data movement.

    I have a 4.3TB database which is partitioned by month. That's a valid case; I feel 350GB is not a valid case and not worthwhile. Start Profiler, run the result through the tuning wizard, and apply indexes. I did that, and while before I could only just delete the daily turnover, the script I use now deletes history at a rate of one hour per day; so my script takes an hour each day to delete the same day from 3 months ago. It should be possible.

    Partitioning works just fine in SQL Server 2000. Look up "Partitioned Views" and you'll see.

    You're correct about the initial setup being a bit complex, especially if you have an IDENTITY column in the table, but it's worth the effort even on a 350GB database, especially when you consider things such as backup and restore times.

    Consider the following... will you ever update an audit table? If it's truly an audit table, the answer should be "NO". That means that you have (in this particular case) 6 months of data that won't ever change. Only the "current month" will change.

    If you use a "Partitioned View" of the table (unlike a "Partitioned Table"), you can store those older, static months in a separate database, which has several advantages. First, that database can be set to the "SIMPLE" recovery model so that there's no need to back up its log. That also decreases the log backup time on the "main" database where the "current month" is stored. It also means that you only have to back up the separate database once per month, when you add a new month's worth of data to it, and that a restore would take much less time. During a DR restore, you're not so much concerned with old log data as you are with getting the database back up and running so you can return to normal business. Properly indexing such a thing is a trivial matter.

    As a bit of a sidebar, remember that if the "current month" table has an IDENTITY column, you won't be able to do a direct insert into the view. Again, that's trivial, because you can insert directly into the "current month" table instead of trying to insert into the "Partitioned View". It means changing the target of the audit triggers that are currently in place but, again, that's a trivial change compared to the benefits.
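
    A stripped-down sketch of what is described above might look like the following; the database, table, and column names are assumptions, and only two archived months are shown:

      -- The current month stays in the main database; the CHECK constraint is what
      -- lets the optimizer skip this table for old-date queries.
      CREATE TABLE dbo.Audit_Current (
          AuditID   INT IDENTITY(1,1) NOT NULL,
          AuditDate DATETIME NOT NULL
              CONSTRAINT CK_Audit_Current CHECK (AuditDate >= '20130601'),
          AuditData VARCHAR(8000) NULL
      )
      GO
      -- Closed months live as static tables in a separate SIMPLE-recovery database
      -- (AuditArchive), each with its own date-range CHECK constraint.
      CREATE VIEW dbo.AuditLog
      AS
      SELECT AuditID, AuditDate, AuditData FROM AuditArchive.dbo.Audit_201304
      UNION ALL
      SELECT AuditID, AuditDate, AuditData FROM AuditArchive.dbo.Audit_201305
      UNION ALL
      SELECT AuditID, AuditDate, AuditData FROM dbo.Audit_Current
      GO
      -- As noted above, the audit triggers write to dbo.Audit_Current directly,
      -- which sidesteps the IDENTITY restriction on inserting through the view.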

    The code to do all of the above is relatively trivial and can be done as a "set it and forget it" scheduled job.

    So far as your comment on being able to stop deletes but not data movement, I'd say that just doesn't matter, because once you set up the system to maintain the partitioning, it's just not going to matter at all. You are, in fact, supposed to test such things before you implement them to ensure that they not only work correctly but also have the necessary error checking to prevent any data loss.

    So far as "350GB is not a valid case and not worthwhile" goes, that's likely an incorrect assumption, but it does need to be evaluated, especially when it comes to backup space and the amount of time needed for a DR restore. If the largest database a company has is only 350GB, they're probably not set up with huge reserves of disk and tape backup space. Most companies just won't spend the money on it because, for example, buying and powering up 4TB of disk space and a relatively large number of backup tapes doesn't make sense for a 350GB database when technology changes so quickly (there's no sense buying lots of hardware that will go out of date without ever being used). All of that usually makes such partitioning very worthwhile.

    To summarize, the investigation and planning to pull off partitioning in the manner that I've just recommended would only take a day or two to come up with a rock solid plan. The coding for the automated monthly partition "moves" is trivial. The coding to use the new partitioned view is also trivial because, if you plan it correctly, the view will be named the same as the original monolithic audit table and no front end changes will be required. Only the triggers that put the audit data into the "current month" table would need to be changed and that's also a trivial change.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    I still think we're talking about different things here. I don't see how partitioned views in SQL 2000 are going to help (I have experience with partitioned tables in SQL 2005 and beyond, which make it easy to switch out a partition based on date), but if the TS is an expert on SQL 2000 I will not hold him back. I still maintain that, to delete info, it's worth his while to check whether indexes are in place to aid in doing that. Furthermore, he mentions years of neglect and failing backups, so chances are there were no index rebuild tasks scheduled either. So I always try to start from there instead of adding complexity. And I would never recommend going into the SIMPLE recovery model, as log backups can be taken while the delete job runs to free up space in the log. I would always want to have a full DR path so that I can do a point-in-time restore.
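
    In concrete terms, that means something like the following (the names and the backup path are assumptions):

      -- An index on the purge column lets each delete batch seek to its date range
      -- instead of scanning the whole audit table.
      CREATE INDEX IX_AuditLog_AuditDate ON dbo.AuditLog (AuditDate)
      GO
      -- Staying in FULL recovery, back the log up between delete batches so the log
      -- file is reused rather than growing, while keeping point-in-time restore.
      BACKUP LOG AuditDB TO DISK = 'E:\Backups\AuditDB_log.trn'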

    Regards,

    Peter

  • terry.home (5/27/2013)


    Not a bad suggestion. I have considered this, but not done it.

    I don't know whether the last six months' worth of data is any smaller in comparison to previous months.

    My original problem was a lack of disk space which caused performance issues. Therefore deleting and shrinking helped to free up some space and allowed the database to respond.

    I could potentially now move the last 6 months to a new file, but once this is done, I assume I will still need to get the data older than 6 months deleted.

    Am I wrong to think that I will therefore be adding another step into my process but still end up with the same result at the end of the process?

    Regards

    T

    Hi Terry,

    Sorry late reply!

    Jeff Moden brought up a great point: you could possibly use partitioning as a method. Have you looked into that before?

    But it may well be adding more complexity than you really want. It might be just as easy to schedule some type of delete for records older than 6 months once you've already trimmed it down, depending on how large the table still is and on the performance of the delete operations. It really would depend on your situation, how much downtime you can have, and the impact on the business/users.
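
    If that route is taken, a nightly job could call something along these lines (a sketch only; the procedure, table, and column names and the 6-month cutoff are assumptions):

      -- Rolling retention: delete anything older than 6 months, in small batches.
      CREATE PROCEDURE dbo.usp_PurgeOldAudit
      AS
      BEGIN
          DECLARE @cutoff DATETIME
          SET @cutoff = DATEADD(MONTH, -6, GETDATE())

          SET ROWCOUNT 10000
          WHILE 1 = 1
          BEGIN
              DELETE FROM dbo.AuditLog WHERE AuditDate < @cutoff
              IF @@ROWCOUNT = 0 BREAK
          END
          SET ROWCOUNT 0
      END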

    --------------------------------------------------
    ...0.05 points per day since registration... slowly crawl up to 1 pt per day hopefully 😀
