The Case of the Shrinking CFO, err Database

  • andre.quitta (8/17/2015)


    You may also free up some space with a more aggressive archiving approach to your data, or by introducing archiving if you don't have it at all.

    I've seen a number of legacy systems with up to 10 years' worth of data. This may take a bit longer, but it will make the system run faster with less space required. I used to start out with the question of whether we needed invoices from the previous decade readily available. The reality is that an invoice is often not needed after 15 months (comparing this quarter to the same quarter last year).

    These solutions will take more time to put in place than the one you proposed, but they will help with long-term strategy.

    If there is not enough disk space for the existing data, where will the archive data be stored?

    Wouldn't that be moving the problem from one place to another?

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • You're right. I should have expanded on this. The archive data would be exported to a database on another server. If that is not possible, then it would be exported to a file, to be reimported into another database as needed.

    When given the time, I have recommended in the past using a separate data store for reporting, with the data grouped and optimized for reports (a sort of poor man's data warehouse). They're not that hard to set up, and they run quickly since there is little need for joins. The approach relies on the idea that legacy data is fairly static.
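
    For illustration, here is a minimal T-SQL sketch of that archive-and-purge step, assuming an archive database reachable through a linked server. The server, table, and column names (ARCHIVESRV, ArchiveDB, dbo.Invoices, InvoiceDate) are all hypothetical, and a production version would batch the DELETE:

    ```sql
    -- Hypothetical sketch: copy invoices older than 15 months to an
    -- archive database on another server (linked server ARCHIVESRV),
    -- then purge them from the source. All object names are assumptions.
    INSERT INTO ARCHIVESRV.ArchiveDB.dbo.Invoices
    SELECT *
    FROM dbo.Invoices
    WHERE InvoiceDate < DATEADD(MONTH, -15, GETDATE());

    DELETE FROM dbo.Invoices
    WHERE InvoiceDate < DATEADD(MONTH, -15, GETDATE());

    -- If no second server is available, export to a file instead, e.g.:
    -- bcp MyDB.dbo.Invoices out D:\Archive\Invoices.dat -n -T
    ```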

  • Michael L John (8/17/2015)


    andre.quitta (8/17/2015)


    You may also free up some space with a more aggressive archiving approach to your data, or by introducing archiving if you don't have it at all.

    I've seen a number of legacy systems with up to 10 years' worth of data. This may take a bit longer, but it will make the system run faster with less space required. I used to start out with the question of whether we needed invoices from the previous decade readily available. The reality is that an invoice is often not needed after 15 months (comparing this quarter to the same quarter last year).

    These solutions will take more time to put in place than the one you proposed, but they will help with long-term strategy.

    If there is not enough disk space for the existing data, where will the archive data be stored?

    Wouldn't that be moving the problem from one place to another?

    Tapes/CD/DVD

  • Thanks for the comments.

    In this case the data was ancient and unread, so removing the rows was necessary, and doing so produced a significant performance improvement. Alas, there was no budget for IT at the time, but the need for disk space was critical.

  • Get a new CFO, I guess.

  • At the time the IT budget was non-existent. It happens, and DBAs need tools to shrink databases in times like that. The tone was meant to be humorous. BTW, the CFO approved the article. :-)

  • No business impact at all - I ran this through the middle of the business day with no lock issues. Give it a try and let me know.

  • Edward.Polley 76944 (8/17/2015)


    No business impact at all - I ran this through the middle of the business day with no lock issues. Give it a try and let me know.

    Guess the age-old DBA answer of "it depends" applies here; it's certainly specific to your operating environment. In my case, the databases here are generally 2-5TB in size, so anything like that is certainly very intensive, and any change like that requires customer messaging, planned maintenance windows, and so on.

  • Jesus, this article is downright dangerous. Follow it at your own peril, and only after making sure you have an updated resume.

    To start, claiming that shrinkfile magically causes contention correlated with how many pages are in your database is ridiculous; remember, the article quotes 50,000 pages, or 400MB. Shrinkfile only locks one page at a time as it is moved, not permanently, and I've seen it run on extremely busy databases for hours at a time with zero impact. There may be some workload out there that won't tolerate it, but for our enterprise we consider it a safe operation when required.

    However, it's the proposed solution that gets me.

    A listed procedure that will take out shared locks on an entire table for hours while selecting the data into another table, kill any running processes during the switchover (stored procedures with recompiled portions, and any updates in snapshot isolation), lose any data updated or inserted between the two steps, and rely on further scripting to make sure no permissions are lost... that's a real liberal definition of "no business impact"!!!!

    Pretty sure I know which option the CFO is going to go for.

    This aside:

    BenWard (8/17/2015)


    I feel that the number of situations in which this will help is quite limited, though... How many people have 120GB databases that really only need 20GB of space?

    I can answer this. An enterprise solution which does constant ETL and stores some blobs in a heap table was soon consuming 150GB. It took me a long while to work out what was going on (there were a lot of red herrings) and to notify the vendor, who had no idea it could happen and frankly wasn't very interested in fixing it. It now has a weekly maintenance window to add and then remove a clustered index, which shrinks it back down to 20GB.
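
    For anyone curious what that weekly step looks like, here is a minimal T-SQL sketch of the add/remove pattern; the table, column, and index names are hypothetical. Building the clustered index rewrites the bloated heap compactly, and dropping it returns the table to a heap:

    ```sql
    -- Hypothetical sketch of the weekly maintenance window step.
    -- Creating the clustered index rebuilds the heap and reclaims the
    -- dead space; dropping it afterwards restores the original heap.
    CREATE CLUSTERED INDEX CIX_BlobStage_Id
        ON dbo.BlobStage (Id);
    GO

    DROP INDEX CIX_BlobStage_Id ON dbo.BlobStage;
    GO
    ```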

  • A better solution (it can even be done in the GUI if you don't want to write code :-D):

    - add a new file to the filegroup

    - run DBCC SHRINKFILE on the original file with the EMPTYFILE option -> this will move all used pages from the original file to the new one

    - now you can either drop the original file

    - or (if you prefer to keep using it, e.g. because DBCC reported that it could not move all the data) run a regular SHRINKFILE on the original again (the file size will not be reduced if DBCC raises that error)

    - run DBCC SHRINKFILE with EMPTYFILE on the new file -> this moves the data back to the original file

    - drop the new file

    - repeat if you have more than one filegroup and need to shrink the others too

    Used this method a few minutes ago to shrink my msdb from about 2 GB to 400 MB.
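
    As a rough illustration, here is a T-SQL sketch of that sequence, assuming a database named MyDB whose original data file has the logical name MyDB_Data; all names, paths, and sizes are hypothetical:

    ```sql
    -- 1) Add a temporary file to the same filegroup.
    ALTER DATABASE MyDB
        ADD FILE (NAME = MyDB_Temp, FILENAME = 'D:\Data\MyDB_Temp.ndf')
        TO FILEGROUP [PRIMARY];

    USE MyDB;

    -- 2) Move all used pages off the original file.
    DBCC SHRINKFILE (MyDB_Data, EMPTYFILE);

    -- 3) Either drop the now-empty original file...
    -- ALTER DATABASE MyDB REMOVE FILE MyDB_Data;
    -- ...or, to keep it, release the freed space with a regular shrink
    --    (e.g. to a 400 MB target):
    DBCC SHRINKFILE (MyDB_Data, 400);

    -- 4) If keeping the original, move the pages back and drop the
    --    temporary file.
    DBCC SHRINKFILE (MyDB_Temp, EMPTYFILE);
    ALTER DATABASE MyDB REMOVE FILE MyDB_Temp;
    ```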

  • Edward,
    You are lucky to have space on the I: drive. What would you do if there were no space for the second filegroup?

  • I would also think FK constraints, etc., would greatly complicate this solution. You need to be empowered to do your job, and if they want space managed efficiently, that means you need to be able to implement a management-approved policy where you can purge data in a prioritized fashion. You might want to consider looking for another employer; it sounds like you are being set up to fail. That said, thanks for taking the time to write this article. It's still good information to be aware of.

  • Cody Konior - Wednesday, August 19, 2015 8:52 AM

    Jesus, this article is downright dangerous. Follow it at your own peril, and only after making sure you have an updated resume.

    To start, claiming that shrinkfile magically causes contention correlated with how many pages are in your database is ridiculous; remember, the article quotes 50,000 pages, or 400MB. Shrinkfile only locks one page at a time as it is moved, not permanently, and I've seen it run on extremely busy databases for hours at a time with zero impact. There may be some workload out there that won't tolerate it, but for our enterprise we consider it a safe operation when required.

    However, it's the proposed solution that gets me.

    A listed procedure that will take out shared locks on an entire table for hours while selecting the data into another table, kill any running processes during the switchover (stored procedures with recompiled portions, and any updates in snapshot isolation), lose any data updated or inserted between the two steps, and rely on further scripting to make sure no permissions are lost... that's a real liberal definition of "no business impact"!!!!

    Pretty sure I know which option the CFO is going to go for.

    This aside:

    BenWard (8/17/2015)


    I feel that the number of situations in which this will help is quite limited, though... How many people have 120GB databases that really only need 20GB of space?

    I can answer this. An enterprise solution which does constant ETL and stores some blobs in a heap table was soon consuming 150GB. It took me a long while to work out what was going on (there were a lot of red herrings) and to notify the vendor, who had no idea it could happen and frankly wasn't very interested in fixing it. It now has a weekly maintenance window to add and then remove a clustered index, which shrinks it back down to 20GB.

    Why not just leave the Clustered Index?  Even in the Simple Recovery Model, changing a HEAP to a Clustered Table causes the HEAP to be preserved until the Clustered Index is committed.  When you drop the Clustered Index, the same happens in reverse.  It's a waste of time, resources, and disk space to create and then drop a Clustered Index.  If you also have any non-clustered indexes, those will get hammered, as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Three different points.
    1.) It is not the DBA's job to convince the CFO, and that approach is ultimately doomed. The DBA should be working with the database's business-side stakeholders (HR, Sales, Manufacturing, ...). Let them work the budget issues with the CFO. You may even find that the business no longer needs the database.

    2.) On the approach to "moving" the tables and indexes: I inherited a database with all objects in a single file in the PRIMARY filegroup. As the database approached 600GB, we faced the physical limits of the then-available disks.

    After adding disks, filegroups, and files, I moved selected tables and indexes using index maintenance tools. Non-clustered indexes were moved by rebuilding them on new filegroups, and clustered indexes were moved with "Create - Drop Existing" (a sketch follows this post).

    3.) The CTO will have to go back to school. You can't expect to perform major maintenance actions with "no performance impact." You can minimize and spread out the impact, but you can't prevent it.
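
    A minimal T-SQL sketch of those moves, with hypothetical table, index, column, and filegroup names:

    ```sql
    -- Move a non-clustered index by rebuilding it on the new filegroup:
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
        ON dbo.Orders (CustomerId)
        WITH (DROP_EXISTING = ON)
        ON [FG_Archive];

    -- Move a clustered index (and with it the table's data) the same way:
    CREATE UNIQUE CLUSTERED INDEX CIX_Orders
        ON dbo.Orders (OrderId)
        WITH (DROP_EXISTING = ON)
        ON [FG_Archive];
    ```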

  • Ray Herring - Wednesday, January 3, 2018 9:57 AM

    Three different points.
    1.) It is not the DBA's job to convince the CFO, and that approach is ultimately doomed. The DBA should be working with the database's business-side stakeholders (HR, Sales, Manufacturing, ...). Let them work the budget issues with the CFO. You may even find that the business no longer needs the database.

    2.) On the approach to "moving" the tables and indexes: I inherited a database with all objects in a single file in the PRIMARY filegroup. As the database approached 600GB, we faced the physical limits of the then-available disks.

    After adding disks, filegroups, and files, I moved selected tables and indexes using index maintenance tools. Non-clustered indexes were moved by rebuilding them on new filegroups, and clustered indexes were moved with "Create - Drop Existing."

    3.) The CTO will have to go back to school. You can't expect to perform major maintenance actions with "no performance impact." You can minimize and spread out the impact, but you can't prevent it.

    I have to disagree with the first point. HR, Sales, and Manufacturing have no clue in the areas we speak of. Their discussions may actually hurt the cause if they're not properly prepped. If there's no CIO and no infrastructure group, then the DBA is going to have to be the one to produce and deliver the convincing argument.

    On that third point, it sounds like the CTO doesn't actually need to go "back to school" for this problem and was directing an under-prepared DBA to come up with a better plan than just throwing hardware at stuff.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
