DB Size

  • Hi everyone

    I am doing some test development in SS19 and the file size looks way too big given the size of the test data.  The test data I uploaded is at most a few hundred MB, yet when I checked the file usage it shows GB, which can't be right.  I know Access has an annoying feature where the DB size doesn't get reduced when data is deleted; I had to manually compact the Access DB to get the file size down to a normal value.  I am still fairly new to SS.  Does anyone know why my SS DB is so large?  Please let me know what info you need from me to help you help me.

    Total Space Reserved 25.70 GB

    Data Files Space Reserved 11,080.00 MB

    Transaction Log Space Reserved 15,240.00 MB

    Thank you

  • It sounds to me like the database was defined that large to begin with.  And, no... SQL Server doesn't reduce the file size when data is deleted, either.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Can I change it?  I would like it to grow as the data gets larger, not start off so big when the data size doesn't justify the db size.

  • It also looks like you are not performing regular, frequent transaction log backups.  In SQL Server, when the database is not in the simple recovery model, you must take frequent transaction log backups or the transaction log will just continue growing.

    Once you have those set up and running, you can perform a one-time shrink of both the data file and the log file.  Before you do that, right-click on the database in Object Explorer, select Reports, and then select the Disk Usage report.  See how much space is actually being used in both the data file and the log file.

    Once you know how much space is actually used - you can then perform a DBCC SHRINKFILE(logical file, size), where logical file is the logical file name or number (USE {your database here}; SELECT * FROM sys.database_files;) and size is the desired size in MB.  Make that size a bit larger than the used space - rounded up to an even multiple of the autogrowth setting.

    Check the autogrowth setting for the files - if they are set to percentage, change them to a fixed size.
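
    Putting the steps above together, here is a hedged sketch.  The database and logical file names (YourDb, YourDb_Data, YourDb_Log) and the target sizes are placeholders; check sys.database_files for your own names and pick targets a bit above your actual used space.

```sql
-- Assumes a database named YourDb; substitute your own names throughout.
USE YourDb;

-- 1. Logical file names and current sizes (size is reported in 8 KB pages).
SELECT name, type_desc, size * 8 / 1024 AS size_mb
FROM sys.database_files;

-- 2. How much of each file is actually used.
SELECT name, FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024 AS used_mb
FROM sys.database_files;

-- 3. One-time shrink to a target slightly above the used space, rounded up
--    to a multiple of the autogrowth increment (sizes here are examples).
DBCC SHRINKFILE (YourDb_Data, 5000);  -- target in MB
DBCC SHRINKFILE (YourDb_Log, 1000);
```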

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If your tables don't have a clustered index (i.e., they are heaps), the space might not be released right away:

    https://www.sqlskills.com/blogs/paul/sqlskills-sql101-why-does-my-heap-have-a-bunch-of-empty-pages/

  • No, SQL Server itself does not reduce file size. SQL Server is a database management system that allows you to store and retrieve data in a structured manner.

    However, there are ways to reduce the size of a database or individual tables within a database, such as:

    Removing unnecessary data: You can delete redundant or outdated records to reduce the size of a table.

    Compressing data: Some databases have built-in compression options that can reduce the size of the data stored within them.

    Normalizing data: Normalizing the data within a database can also reduce its size by reducing duplicated data.

    Optimizing indexes: Reorganizing or rebuilding indexes can improve the performance of a database and potentially reduce its size.

    Archiving data: Moving older data that is rarely accessed to a separate, smaller database or file can help reduce the size of the main database.

    It is important to note that reducing the size of a database can have trade-offs, such as increased complexity or slower performance in certain scenarios. Before making any changes to your database, it is important to understand the potential impact and to thoroughly test the changes in a safe and controlled environment.
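
    As an illustration of the compression option above, a hedged sketch; the schema and table name (dbo.BigTable) are placeholders, and page compression is only one of the available options:

```sql
-- Estimate the savings first with the built-in procedure,
-- then rebuild the (hypothetical) table with page compression.
EXEC sp_estimate_data_compression_savings
     @schema_name = N'dbo', @object_name = N'BigTable',
     @index_id = NULL, @partition_number = NULL,
     @data_compression = N'PAGE';

ALTER TABLE dbo.BigTable REBUILD WITH (DATA_COMPRESSION = PAGE);
```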

  • Prepping non-production environments can be a lot of work. Script the heck out of it. And yeah, there is a command you can use, SHRINK. I don't recommend it for production systems (OK, maybe as a one-off after massive data migrations or something, but not those silly/dangerous scripts that run it once a day or more). However, running it in a non-production environment makes perfect sense.

    So you can do something like this:

    • Restore the production database to a non-production environment.
    • Clean out sensitive data from production through some scripted process (updates, deletes, a data masking tool, your call).
    • After that's done, rebuild the indexes (eliminates wasted space).
    • Shrink the database.
    • Backup the clean, shrunk, database and use this in your non-production environments.

    That's the way I've done it, with variations, for years. It works.
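
    A hedged sketch of that cycle; all names and paths (DevCopy, the backup shares, the logical file names) are placeholders, and sp_MSforeachtable is an undocumented but widely used procedure:

```sql
-- 1. Restore the production backup into a non-production copy.
RESTORE DATABASE DevCopy
FROM DISK = N'\\backups\Prod_full.bak'
WITH MOVE N'Prod_Data' TO N'D:\Data\DevCopy.mdf',
     MOVE N'Prod_Log'  TO N'D:\Log\DevCopy.ldf',
     REPLACE;

-- 2. Clean out sensitive data (your masking/delete scripts go here).

-- 3. Rebuild the indexes to eliminate wasted space, then shrink once.
USE DevCopy;
EXEC sp_MSforeachtable N'ALTER INDEX ALL ON ? REBUILD';  -- undocumented proc
DBCC SHRINKDATABASE (DevCopy);

-- 4. Back up the clean, shrunk copy for reuse in other environments.
BACKUP DATABASE DevCopy TO DISK = N'\\backups\DevCopy_clean.bak' WITH INIT;
```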

  • Talha222 wrote:

    Optimizing indexes: Reorganizing or rebuilding indexes can improve the performance of a database and potentially reduce its size.

    I strongly recommend against the use of Reorganize.  It simply doesn't do what most people think it does, and its cost in the log file can literally be two orders of magnitude larger, all while perpetuating fragmentation.  And, if you don't rebuild indexes correctly, you end up causing just as much damage (which can be huge) as when you use Reorganize, and more.

    I'll also tell you that the supposed "Best Practices" that about 98% of the world is using for Index Maintenance isn't actually a "Best Practice", was never meant to be a "Best Practice" and, except for a really tiny subset of indexes, is actually a "Worst Practice".

    I have a video on the subject that proves, with code, that all of that and more is true and clearly demonstrates what Reorganize actually does.  Watch it past the Q'n'A where there's a super interesting out-take, as well.

    https://www.youtube.com/watch?v=rvZwMNJxqVo

    Also, if you're listening with a headset, be aware that they added 3 advertisements at 15:00, 30:15, and 45:25.  They're sudden and they're loud.

    --Jeff Moden



  • thanks everyone.  i will review and get back to you if I have any questions

  • water490 wrote:

    thanks everyone.  i will review and get back to you if I have any questions

    If you do a shrink, make sure that you do a check on fragmentation both before and after.  Anything that becomes more fragmented after the shrink, has suffered fragmentation due to "Index Inversion".  That's one form of fragmentation that WILL cause problems with scans and range scans.  It still won't affect single row OLTP though.
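
    A hedged sketch of the before/after check, using the current database; 'LIMITED' mode is the cheapest scan and the heap filter is a judgment call:

```sql
-- Run this before and after the shrink and compare
-- avg_fragmentation_in_percent for each index.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.index_id > 0   -- skip heaps
ORDER BY ips.avg_fragmentation_in_percent DESC;
```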

    --Jeff Moden



  • The db is 11,080.00 MB and the log is 17,032.00 MB.  Why do I need the log file?  My DB is backed up, so if something happens to the tables or stored procedures I just restore from backup.  I don't really see the need for the log file, but I am sure I am missing something important with my assumption.  Here is the query I am using to see all of this:

    SELECT
        mdf.database_id,
        mdf.name,
        mdf.physical_name AS data_file,
        ldf.physical_name AS log_file,
        db_size_MB  = CAST((mdf.size * 8.0) / 1024 AS DECIMAL(8,2)),
        log_size_MB = CAST((ldf.size * 8.0) / 1024 AS DECIMAL(8,2))
    FROM (SELECT * FROM sys.master_files WHERE type_desc = 'ROWS') mdf
    JOIN (SELECT * FROM sys.master_files WHERE type_desc = 'LOG')  ldf
        ON mdf.database_id = ldf.database_id;

     

  • The log file is used to keep a record of completed and pending transactions until the log has been backed up. This makes it possible to redo all necessary transactions since the last full/differential backup, so you can get as close as possible to the state the database was in at the moment of your choosing.

    If you don't need full-recovery logging, you can choose the simple recovery model, which only keeps log records for the current transactions.

    Just restoring the .mdf/.ldf files won't help unless the database was closed/offline when you copied them.

  • May I recommend some reading on what the transaction log does and how it impacts things like backups?

    https://www.sqlservercentral.com/books/sql-server-transaction-log-management-by-tony-davis-and-gail-shaw

    https://www.sqlservercentral.com/books/sql-server-backup-and-restore

    https://www.sqlservercentral.com/books/troubleshooting-sql-server-a-guide-for-the-accidental-dba

     

    The log isn't just used for backup purposes; it plays a key and crucial role in the day-to-day operations of the database itself.

    While you may not need to back it up, regardless of the recovery model, it is still very much heavily used to keep the database running.

    Additionally, you have to ask yourself: what is the RTO and RPO of this particular database, of the wider set of databases on the instance, and then of the actual server itself?

    If you can afford to lose 24 hours worth of changes (RPO of 24hrs), then yes you could get away with switching the recovery model to simple, and doing a daily FULL backup.

    However, if your RPO is, say, 15 minutes, then doing a full backup every 15 minutes is just not feasible in most situations.  So you need to look at doing log backups to meet that 15-minute requirement.
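
    To make the two options concrete, a hedged sketch; the database name and backup paths are placeholders, and in practice the log backup would be scheduled (e.g., via SQL Agent) rather than run by hand:

```sql
-- Option A: a 24-hour RPO is acceptable.
-- Simple recovery model plus one daily full backup.
ALTER DATABASE YourDb SET RECOVERY SIMPLE;
BACKUP DATABASE YourDb TO DISK = N'D:\Backups\YourDb_full.bak' WITH INIT;

-- Option B: a 15-minute RPO.
-- Full recovery model plus log backups every 15 minutes.
-- Note: after switching to FULL, the log chain only starts
-- once the next full backup has been taken.
ALTER DATABASE YourDb SET RECOVERY FULL;
BACKUP DATABASE YourDb TO DISK = N'D:\Backups\YourDb_full.bak' WITH INIT;
BACKUP LOG YourDb TO DISK = N'D:\Backups\YourDb_log.trn';
```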

     

    As I say, there is more to the log than just backups; hopefully the reading material above will shine some light for you on the inner workings of the log file.
