Database Data File Physical Fragmentation From Small Auto-Growth Setting

  • I've placed this post in the SS2005 Performance Tuning section of the forum because it was totally ignored in the 2012 section.

    The server is SS2012.

    ________________

    To All,

    I need some advice. I've just started a new job and one of the first problems I've encountered is that a 365 GB database has auto-grown 1 MB at a time since the day it was created.

    Three years ago, the database was on direct-attached storage. Then it was moved to a Dell-Compellent SAN, a system for which I am a certified administrator. The migration involved a database backup and a restore to the SAN, which eliminated the physical fragmentation. But after that, it kept auto-growing in 1 MB increments. So it probably has at least 100,000 to 200,000+ 1 MB file fragments scattered across multiple disk drives.

    I am going to review the performance implications of this condition with Dell-Compellent.

    In the meantime, I'd like to hear from any of the community on this forum:

    1. If you've had this situation, how did you respond to it?

    2. Is the potential for server performance degradation reduced if the database data file is on a SAN?

    3. If I must take action to rectify this condition, what are my best options?

    On the last question, I'm considering 3 options:

    1. A full backup and restoration of the database (a rough T-SQL sketch of this option follows the list below):

    1a. Backup the database.

    1b. Take the database out of production.

    1c. Delete the database.

    1d. Recreate the database with appropriate initial and auto-growth settings.

    1e. Restore the database.

    1f. Reconnect the Login and User SIDs.

    1g. Resume production.

    2. Take a SAN-based snapshot of the database and use it to replace the existing database. (I don't think this will produce the desired results and I'm going to review it with Dell-Compellent.)

    3. File defragmentation software. Microsoft offers a tool, Contig.exe version 1.7, that is specifically designed to turn physically fragmented files into contiguous files: http://technet.microsoft.com/en-us/sysinternals/bb897428.aspx. It sounds promising.
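
    For reference, a rough T-SQL sketch of option 1 is below. The database name, logical file names, paths, and growth sizes are made up for illustration; I would test something like this on a non-production copy first.

        -- 1a. Back up the database (assumes a database named BigDB and a backup path that exists).
        BACKUP DATABASE BigDB
            TO DISK = N'X:\Backups\BigDB_full.bak'
            WITH CHECKSUM, STATS = 10;

        -- 1e. Restore it. RESTORE rewrites the data files, so a target drive with plenty of
        --     contiguous free space gives each file a chance to be laid down in one piece.
        RESTORE DATABASE BigDB
            FROM DISK = N'X:\Backups\BigDB_full.bak'
            WITH MOVE N'BigDB_Data' TO N'E:\Data\BigDB.mdf',
                 MOVE N'BigDB_Log'  TO N'F:\Log\BigDB.ldf',
                 REPLACE, CHECKSUM, STATS = 10;

        -- 1d. Set sensible auto-growth afterwards (the restored files inherit the old settings
        --     from the backup) so the 1 MB growth pattern does not come back.
        ALTER DATABASE BigDB MODIFY FILE (NAME = N'BigDB_Data', FILEGROWTH = 512MB);
        ALTER DATABASE BigDB MODIFY FILE (NAME = N'BigDB_Log',  FILEGROWTH = 512MB);

        -- 1f. Re-map any orphaned users to their logins.
        USE BigDB;
        EXEC sp_change_users_login 'Report';            -- list orphaned users
        ALTER USER SomeUser WITH LOGIN = SomeLogin;     -- repeat per orphaned user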

    I would be very interested in your responses, especially if you've actually had to solve a similar problem.

  • Wow. Not a single response.

    My post was either a really good one or a really bad one.

  • I only considered doing this once but never had to carry it out (we ended up migrating to new servers), so I was curious about the responses. I was watching this thread (and the one in the 2012 forum) to see if anyone would jump in, but for lack of an expert wanting to weigh in, I'll throw in my lowly two cents.

    When I was considering it my plan was to:

    1. run CHECKDB

    2. take backups

    3. stop the service

    4. test the restores somewhere

    5. move all the database files to another drive available on the same server (no network involved)

    6. reformat the drive (you could also defrag the drive here if reformatting would cause other pain)

    7. bring all database files back from temporary drive location

    8. start the service

    9. run CHECKDB

    This was only an option because I had the maintenance window and the drive space to do it, and there were many databases on the instance that had been growing 1 MB per grow operation for several years, so you can imagine the physical fragmentation that was in play. Moving files is risky, though. I would have had good backups, but it could have made for a huge setback if one of my files became corrupt during the process.

    You only have one file you're interested in defragging, so if I had to choose a place to start I would either try the approach I outlined (if you have the space and time) or try your option 3 with the Sysinternals defragger. I haven't used it, but that group puts out some superior tools, so I would have no qualms about trying it. I would definitely run CHECKDB before (as my control group) and after (as my compare group) just to make sure the process was not destructive to any bytes in the file. Of course, it should go without saying that taking good backups (and testing that they are valid) is highly important before attempting anything like this.
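
    If it helps, the CHECKDB bracketing I had in mind is nothing fancy; the database name below is just a placeholder:

        -- Before touching anything: full integrity check as the "control group".
        DBCC CHECKDB (N'BigDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;

        -- ... stop the service, move the files / defrag the drive / copy them back, start the service ...

        -- After the files are back in place: repeat the check and compare the output.
        DBCC CHECKDB (N'BigDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;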

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I'm not sure I'm understanding the question right: auto-growth is set to 1 MB for the database files, and the files are located on individual disk drives. Wouldn't simply changing the fill factor and rebuilding the indexes, and probably shrinking the database too, be enough to remove any fragmentation?

    Jayanth Kurup

  • Jayanth_Kurup (1/18/2013)


    I'm not sure I'm understanding the question right: auto-growth is set to 1 MB for the database files, and the files are located on individual disk drives. Wouldn't simply changing the fill factor and rebuilding the indexes, and probably shrinking the database too, be enough to remove any fragmentation?

    You're thinking of logical fragmentation. This is physical fragmentation, i.e. fragmentation of the mdf or ldf files on the OS file system. No command in SQL Server handles that directly. Shrinking the file may help in certain situations, but not likely in cases where multiple files existed, all started at a very small size, and autogrew many times at 1 MB each.
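
    In case it's useful, the logical fragmentation you're describing can be measured from inside SQL Server, for example with sys.dm_db_index_physical_stats, while the physical (NTFS) fragmentation Gail is asking about only shows up in OS-level tools such as Contig. A rough sketch of the former; the database name is a placeholder:

        USE BigDB;  -- placeholder database name

        -- Logical fragmentation of the indexes in one database; this says nothing
        -- about how the mdf/ldf files themselves are laid out on the volume.
        SELECT  OBJECT_NAME(ips.object_id)  AS table_name,
                i.name                      AS index_name,
                ips.avg_fragmentation_in_percent,
                ips.page_count
        FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
        JOIN    sys.indexes AS i
                ON  i.object_id = ips.object_id
                AND i.index_id  = ips.index_id
        ORDER BY ips.avg_fragmentation_in_percent DESC;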

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thank you all for your responses.

    I've been really busy in my new job. I haven't had the time to contact our SAN vendor, Dell-Compellent, and review the situation and/or potential solutions with them. I'll let you know if they have any interesting assessments and/or solutions.

    I know for a fact that a full database backup and restoration will solve the problem, but I'm looking at hours of downtime on my employer's most important production database if I do that, and I want to avoid it if possible.

  • I apologize for taking so long to respond. I finally had a conversation with Dell-Compellent on the subject of this post.

    My email to my management summarizing the conversation follows:

    _______________________

    I have been concerned about the subject of this email since early January. Today it was satisfactorily resolved.

    I was able to have Dell-Compellent’s Copilot Support arrange a conference call with their SAN Architecture personnel regarding physical disk fragmentation.

    I have good news: We don’t need to fix it.

    The details:

    When I first came to work and began to evaluate the SQL Server on the main database server, I found what I usually find when I first look at a database server: The SQL Server default auto-growth settings had not been customized for the company's requirements. Database data file auto-growth was 1 MB. Database log file growth was 10%. By my rough estimates, we should have well over 500,000 physical file fragments that have been allocated for our databases.

    This is a major problem on direct-attached storage. In this condition, serial data reads create a frenzy of disk head seeks to locate all of the physical file fragments. This wears out disk drives and destroys database server performance. The only cure is to back up the database, delete it from the server, recreate it with appropriate auto-growth settings, and restore it. This process is slow and causes a significant amount of downtime.

    Because Dell-Compellent’s SAN architecture is designed to allocate and store data across multiple disk drives, there is no such thing as a serial read. All I/O appears to be random, and spread across enough disks, this makes I/O more efficient. In actual tests of serial reads, there are a few percentage points of improved performance for storage space allocated in large “chunks” vs. storage space allocated in small “chunks.” But in practice, it isn’t enough to matter.
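
    For anyone who finds this thread later: the auto-growth settings themselves are easy to correct. That does nothing about fragments that already exist, it just stops new 1 MB grows. A rough sketch; the database name, file names, and growth sizes are placeholders:

        -- Check current sizes and growth settings for every file on the instance.
        SELECT  DB_NAME(database_id) AS database_name,
                name                 AS logical_file_name,
                size * 8 / 1024      AS size_mb,
                CASE WHEN is_percent_growth = 1
                     THEN CAST(growth AS varchar(10)) + ' %'
                     ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
                END                  AS growth_setting
        FROM    sys.master_files;

        -- Switch a 1 MB / 10% pattern to fixed, larger increments.
        ALTER DATABASE BigDB MODIFY FILE (NAME = N'BigDB_Data', FILEGROWTH = 512MB);
        ALTER DATABASE BigDB MODIFY FILE (NAME = N'BigDB_Log',  FILEGROWTH = 256MB);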

  • Hi Gail,

    I've just found that a lot of our databases have had the same setting.

    They are on small servers, with RAID 5 or RAID 10.

    I know we could fix the external fragmentation with a backup/restore.

    I think it would also be possible to fix the problem if we created a new filegroup (apart from PRIMARY) and then rebuilt all indexes onto the new filegroup.

    A few of the MDF files do not have many fragments. So there is no external fragmentation.

    What about the internal fragmentation?

    Do I need to worry about that?

    TIA,

    Henrik

  • Henrik,

    Logical fragmentation means that your data pages are not stored in the expected logical sequence. This is generally not the most efficient way to store data.

    Yes, it is usually a concern but not always and to varying degrees. As in many database related issues, it depends...

    Logical fragmentation, what is tolerable on your server, and the best way(s) to respond to it can be complicated subjects because of the various causes of the fragmentation and the frequency of its recurrence.

    I recommend that you study Books Online to get a good understanding of logical fragmentation, the causes of it, and then look at the fragmentation in your databases and the causes of it. From there, you can make intelligent decisions about how best to respond to it.

    ___________

    I liked your solution for physical fragmentation (rebuilding the indexes into a secondary partition). I hadn't thought of that. I think it's a good solution.
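
    For the record, a rough sketch of what that could look like; the filegroup, file, table, index, and column names are all invented, and the clustered index rebuild is what moves the table's data:

        -- Add a new filegroup and a file for it, pre-sized so it is allocated in one large chunk.
        ALTER DATABASE BigDB ADD FILEGROUP FG_Rebuilt;
        ALTER DATABASE BigDB ADD FILE
        (
            NAME = N'BigDB_Rebuilt_01',
            FILENAME = N'G:\Data\BigDB_Rebuilt_01.ndf',
            SIZE = 50GB,
            FILEGROWTH = 1GB
        ) TO FILEGROUP FG_Rebuilt;

        -- Rebuild the existing clustered index (assumed here to be named CIX_BigTable)
        -- onto the new filegroup; the table's data moves with it.
        CREATE CLUSTERED INDEX CIX_BigTable
            ON dbo.BigTable (ReadingTime)
            WITH (DROP_EXISTING = ON)
            ON FG_Rebuilt;

        -- Nonclustered indexes have to be rebuilt the same way to move them as well.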

  • hi Gail,

    Thank you very much for writing back so quickly.

    <<I liked your solution for physical fragmentation (rebuilding the indexes into a secondary partition)

    That is what we normally do when we work on our 46 TB data warehouse database. When we change the clustering key of one of the larger tables, the rebuild is done into a new filegroup so that the size of the database is not permanently increased.

    The current problem databases see a new row inserted into each of 600 tables every 10 minutes. The rows are quite large, almost 3 KB, so we can only fit 3 rows on a page. The tables are clustered by a DateTime column, so we do not see logical fragmentation as such.

    But we do see "extent fragmentation": for every 3 rows/page * 8 pages/extent (i.e. every 24 rows) read by a query, we need to find another extent.

    The only way I see around that is to create 600 filegroups, and then database growth becomes a bit of a problem.

    I was worried that the GAM and SGAM pages would become fragmented when we grow the database in 1 MB increments.

    That seems not to be the case.

    Thank you for your help.

    Best regards,

    Henrik

  • Oops. My post was inadvertently submitted twice. See the next post for my reply.

    Gail

  • Henrik,

    About those 3 KB rows:

    If you have one or more BLOB columns, or simply some very wide columns, and if those columns are rarely or never used in searches, I recommend that you consider removing them from the primary table and storing them in a secondary table designed to hold wide columns. When you need them, those wide columns can be accessed via a JOIN between the primary table and the secondary table.

    This will do several things:

    1. Significantly increase the "packing density" of the primary table.

    2. Speed up searches involving primary table rows.

    3. Isolate the inefficiencies of storing very wide data columns to the secondary table.

    My suggestion involves just one scenario where this might make sense.

    My suggestion has been implemented with good results.
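
    In case it helps, a bare-bones illustration of the split; the table and column names are invented:

        -- Narrow "hot" table keeps only the columns used in searches.
        CREATE TABLE dbo.Readings
        (
            ReadingID   bigint IDENTITY(1,1) PRIMARY KEY,
            ReadingTime datetime NOT NULL,
            DeviceID    int      NOT NULL
        );

        -- Wide companion table holds the rarely searched, bulky columns, one row per reading.
        CREATE TABLE dbo.ReadingDetails
        (
            ReadingID   bigint         NOT NULL PRIMARY KEY
                        REFERENCES dbo.Readings (ReadingID),
            BigNotes    nvarchar(max)  NULL,
            RawPayload  varbinary(max) NULL
        );

        -- The wide columns are only read when a query actually asks for them.
        SELECT  r.ReadingID, r.ReadingTime, r.DeviceID, d.BigNotes
        FROM    dbo.Readings       AS r
        JOIN    dbo.ReadingDetails AS d
                ON d.ReadingID = r.ReadingID
        WHERE   r.ReadingTime >= '20130101';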

    Gail

  • Hi Gail,

    Unfortunately, the table has over 300 columns, most of which are FLOATs or DECIMALs.

    We have investigated whether we could do such a table-split exercise, but we haven't found a good solution.

    Thank you very much for your suggestion.

    Best regards,

    Henrik

  • Henrik,

    Have you experimented with data compression, both at the row and page level?

    I've achieved significant storage efficiency improvements at the page level, but those results involved a lot of character data.

    I've never experimented with data containing a lot of numerics. You might experiment with compression, if you haven't already, at both the row and page level, to see what storage efficiency improvements you can obtain.

    A word of warning: don't do the experiment on LARGE indexes. The estimate rebuilds the index in tempdb to report the efficiency gained, and this can consume a lot of tempdb space. Try it on smaller indexes first.
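
    If you want to try it, one way to get the estimate is sp_estimate_data_compression_savings; a sketch with placeholder schema/table names (the tempdb caveat above still applies):

        -- Estimate ROW compression savings for every index on one table.
        EXEC sys.sp_estimate_data_compression_savings
             @schema_name      = N'dbo',
             @object_name      = N'BigTable',
             @index_id         = NULL,      -- NULL = all indexes on the table
             @partition_number = NULL,
             @data_compression = N'ROW';

        -- Repeat with PAGE compression to compare.
        EXEC sys.sp_estimate_data_compression_savings
             @schema_name      = N'dbo',
             @object_name      = N'BigTable',
             @index_id         = NULL,
             @partition_number = NULL,
             @data_compression = N'PAGE';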

    Gail

  • Hi Gail,

    Unfortunately, data compression is an Enterprise-only feature.

    Only backup compression is included in the Standard edition.

    This code runs on 1400 SQL Servers, so upgrading is not an option (yes, I did try).

    We use it on our data warehouse system, where it really helps, even on FLOAT, DECIMAL, etc. columns.

    Best regards,

    Henrik
