Distributing Data from Primary (mdf) datafiles to Secondary (ndf) datafiles

  • I tried finding a previous topic on this but did not come across it. Please post link to thread if you come across it.

    After doing some research I am describing my issue / purposed solution on the following. Please correct me if I'm wrong on any of this:

    1) Each data file (mdf or ndf) of a database can be read by only one processor core at a time. If I have 4 data files, then 4 cores can read the 4 data files at once to retrieve data so long as that data is distributed properly across the 4 data files. Conversely, if I have 4 cores and only one data file, then I am not using all the resources of the machine when performing reads to the database.

    2) This applies to physical cores only.

    I started a new job and immediately noticed a 260 GB database on our VM - MOSS database server. Since the priority was not (and still is not) splitting the database in SP or upgrading SP I immediately made 3 more data files (ndf) in the same filegroup. I also applied a max size to the main data file (mdf) set to the size of the file (260 GBs). Now the problem I face are timeouts on SP due to several queries trying to access data older than when I made the change. Since the older data all resides in the one large mdf the OS is only leveraging one core to access the one file.

    My server admin is unwilling to devote more resources (memory for cache) since the SAN is not "stressed enough" to warrant it.

    I believe I am under utilizing my resources. If that is the case, the only way I can think of to justify more resources would be to distribute the older data from the large mdf across all 4 data files. I would of course then rebuild indexes. This would allow me to leverage all the resources and give better justification for more resources / making splitting the database / upgrading SP and then splitting the database more of a priority.

    Questions are

    1) Pros? Cons? Alternatives?

    2) How would I do this? I haven't been able to find this (possibly for good reason). Preferably I'd like to leave the database online to avoid flushing the cache and let this run over the weekend.

    3) Will I lose the plans in cache related to the data that is moved? I'm guessing yes, but if this can be preserved that'd be ideal of course. The cache takes about 3 weeks to be optimal.

    I'm open to alternatives, links to further my education of this, hearing your experience, etc. Please help.

  • Cpt_Picard (3/30/2015)


    1) Each data file (mdf or ndf) of a database can be read by only one processor core at a time.

    False.

    Any thread can issue an IO to any file

    Since the older data all resides in the one large mdf the OS is only leveraging one core to access the one file.

    False.

    I believe I am under utilizing my resources. If that is the case, the only way I can think of to justify more resources would be to distribute the older data from the large mdf across all 4 data files. I would of course then rebuild indexes. This would allow me to leverage all the resources and give better justification for more resources / making splitting the database / upgrading SP and then splitting the database more of a priority.

    Probably not.

    A DB should be split for one of two reasons:

    - IO load. This requires careful design of what data goes into what file with the aim to balance load on the underlying drives and requires that all files are on different physical IO subsystems

    - Fast restore. This requires careful design of what data goes into what file with the aim being to bring the critical portion of the DB online as fast as possible in the case of a disaster and the less-critical portions later. This requires full recovery model and a large degree of comfort with more advanced restore options. It can't be done for a Sharepoint DB.

    1) Pros? Cons? Alternatives?

    Since the entire plan was based on a false premise, I would suggest scrapping the entire thing and removing those extra files you created.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the corrections Gila. I appreciate your help on this.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply