Additional Data Files and Proportional Fill

Question

Additional Data Files and Proportional Fill

David Fundakowski

Ten Centuries

Points: 1328
More actions
July 23, 2016 at 9:18 pm

#328491

Comments posted to this topic are about the item Additional Data Files and Proportional Fill

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61152 More actions · Answer 1

Great work David. Very interesting, I was not aware that you could do this with Extended Events.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

John N Hick Ten Centuries Points: 1320 More actions · Answer 2

Alan.B (7/25/2016)
Great work David. Very interesting, I was not aware that you could do this with Extended Events.

Nor I. Great work!

g.britton SSChampion Points: 13863 More actions · Answer 3

Very cool! Time to dig deeper into XEs!

Gerald Britton, Pluralsight courses

Uwe Ricken Hall of Fame Points: 3267 More actions · Answer 4

Thank you David,
i can only repeat the others; very interesting. I will definitly use this as a good example for uneven configured multiple database files 🙂

Microsoft Certified Master: SQL Server 2008
MVP - Data Platform (2013 - ...)
my blog: http://www.sqlmaster.de (german only!)

h.tobisch SSCommitted Points: 1671 More actions · Answer 5

Dear author
i do not have the time to read your (well executed but lengthy) treatise in order to find out if it contains value for me (i guess chances are good it does).
so, could you say in some short sentences what makes my life better if i read it.
for i actually had problems with distribution of data in datafiles and would appreciate any help.

Uwe Ricken Hall of Fame Points: 3267 More actions · Answer 6

h.tobisch - Friday, November 3, 2017 4:17 AM
Dear author
i do not have the time to read your (well executed but lengthy) treatise in order to find out if it contains value for me (i guess chances are good it does).
so, could you say in some short sentences what makes my life better if i read it.
for i actually had problems with distribution of data in datafiles and would appreciate any help.

Hi, H.
the article describes how Microsoft SQL Server is using multiple data files for "round robin methode" to fill the database files "evenly". If your database files have unqual sizes you will generate hotspots on the largest files. Microsoft SQL Server tries to fill all data files evenly based on a procentual value.
If File 1 has a size of 1024MB and File 2 consumes 128 MB on the disk you will write more data into File 1. This can - in a high transactional system with lots of users - lead to page contention; especially on PFS, SGAM and GAM. That is btw the same reason why it is recommended to have multiple files for TEMPDB.
The demos from David will point exactly to this behavior. So your benefit from reading the article is an understanding how and why Microsoft SQL Server may suffer from performance problems when you use multiple files and these files for a filegroup have different sizes.

BTW: Reading the article would have taken you less time than writing your question 🙂

Microsoft Certified Master: SQL Server 2008
MVP - Data Platform (2013 - ...)
my blog: http://www.sqlmaster.de (german only!)

kevaburg SSCoach Points: 18130 More actions · Answer 7

Hi David,

thanks for the article! The Extended Event is definately one I will be using in the future.

What I would like to point out though is something that Uwe Ricken touched on above and that is the subject of unequally filled datafiles leading to hotspots on highly transactional OLTP systems and page contention.

Although a new datafile is added, data is not automatically distributed evenly across them all as you have demonstrated. Automatic Distribution takes a relatively long time before equality is achieved and (to the best of my knowledge) the only way to force this to happen is to rebuild all clustered indexes across the database assuming of course clustered indexes have been configured on all the tables. Heap tables will not be data distributed across the new datafiles using this technique. This is something I would have liked to have seen described on your article.

Bear in mind though that this will require downtime to be planned in as is normal with the rebuild of a clustered index and on very large tables is going to take time to complete.

Yet another reason to plan the design of a database with foresight and the future in mind.....

Just my 10 pence worth.. 🙂

Regards
Kev

kevaburg SSCoach Points: 18130 More actions · Answer 8

Uwe Ricken - Friday, November 3, 2017 6:59 AM
h.tobisch - Friday, November 3, 2017 4:17 AM
Dear author
i do not have the time to read your (well executed but lengthy) treatise in order to find out if it contains value for me (i guess chances are good it does).
so, could you say in some short sentences what makes my life better if i read it.
for i actually had problems with distribution of data in datafiles and would appreciate any help.
Hi, H.
the article describes how Microsoft SQL Server is using multiple data files for "round robin methode" to fill the database files "evenly". If your database files have unqual sizes you will generate hotspots on the largest files. Microsoft SQL Server tries to fill all data files evenly based on a procentual value.
If File 1 has a size of 1024MB and File 2 consumes 128 MB on the disk you will write more data into File 1. This can - in a high transactional system with lots of users - lead to page contention; especially on PFS, SGAM and GAM. That is btw the same reason why it is recommended to have multiple files for TEMPDB.
The demos from David will point exactly to this behavior. So your benefit from reading the article is an understanding how and why Microsoft SQL Server may suffer from performance problems when you use multiple files and these files for a filegroup have different sizes.
BTW: Reading the article would have taken you less time than writing your question 🙂

+1 to what Uwe said. My addition to his BTW is that you can't learn effectively by bullet points alone. It is worth taking the time to learn something that could eventually make the difference between a successful system and one that is barely adequate.....