Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Fill Factor clarification Expand / Collapse
Author
Message
Posted Monday, February 18, 2008 3:42 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Monday, May 12, 2014 12:57 PM
Points: 2,285, Visits: 4,225
The terms FILLFACTOR and PAD_INDEX are very confusing due to the names and some default behavior. From BOL:

FILLFACTOR specifies a percentage that indicates how full the Database Engine should make the leaf level of each index page during index creation or rebuild.

PAD_INDEX = { ON | OFF } Specifies index padding. The default is OFF.
ON - The percentage of free space that is specified by fillfactor is applied to the intermediate-level pages of the index.
OFF or fillfactor is not specified -The intermediate-level pages are filled to near capacity, leaving sufficient space for at least one row of the maximum size the index can have, considering the set of keys on the intermediate pages.

For the clustered index, as the leaf level of the index is the data, FILLFACTOR determines free space for inserts or updates that increase the row size.
For clustered index on a column with an every increasing value ( identity, NEWSEQUENTIALID or current_timestamp) , since inserts are appended, only free space for updates that increase the row size needs to be specified and free space for intermediate-level index is not needed.
In this scenario, set FILLFACTOR appropriately and PAD_INDEX = OFF.

For clustered index on a column without an every increasing value , since inserts are not appended, set FILLFACTOR appropriately and PAD_INDEX = ON

Use the function sys.dm_db_index_physical_stats to determine the degree of fragmentation.


SQL = Scarcely Qualifies as a Language
Post #456843
Posted Monday, February 18, 2008 8:47 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:11 AM
Points: 7,122, Visits: 15,031
Jeff Moden (2/17/2008)
How does one determine if a page split has occured during an update?


I don't know how to retrospectively figure it out. That being said - if I want to test it - I either turn on the page splits/sec metric in perfmon, or I just run "snapshots" of DBCC SHOWCONTIG. Paul Randal has some way to tell using DBCC PAGE, but that's all greek to me, and since it's undocumented....


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #457179
Posted Monday, February 18, 2008 10:59 PM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 8:53 AM
Points: 1,861, Visits: 3,599
To retrospectively figure it out, as I wrote earlier:

To get a feel on whether a 90% fill factor is adequate for your index, you can monitor
sys.dm_db_index_physical_stats.avg_page_space_used_in_percent over a period of time.
If you find that it quickly and significantly deviates from the 90%-space-used value enforced originally by the fillfactor setting, then that means the 90% fillfactor value is probably too high.


SELECT avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, @index_id, NULL , 'DETAILED');

Whether or not page splits are occurring is not as much the issue, as whether they are occurring at a statistically significant rate, and if that is the case, you will need to lower your fill factor.


__________________________________________________________________________________

Turbocharge Your Database Maintenance With Service Broker: Part 2
Turbocharge Your Database Maintenance With Service Broker: Part 1
Real-Time Tracking of Tempdb Utilization Through Reporting Services
Monitoring Database Blocking Through SCOM 2007 Custom Rules and Alerts
Preparing for the Unthinkable - a Disaster/Recovery Implementation
Post #457208
Posted Tuesday, February 19, 2008 1:59 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 8:53 AM
Points: 1,861, Visits: 3,599
You also may want to consider making your primary key ID column non-clustered and define your clustered index on a more suitable column, eg. a record timestamp.

Queries involving a *range* of values on a certain column/columns, such as a datetime range, benefit greatly from a clustered index defined on that column/columns.

Defining your PK as non-clustered means that page splits due to table UPDATES will NEVER happen - an update of a varchar column will not affect the stacking of the PK pages.

Personally, I find that defining an identity column PK as clustered is wasting a precious resource, unless this is the ONLY index you think you will EVER need on that table.


__________________________________________________________________________________

Turbocharge Your Database Maintenance With Service Broker: Part 2
Turbocharge Your Database Maintenance With Service Broker: Part 1
Real-Time Tracking of Tempdb Utilization Through Reporting Services
Monitoring Database Blocking Through SCOM 2007 Custom Rules and Alerts
Preparing for the Unthinkable - a Disaster/Recovery Implementation
Post #457244
Posted Tuesday, February 19, 2008 4:46 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, December 1, 2010 8:12 AM
Points: 9, Visits: 43
I do agree that primary key clustered is not a good suggestion for an identity column, instead this can be done as primay key nonclustered and other valid column with clustered index. (so, everything wud be a index seek) One doubt here, suppose it is an retail bank customer search, we ll have to take last name for clustered.... Your suggestions please...

Also, thanks for the information regarding the PAD_INDEX... as it says it does a defragmentation at the leaf level alone, if it is OFF. But i guess, this will not solve the problem as there will be empty space in all the data pages which will be unused other than the last data page.

Now, one more question here, if we are having identity column as primary key clustered and sme other columns (say 2) as non clustered indexes, is it OK that we do a INDEXDEFRAG instead of DBREINDEX, if the environment is big? and of course a scheduled DBREINDEX in 4 or 6 months... I know i am asking too much... this wud be the last for index...
Post #457311
Posted Tuesday, February 19, 2008 7:17 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:11 AM
Points: 7,122, Visits: 15,031
Marios Philippopoulos (2/19/2008)
You also may want to consider making your primary key ID column non-clustered and define your clustered index on a more suitable column, eg. a record timestamp.

Queries involving a *range* of values on a certain column/columns, such as a datetime range, benefit greatly from a clustered index defined on that column/columns.

Defining your PK as non-clustered means that page splits due to table UPDATES will NEVER happen - an update of a varchar column will not affect the stacking of the PK pages.

Personally, I find that defining an identity column PK as clustered is wasting a precious resource, unless this is the ONLY index you think you will EVER need on that table.


Must need some more coffee- but I fail to see how the choice of the clustered index is ever going to prevent page splits due to updates of varchars. If a row is in a page, and its varchar column is updated from holding 20 characters to 5000 characters, then it's going to cause a page split if the page doesn't have enough room to accomodate the extra 4980 characters.

Also - the PK has nothing to do whatsoever with page splits. The Clustered index does on the other hand.


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #457367
Posted Tuesday, February 19, 2008 8:20 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 8:53 AM
Points: 1,861, Visits: 3,599
If a row is in a page, and its varchar column is updated from holding 20 characters to 5000 characters, then it's going to cause a page split if the page doesn't have enough room to accomodate the extra 4980 characters.


That's correct.

What I'm saying is that the page split will not be on the PK, IF it is made into a non-clustered index. Page splits will potentially occur on the clustered index (or heap) of the table, whichever one that index may be.

The original question was about setting a fill factor value for the PK identity column. If that is made into a non-clustered index, setting the FILL Factor to 100 (or 0) should be OK. I don't see a scenario of page splits for the non-clustered PK in this case, do you? Unless I am missing something.


__________________________________________________________________________________

Turbocharge Your Database Maintenance With Service Broker: Part 2
Turbocharge Your Database Maintenance With Service Broker: Part 1
Real-Time Tracking of Tempdb Utilization Through Reporting Services
Monitoring Database Blocking Through SCOM 2007 Custom Rules and Alerts
Preparing for the Unthinkable - a Disaster/Recovery Implementation
Post #457428
Posted Tuesday, February 19, 2008 8:31 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:11 AM
Points: 7,122, Visits: 15,031
I'm being nitpicky about this - but let's avoid using the word PK in this case. Its function as Primary Key has no bearing whatsoever on the page splits. The clustered index is the only thing driving that decision. The fact that they're both on the same field is incidental only. The fact that SQL server allows us to use the same index for both is regrettable IMO, but again - that's not the question of the moment.

If your primary goal is to prevent page splits, then actually - an identity is a GOOD choice for a clustered index, since all of the inserts will have at the "end", and not in the middle (already full) pages. Now - you don't get it to leverage the clustered index in range seeks/scans that way - but again - that's a choice based on what's most important.

Also - Heaps don't have page splits. They're a byproduct of having a clustered index.


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #457439
Posted Tuesday, February 19, 2008 8:53 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 8:53 AM
Points: 1,861, Visits: 3,599
Matt, I completely agree with most of your points, but are you sure about the last one?

Heaps don't have page splits.


An update of a VARCHAR column in a table with no clustered index cannot cause a page split in the data pages? I think it can.

Sorry if I'm splitting pages (... I mean hairs :) ), but I want to make sure my understanding is correct here.


__________________________________________________________________________________

Turbocharge Your Database Maintenance With Service Broker: Part 2
Turbocharge Your Database Maintenance With Service Broker: Part 1
Real-Time Tracking of Tempdb Utilization Through Reporting Services
Monitoring Database Blocking Through SCOM 2007 Custom Rules and Alerts
Preparing for the Unthinkable - a Disaster/Recovery Implementation
Post #457450
Posted Tuesday, February 19, 2008 9:09 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 11:11 AM
Points: 7,122, Visits: 15,031
Marios Philippopoulos (2/19/2008)
Matt, I completely agree with most of your points, but are you sure about the last one?

Heaps don't have page splits.


An update of a VARCHAR column in a table with no clustered index cannot cause a page split in the data pages? I think it can.

Sorry if I'm splitting pages (... I mean hairs :) ), but I want to make sure my understanding is correct here.


No issue at all - it's actually not very well documented IMO. Take a read through this:

http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/clusivsh.mspx

The part that's relevant is of course "buried" most of the way down, but here it is anyway:

For the table with the clustered index, the Page Splits/sec is about four times higher than the corresponding number of Page Splits/sec for the table with the nonclustered index. In addition, the ratio of Page Splits/sec to Pages Allocated/sec for the table with the clustered index is higher and increases more rapidly as the number of processes increases. This behavior is due to the fact that page allocations for a table without a clustered index (heap) never result in a page split (the data is just added to the heap), and there are fewer page splits for the nonclustered index pages because for the given table and index structure, almost twice as many rows can fit in a data page. The ratios of Page Splits/sec to Pages Allocated/sec are depicted in Figure 16.


Meaning - when you run out of space in a heap - you just whack a pointer to the end towards some more space somewhere else and put the "extra info over there" (in a REALLY simplified framing of the behavior). As I recall - that's called a "forward pointer".

Of course - GilaMonster has a better handle on this behavior than I, so I hope she'll stop on by and flesh out the explanation where it's lacking.


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #457461
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse