Fill Factor "best" setting for GUID

  • I've been tasked to support an application in the dev stages. The decisions has been made tu use guid for all PK / FKs so assume I can't change that.

    I'm wondering if anyone has a best guess starting value for the fill factor and maybe pad_index.

    I don't have any production dbs right now so I can't do any kind of continued monitoring to figure it out... and I'd love to start on the good foot with the clients and avoid the "why's your app so slow" calls if at all possible... especially after 2 days of deployment!

  • There's little wrong with a GUID for a primary key, providing the clustered index is somewhere else.

    Can you make the pk nonclustered and put the cluster somewhere more suitable?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/9/2011)


    There's little wrong with a GUID for a primary key, providing the clustered index is somewhere else.

    Can you make the pk nonclustered and put the cluster somewhere more suitable?

    That's not what I'm seeing at the moment. PK = GUID = Clustered

    I'll see if I can move those around at our next meeting.

    Also I was wondering about the FF so that we don't get hammered with pagesplits.

  • Ninja's_RGR'us (5/9/2011)


    That's not what I'm seeing at the moment. PK = GUID = Clustered

    Yeah, but PK = clustered index is only by default, not required.

    This is how I handled a large system with Guids everywhere at a bank. Bigger, faster-inserted tables had the pk made nonclustered and the clustered index moved to another column

    Devs shouldn't have a problem with that, it doesn't affect the front end, allows you to find a better column (less likely to fragment) for the cluster

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexed?

  • Ninja's_RGR'us (5/9/2011)


    Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexes?

    No, but the only answer I can give you is the universal useless one.

    It depends on how many records get inserted into that table between index rebuilds and how large the records are. I had tables that 80% fill was fine on. I had tables where even if I'd put it to 10% I'd still be getting page splits by the end of the day (and we rebuilt indexes weekly)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/9/2011)


    Ninja's_RGR'us (5/9/2011)


    Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexes?

    No, but the only answer I can give you is the universal useless one.

    It depends on how many records get inserted into that table between index rebuilds and how large the records are. I had tables that 80% fill was fine on. I had tables where even if I'd put it to 10% I'd still be getting page splits by the end of the day (and we rebuilt indexes weekly)

    Ok, so if I choose to start at 80% and then monitor it makes some sense? I know all tables will be different but I need a shotgun approach that should work "pretty good".

    Also what's the point of doing rebuilds on guid as the leading column? I don't see the day where I'll ever do a range scan there.

  • Ninja's_RGR'us (5/9/2011)


    Ok, so if I choose to start at 80% and then monitor it makes some sense? I know all tables will be different but I need a shotgun approach that should work "pretty good".

    It's probably better than the default. You should be able to identify which tables will get frequent inserts from the DB design, or from the devs. May be worth hitting those with slightly lower fill factor, while static lookup tables get 100%

    Also what's the point of doing rebuilds on guid as the leading column? I don't see the day where I'll ever do a range scan there.

    Providing you don't have table scans, likely little-none. The problem comes when indexes aren't covering, or queries need the entire table and SQL decides to scan the cluster rather than using another index

    Also watch the avg space used on pages, it can get quite low, meaning lots of wasted space.

    I'd still recommend see if you can get the cluster moved off the pk to another column, perhaps a dateinserted or similar. Depends how the table is used.

    At the bank, a lot of the busier tables were queried mostly by date, so moving the cluster from the Guid PK to the transaction date both reduced fragmentation and page splits and helped the queries.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Awesome advice.

    The challenge here is this is a CRM application. And we're building plugins in that CRM (MS dynamics CRM). That means that we both create new tables or new columns in those tables.

    MS uses extreme RBAR to access the tables. I definitely will never have access / control over that section of the software.

    I'll get more acquinted with the software / plugins and come back if I need more advice.

    Tx.

  • You might consider defaulting the PK columns with the NEWSEQUENTIALID() function to minimize the page splits as you insert data.

    create table myTable

    (

    ColumnA uniqueidentifier not null DEFAULT NEWSEQUENTIALID() primary key clustered,

    x int not null

    )

    insert into myTable (x) select 1

    insert into myTable (x) select 2

    insert into myTable (x) select 3

    select * from myTable

    drop TABLE myTable

    However, I would prefer to use INT or BIGINT identity columns for primary key clustered indexes. It makes the tables and indexes much smaller and it will probably perform much better.

  • Michael Valentine Jones (5/9/2011)


    You might consider defaulting the PK columns with the NEWSEQUENTIALID() function to minimize the page splits as you insert data.

    create table myTable

    (

    ColumnA uniqueidentifier not null DEFAULT NEWSEQUENTIALID() primary key clustered,

    x int not null

    )

    insert into myTable (x) select 1

    insert into myTable (x) select 2

    insert into myTable (x) select 3

    select * from myTable

    drop TABLE myTable

    However, I would prefer to use INT or BIGINT identity columns for primary key clustered indexes. It makes the tables and indexes much smaller and it will probably perform much better.

    Cool, I never knew about that function.

    I'm totally onboard with you about identity. The choice has been made in case clients need integration between system and probably just because that's what's used by the CRM at the moment. I know that the MSMD is not the best idea, but that's what's going on at the moment.

  • Hi Ninja;

    By now, I'm sure that you've looked it up in BOL, but it's worth mentioning for others who read this thread; a constraint if you're implementing this function is that it "can only be used with DEFAULT constraints on table columns of type uniqueidentifier" (from BOL).

    e.g. CREATE TABLE myTable (ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID())

    So, using it is contingent on our ability to alter the tables in a COTS application. We ran into a similar issue but were fortunate to be able to solve it by changing the clustered index in the way Gail suggested.

  • SwayneBell (5/10/2011)


    Hi Ninja;

    By now, I'm sure that you've looked it up in BOL, but it's worth mentioning for others who read this thread; a constraint if you're implementing this function is that it "can only be used with DEFAULT constraints on table columns of type uniqueidentifier" (from BOL).

    e.g. CREATE TABLE myTable (ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID())

    So, using it is contingent on our ability to alter the tables in a COTS application. We ran into a similar issue but were fortunate to be able to solve it by changing the clustered index in the way Gail suggested.

    Thanks for sharing the not so obvious.

    The good news is that we create some tables. So we have full control over those. So that may help, bu then again I'm really looking forward to changing the clustered indexes around.

    Thanx again.

  • sequential is definitely the way to go. And there are ways you can do this in the middle tier too if you need it. With those you can cluster on the GUID and still be ok with a high fill factor.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Tx.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply