Fill Factor "best" setting for GUID

Question

Post reply

Fill Factor "best" setting for GUID

Ninja's_RGR'us

SSC Guru

Points: 294069
More actions
May 9, 2011 at 8:42 am

#250025

I've been tasked to support an application in the dev stages. The decisions has been made tu use guid for all PK / FKs so assume I can't change that.
I'm wondering if anyone has a best guess starting value for the fill factor and maybe pad_index.
I don't have any production dbs right now so I can't do any kind of continued monitoring to figure it out... and I'd love to start on the good foot with the clients and avoid the "why's your app so slow" calls if at all possible... especially after 2 days of deployment!

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 1

There's little wrong with a GUID for a primary key, providing the clustered index is somewhere else.

Can you make the pk nonclustered and put the cluster somewhere more suitable?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 2

GilaMonster (5/9/2011)
There's little wrong with a GUID for a primary key, providing the clustered index is somewhere else.
Can you make the pk nonclustered and put the cluster somewhere more suitable?

That's not what I'm seeing at the moment. PK = GUID = Clustered

I'll see if I can move those around at our next meeting.

Also I was wondering about the FF so that we don't get hammered with pagesplits.

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 3

Ninja's_RGR'us (5/9/2011)
That's not what I'm seeing at the moment. PK = GUID = Clustered

Yeah, but PK = clustered index is only by default, not required.

This is how I handled a large system with Guids everywhere at a bank. Bigger, faster-inserted tables had the pk made nonclustered and the clustered index moved to another column

Devs shouldn't have a problem with that, it doesn't affect the front end, allows you to find a better column (less likely to fragment) for the cluster

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexed?

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 5

Ninja's_RGR'us (5/9/2011)
Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexes?

No, but the only answer I can give you is the universal useless one.

It depends on how many records get inserted into that table between index rebuilds and how large the records are. I had tables that 80% fill was fine on. I had tables where even if I'd put it to 10% I'd still be getting page splits by the end of the day (and we rebuilt indexes weekly)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 6

GilaMonster (5/9/2011)
Ninja's_RGR'us (5/9/2011)
Ok, so I gather by your complete lack of answer about the FF that I should ignore that option completely even if all the guids are indexes?
No, but the only answer I can give you is the universal useless one.
It depends on how many records get inserted into that table between index rebuilds and how large the records are. I had tables that 80% fill was fine on. I had tables where even if I'd put it to 10% I'd still be getting page splits by the end of the day (and we rebuilt indexes weekly)

Ok, so if I choose to start at 80% and then monitor it makes some sense? I know all tables will be different but I need a shotgun approach that should work "pretty good".

Also what's the point of doing rebuilds on guid as the leading column? I don't see the day where I'll ever do a range scan there.

Gail Shaw SSC Guru Points: 1004514 More actions · Answer 7

Ninja's_RGR'us (5/9/2011)
Ok, so if I choose to start at 80% and then monitor it makes some sense? I know all tables will be different but I need a shotgun approach that should work "pretty good".

It's probably better than the default. You should be able to identify which tables will get frequent inserts from the DB design, or from the devs. May be worth hitting those with slightly lower fill factor, while static lookup tables get 100%

Also what's the point of doing rebuilds on guid as the leading column? I don't see the day where I'll ever do a range scan there.

Providing you don't have table scans, likely little-none. The problem comes when indexes aren't covering, or queries need the entire table and SQL decides to scan the cluster rather than using another index

Also watch the avg space used on pages, it can get quite low, meaning lots of wasted space.

I'd still recommend see if you can get the cluster moved off the pk to another column, perhaps a dateinserted or similar. Depends how the table is used.

At the bank, a lot of the busier tables were queried mostly by date, so moving the cluster from the Guid PK to the transaction date both reduced fragmentation and page splits and helped the queries.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 8

Awesome advice.

The challenge here is this is a CRM application. And we're building plugins in that CRM (MS dynamics CRM). That means that we both create new tables or new columns in those tables.

MS uses extreme RBAR to access the tables. I definitely will never have access / control over that section of the software.

I'll get more acquinted with the software / plugins and come back if I need more advice.

Tx.

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 9

You might consider defaulting the PK columns with the NEWSEQUENTIALID() function to minimize the page splits as you insert data.

create table myTable

(

ColumnA uniqueidentifier not null DEFAULT NEWSEQUENTIALID() primary key clustered,

x int not null

)

insert into myTable (x) select 1

insert into myTable (x) select 2

insert into myTable (x) select 3

select * from myTable

drop TABLE myTable

However, I would prefer to use INT or BIGINT identity columns for primary key clustered indexes. It makes the tables and indexes much smaller and it will probably perform much better.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 10

Michael Valentine Jones (5/9/2011)
You might consider defaulting the PK columns with the NEWSEQUENTIALID() function to minimize the page splits as you insert data.
create table myTable
(
ColumnA uniqueidentifier not null DEFAULT NEWSEQUENTIALID() primary key clustered,
x int not null
)
insert into myTable (x) select 1
insert into myTable (x) select 2
insert into myTable (x) select 3
select * from myTable
drop TABLE myTable

However, I would prefer to use INT or BIGINT identity columns for primary key clustered indexes. It makes the tables and indexes much smaller and it will probably perform much better.

Cool, I never knew about that function.

I'm totally onboard with you about identity. The choice has been made in case clients need integration between system and probably just because that's what's used by the CRM at the moment. I know that the MSMD is not the best idea, but that's what's going on at the moment.

SteveBell SSCrazy Points: 2008 More actions · Answer 11

Hi Ninja;

By now, I'm sure that you've looked it up in BOL, but it's worth mentioning for others who read this thread; a constraint if you're implementing this function is that it "can only be used with DEFAULT constraints on table columns of type uniqueidentifier" (from BOL).

e.g. CREATE TABLE myTable (ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID())

So, using it is contingent on our ability to alter the tables in a COTS application. We ran into a similar issue but were fortunate to be able to solve it by changing the clustered index in the way Gail suggested.

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 12

SwayneBell (5/10/2011)
Hi Ninja;
By now, I'm sure that you've looked it up in BOL, but it's worth mentioning for others who read this thread; a constraint if you're implementing this function is that it "can only be used with DEFAULT constraints on table columns of type uniqueidentifier" (from BOL).
e.g. CREATE TABLE myTable (ColumnA uniqueidentifier DEFAULT NEWSEQUENTIALID())
So, using it is contingent on our ability to alter the tables in a COTS application. We ran into a similar issue but were fortunate to be able to solve it by changing the clustered index in the way Gail suggested.

Thanks for sharing the not so obvious.

The good news is that we create some tables. So we have full control over those. So that may help, bu then again I'm really looking forward to changing the clustered indexes around.

Thanx again.

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

sequential is definitely the way to go. And there are ways you can do this in the middle tier too if you need it. With those you can cluster on the GUID and still be ok with a high fill factor.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 14

Ninja's_RGR'us

SSC Guru

Points: 294069

May 11, 2011 at 8:53 am

#1323094

Tx.