GUID Primary key

  • Hi,

    Can we have a GUID column as the primary key with a nonclustered index?

    Is a nonclustered index needed when we use the GUID column in a WHERE condition?

    Say, for example, I have 50 tables, all 50 tables have a GUID column, and each is interlinked with a foreign key.

    When joining, I will be using the GUID columns to check whether there is a record in the child table or not, so do we really need a nonclustered index on these columns?

    It is an old system where GUIDs have been used instead of identity.

    Thanks!

  • yuvipoy (3/27/2014)


    Hi,

    Can we have a GUID column as the primary key with a nonclustered index?

    Is a nonclustered index needed when we use the GUID column in a WHERE condition?

    Say, for example, I have 50 tables, all 50 tables have a GUID column, and each is interlinked with a foreign key.

    When joining, I will be using the GUID columns to check whether there is a record in the child table or not, so do we really need a nonclustered index on these columns?

    It is an old system where GUIDs have been used instead of identity.

    Thanks!

    I know we have covered this previously in one of your threads. Yes, you can have a primary key that has a nonclustered index. I would recommend, though, that you have some other column as your clustered index.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (3/27/2014)

    I would recommend, though, that you have some other column as your clustered index.

    Couldn't get you.

    You mean a primary key with a clustered index on the child tables?

  • yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)

    I would recommend, though, that you have some other column as your clustered index.

    Couldn't get you.

    You mean a primary key with a clustered index on the child tables?

    No, I meant that you should have a clustered index on some column in your table. You want to change your primary key to be a nonclustered index (which is a good idea since the datatype is a GUID). You do, however, want to create a clustered index on some column in your table.
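
    For illustration, roughly what that looks like in DDL (the table and column names here are made up, not from the thread): keep the GUID as the primary key but declare it NONCLUSTERED, and put the clustered index on a narrow, ever-increasing column (an identity column is used here purely as an example).

    CREATE TABLE dbo.Orders
    (
        OrderGUID uniqueidentifier NOT NULL
            CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED, -- PK kept, but nonclustered
        OrderID   int IDENTITY(1,1) NOT NULL,              -- narrow, ever-increasing
        OrderDate datetime NOT NULL
    );

    -- Cluster on the narrow, sequential column instead of the GUID
    CREATE UNIQUE CLUSTERED INDEX CIX_Orders_OrderID
        ON dbo.Orders (OrderID);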

  • Sean Lange (3/27/2014)

    You do, however, want to create a clustered index on some column in your table.

    Is that absolutely necessary? It depends on the type of table and the queries, but should a table always be clustered instead of a heap?

    If you're doing a lot of scans, a heap is efficient as well.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen Verbeeck (3/27/2014)


    Sean Lange (3/27/2014)

    You do, however, want to create a clustered index on some column in your table.

    Is that absolutely necessary? It depends on the type of table and the queries, but should a table always be clustered instead of a heap?

    If you're doing a lot of scans, a heap is efficient as well.

    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.
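
    If anyone wants to confirm that rather than guess, fragmentation can be checked per index with something along these lines (the table name is just a placeholder):

    -- Report fragmentation for every index on a table; LIMITED mode is the cheapest scan
    SELECT i.name AS index_name,
           ips.index_type_desc,
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
      ON i.object_id = ips.object_id
     AND i.index_id  = ips.index_id;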

  • Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

  • Koen Verbeeck (3/27/2014)


    should a table always be clustered instead of a heap?

    Yes, in the vast majority of cases.

    A heap has forwarded records and the non-deallocation of pages to contend with; as a result, scans of a heap can be vastly less efficient than scans of a clustered index.
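
    As a rough sketch of how to see whether a heap is suffering from that (the table name is a placeholder), forwarded records show up in sys.dm_db_index_physical_stats when it is run in SAMPLED or DETAILED mode:

    -- index_id = 0 targets the heap; forwarded_record_count is NULL in LIMITED mode
    SELECT ips.forwarded_record_count,
           ips.record_count,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.SomeHeap'), 0, NULL, 'DETAILED') AS ips;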

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (3/27/2014)


    Koen Verbeeck (3/27/2014)


    should a table always be clustered instead of a heap?

    Yes, in the vast majority of cases.

    A heap has forwarded records and the non-deallocation of pages to contend with; as a result, scans of a heap can be vastly less efficient than scans of a clustered index.

    OK, good to know.

  • yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).
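
    A rough sketch of that batching pattern (the table name, cutoff, and batch size are placeholders to illustrate the idea), deleting on a range of the clustering key so each batch works through contiguous pages:

    DECLARE @BatchSize int = 100000;   -- tune this; smaller batches keep locking and log growth down
    DECLARE @CutoffID  int = 5000000;  -- hypothetical upper bound of the rows to purge

    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize)
        FROM dbo.Orders
        WHERE OrderID <= @CutoffID;    -- range on the clustering key

        IF @@ROWCOUNT = 0 BREAK;       -- nothing left in the range
    END;

    If the log is still a concern, log backups (under FULL recovery) or checkpoints (under SIMPLE) between batches let the log space be reused instead of growing.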

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (3/27/2014)


    yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).

    100,000 will escalate to a table lock. Batches of, say, 4,000 should prevent lock escalation if a table lock would be problematic.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • andrew gothard (3/27/2014)


    ScottPletcher (3/27/2014)


    yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).

    100,000 will escalate to a table lock. Batches of, say, 4,000 should prevent lock escalation if a table lock would be problematic.

    SQL would be able to use page locks if it does a key-range lock. I was hoping that deleting by the clustering key, as I stated above, would allow SQL to use page locks.

    But I'll admit, often SQL doesn't use key-range locks as I would hope or expect it to :-D, so a table lock could result (assuming that SQL could actually acquire a table lock, otherwise it won't escalate anyway). Since, however, the pages would always be contiguous and linked, I would hope the lock wouldn't last that long anyway.

  • ScottPletcher (3/27/2014)


    andrew gothard (3/27/2014)


    ScottPletcher (3/27/2014)


    yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).

    100,000 will escalate to a table lock. Batches of, say, 4,000 should prevent lock escalation if a table lock would be problematic.

    SQL would be able to use page locks if it does a key-range lock. I was hoping that deleting by the clustering key, as I stated above, would allow SQL to use page locks.

    But I'll admit, often SQL doesn't use key-range locks as I would hope or expect it to :-D, so a table lock could result (assuming that SQL could actually acquire a table lock, otherwise it won't escalate anyway). Since, however, the pages would always be contiguous and linked, I would hope the lock wouldn't last that long anyway.

    It will, up to a threshold of 5,000 locks (3,000 in SQL Server 2000 and earlier). If you delete 100,000 rows, you will escalate to a table lock, and that's it (well, you can override it, but I'm not even going to post a hint as to how, because some idiot will try it; seriously, if you do, unless you're a SQL Server GOD, you will end your career). This is by design. It is documented and deliberate behaviour; it doesn't happen by accident or randomly, and it happens for a good reason. http://technet.microsoft.com/en-us/library/ms184286(v=sql.105).aspx

    I know lock escalation is a little-known feature, but it needs to be borne in mind. I've seen too many cases of "We'll delete a few (hundreds of) million rows, and they're really old, it doesn't matter, nothing's using them" from a third party, and then seeing a system just go down for hours.

    Documented and deliberate behaviour, and done for a reason.

  • andrew gothard (3/27/2014)


    ScottPletcher (3/27/2014)


    andrew gothard (3/27/2014)


    ScottPletcher (3/27/2014)


    yuvipoy (3/27/2014)


    Sean Lange (3/27/2014)


    True. I have worked with this OP on several threads and the problem is performance because they have several related tables and the primary key (and clustered index) is GUIDs on several tables each with millions of rows. I am certain they have fragmentation in the 90%+ range consistently given the amount of traffic it sounds like these tables have.

    Apart from the GUID column, I don't have any unique columns.

    For fragmentation, we rebuild the index when we delete some millions of records.

    While rebuilding, the transaction log and data files (ldf and mdf) grow huge, but that is separate from the main problem.

    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).

    100,000 will escalate to a table lock. Batches of, say, 4,000 should prevent lock escalation if a table lock would be problematic.

    SQL would be able to use page locks if it does a key-range lock. I was hoping that deleting by the clustering key, as I stated above, would allow SQL to use page locks.

    But I'll admit, often SQL doesn't use key-range locks as I would hope or expect it to :-D, so a table lock could result (assuming that SQL could actually acquire a table lock, otherwise it won't escalate anyway). Since, however, the pages would always be contiguous and linked, I would hope the lock wouldn't last that long anyway.

    It will, up to a threshold of 5,000 locks (3,000 in SQL Server 2000 and earlier). If you delete 100,000 rows, you will escalate to a table lock, and that's it (well, you can override it, but I'm not even going to post a hint as to how, because some idiot will try it; seriously, if you do, unless you're a SQL Server GOD, you will end your career). This is by design. It is documented and deliberate behaviour; it doesn't happen by accident or randomly, and it happens for a good reason. http://technet.microsoft.com/en-us/library/ms184286(v=sql.105).aspx

    I know lock escalation is a little-known feature, but it needs to be borne in mind. I've seen too many cases of "We'll delete a few (tens of) million rows, and they're really old, it doesn't matter, nothing's using them" from a third party, and then seeing a system just go down for hours.

    Documented and deliberate behaviour, and done for a reason.

    Oh, and if you've made the mistake of having cascade update/delete set to on, do not delete from the parent table. Use an approach that deletes, in batches, from the lowest children up to the highest parent, as appropriate. If you have cascade on and delete 5,000 parent records with a number of generations of child tables, you're in for a lot of woe. Don't do it, kids!
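
    A rough sketch of that child-first ordering (the table names, join column, and purge criterion are all hypothetical), with a batch size kept below the lock escalation threshold discussed above:

    DECLARE @BatchSize int = 4000;   -- stays under the ~5,000-lock escalation threshold

    -- Delete from the lowest-level child first ...
    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize) c
        FROM dbo.ChildTable AS c
        WHERE EXISTS (SELECT 1
                      FROM dbo.ParentTable AS p
                      WHERE p.ParentGUID = c.ParentGUID
                        AND p.OrderDate < '20130101');   -- hypothetical purge criterion
        IF @@ROWCOUNT = 0 BREAK;
    END;

    -- ... then the parent, once no child rows reference the rows being removed
    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize)
        FROM dbo.ParentTable
        WHERE OrderDate < '20130101';
        IF @@ROWCOUNT = 0 BREAK;
    END;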

  • andrew gothard (3/27/2014)

    Oh, and if you've made the mistake of having cascade update/delete set to on, do not delete from the parent table. Use an approach that deletes, in batches, from the lowest children up to the highest parent, as appropriate. If you have cascade on and delete 5,000 parent records with a number of generations of child tables, you're in for a lot of woe. Don't do it, kids!

    OK, say I have 10 tables with cascade delete ON. All tables are linked with foreign keys.

    Say 7 of the tables are connected to a single table (7 foreign-keyed tables --> 1 master table). If I delete records from my child tables (the 7 tables) first and then my master table, won't the master table delete still look into the child tables for records?

    The master table delete still takes more time even when I delete the child records first. Why? So can I just delete from the master table directly instead of deleting from the child tables and then the master?

    ScottPletcher (3/27/2014)


    If you delete by datetime, then cluster by that.

    If not, this seems to be one of the relatively rare times when you should likely add an IDENTITY column and cluster on that. You will also have a nonclustered PK index on the GUID column if you need to look up by the GUID.

    Then, when deleting, you can delete rows in batches by cluster key, say 100,000 at a time (or whatever value works best for keeping down the log size).

    I kept my clustered index on the date key column.

    I am deleting all the records in batches, only 3,000 records per batch, to delete half a million rows.

    It is still time consuming. I am deleting from the child tables first and then the master table.
