"Remove Clutered property" from PK

  • Hi,

    We have some tables that have a PK clustered on a UNIQUEIDENTIFIER column...

    This generates a lot of fragmentation and I'd like to deploy a script to change this on all our customers.

    Unfortunately the PK is also used by FK references...

    Is there any easy way to determine what FKs are "hanging" on the PK? Or do I have to see the PK columns and seek all the FKs referenced columns to see which ones depend on the PK?

    This can be a very looonnnngggg operation due to FKs rebuild and PK and the new clustered index but the fragmentation is very high (lots of page splits)..

    Thanks,

    Pedro



    If you need to work better, try working less...

  • Yep. A long operation. Because clustered is not simply a property of the PK, it describes the type of index being created. A clustered index defines how the data is stored. So, to move the cluster from the PK to another column (and you do want to have a clustered index on every table, even if it's not on the PK) will require dropping and recreating the PK. This will require dropping all the FKs first. To make this change, you'll have to do the work. There's no way around it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • That what I thought... :S

    I'm writing a query to get all the clustered indexes (CIDX) with UNIQUEIDENTIFIER on the 1rt column, get all FKs that have the same columns as the CIDX (I assume there's no other unique index equal to the PK...), build a query with the FKs' DROP statement and another with the CREATE statements, execute the drop FKs, drop the constraint PK, create the new int identity as a CIDX and recreate the PK as non clustered. rebuild the FKs....

    In some databases, I made a test, and this can take forever....

    Thanks,

    Pedro



    If you need to work better, try working less...

  • Yeah, a big part of why it's so important to pick the correct location for the clustered index is because it's so difficult later to rebuild them, especially if they're on the PK (not saying that putting on the PK is a bad thing).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Perhaps make a brand new table with the structure you want (the new pk and any fk constraints) and populate it with all your data.

    For instance, given your old table dbo.myTable, build a new table called dbo.myTableNew. Populate myNewTable with all your data, constraints, etc. Then, in a single transaction, rename the tables.

    begin try

    begin tran

    sp_rename 'dbo.myTable', 'dbo.myTableOld'

    sp_rename 'dbo.myTableNew', 'dbo.myTable'

    commit tran

    end try

    begin catch

    while @@trancount > 0 rollback tran

    select 'Error'

    end catch

    Executive Junior Cowboy Developer, Esq.[/url]

  • JeeTee (12/16/2014)


    Perhaps make a brand new table with the structure you want (the new pk and any fk constraints) and populate it with all your data.

    For instance, given your old table dbo.myTable, build a new table called dbo.myTableNew. Populate myNewTable with all your data, constraints, etc. Then, in a single transaction, rename the tables.

    begin try

    begin tran

    sp_rename 'dbo.myTable', 'dbo.myTableOld'

    sp_rename 'dbo.myTableNew', 'dbo.myTable'

    commit tran

    end try

    begin catch

    while @@trancount > 0 rollback tran

    select 'Error'

    end catch

    That would take even longer than changing the index structure... I'd have to process the FKs anyway and have the extra of inserting the data into the new table... FKs wont "migrate" from one table to the other...



    If you need to work better, try working less...

  • Is there any easy way to determine what FKs are "hanging" on the PK?

    Yes, using view "sys.foreign_keys", and, if needed, "sys.foreign_key_columns" as well.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher (12/16/2014)


    Is there any easy way to determine what FKs are "hanging" on the PK?

    Yes, using view "sys.foreign_keys", and, if needed, "sys.foreign_key_columns" as well.

    that's what i'm currently doing...

    select the columns from the pk and check the foreign keys that reference that object_id and column_id...

    it's quite simple, actually but it takes forever on large databases...

    Thnks,

    Pedro



    If you need to work better, try working less...

  • PiMané (12/16/2014)


    ScottPletcher (12/16/2014)


    Is there any easy way to determine what FKs are "hanging" on the PK?

    Yes, using view "sys.foreign_keys", and, if needed, "sys.foreign_key_columns" as well.

    that's what i'm currently doing...

    select the columns from the pk and check the foreign keys that reference that object_id and column_id...

    it's quite simple, actually but it takes forever on large databases...

    Thnks,

    Pedro

    Hmm, the actual check itself should be extremely fast. Of course the speed reordering the table will depend on the number of rows and their width.

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • What is it you're trying to optimize for while doing this? You mentioned "rebuilding FK"s several times, which isn't really a thing. Are you trying to keep up-time on the existing table? Are you simply trying to make it as fast as possible regardless of locking, tempdb, etc? Are you trying to make it as simple as possible? If the foreign key checks are adding a considerable amount of time while you're rebuilding the clustered index, you could always disable them during the re-keying (assuming you have the table locked or in some other way know illegitimate records can't be inserted during that time.)

    FWIW, the reason I suggested pre-building and populate another table and then using SP_rename is that it does all the costly work offline in another table, and then when you're ready to make the switch, you just rename the tables (by the way, sp_rename DOES update the meta info on foreign key constraints). Whether or not you insert all this data into a separate table, or do it in place, you're still going to end up writing an entire new table when you change the clustered index.

    Executive Junior Cowboy Developer, Esq.[/url]

  • JeeTee (12/16/2014)


    What is it you're trying to optimize for while doing this?

    Since the CIDX is a UNIQUEIDENTIFIER the data is very fragmented due to page splits that happen to store the data. The table has over 150 fields and its row max length is greater than a page (8k). With integration programs inserting over 10.000 rows per day in the table, from other software, you can imagine the fragmentation level and the time it takes to insert the records....

    Since we can't change the SW (create smaller tables to store data properly) the only thing we can do is change the "thing" that's making this go slow, the UNIQUEIDENTIFIER in the CIDX.. That why the PK still has to be the same column (it's used in FKs) but the CIDX will change to an INT IDENTITY to avoid fragmentation and page splits when inserting data... We made come tests and from over 10min it went to less than 1mn... In our tests we changed everything by hand but our clients can have other FKs "hanging" on the PK so we have to make a dynamic SQL to achieve this.

    JeeTee (12/16/2014)


    You mentioned "rebuilding FK"s several times, which isn't really a thing.

    Since the FKs are using the CIDX PK and I want to create another CIDX and keep the PK I have to drop the CIDX PK but to do so I must 1rt remove the FKs. After recreating the PK, NC, I have to rebuild (recreate) the existing FKs...

    JeeTee (12/16/2014)


    Are you trying to keep up-time on the existing table? Are you simply trying to make it as fast as possible regardless of locking, tempdb, etc? Are you trying to make it as simple as possible? If the foreign key checks are adding a considerable amount of time while you're rebuilding the clustered index, you could always disable them during the re-keying (assuming you have the table locked or in some other way know illegitimate records can't be inserted during that time.)

    This is a maintenance operation so downtime is more than expected, I'm just trying to make the downtime the less possible. Since we have "both-ways" indexes on FKs the FKs operation will be quite fast, the existing indexes rebuild, due to the new CIDX, is whats going to take time.

    Since the new column is INT IDENTITY the data won't probably "move"... Unless the "extra" 4 bytes for the new column won't fit in the page and then generate a page slit...

    The only thing I'm creating a CIDX with an INT is that less "expensive" on the other indexes, 4 bytes is less than the ROWID... If it wasn't for this I wouldn't mind having an HEAP table since no CIDX over a Natural Key can be created on it.

    Thanks,

    Pedro



    If you need to work better, try working less...

  • The key question is How long is the window in which you have to do this and how big is the table?

    having done something similar, it was a case of

    1) write a script to drop and recreate all the FK references to the PK, this should take seconds to run, a little longer to reapply if using the CHECK operation, test it works

    2) Drop and recreate the PK in NON ClusteredMode

    3) Create the New Clustered index.

    4) Reapply the Foreign Keys,

    Unfortunately theres no way to avoid rebuilding the table when the new clustered index is applied, you can minimise the read/write contention and disk IO bottlenecks if you create a filegroup on a different disk and simply create the Clustered index on the new file group. So you effectively stream the data from one disk to the other.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (12/17/2014)


    The key question is How long is the window in which you have to do this and how big is the table?

    having done something similar, it was a case of

    1) write a script to drop and recreate all the FK references to the PK, this should take seconds to run, a little longer to reapply if using the CHECK operation, test it works

    2) Drop and recreate the PK in NON ClusteredMode

    3) Create the New Clustered index.

    4) Reapply the Foreign Keys,

    Unfortunately theres no way to avoid rebuilding the table when the new clustered index is applied, you can minimise the read/write contention and disk IO bottlenecks if you create a filegroup on a different disk and simply create the Clustered index on the new file group. So you effectively stream the data from one disk to the other.

    I'd change the 3 with the 2, since all the indexes "point" to the Clustered Index and the PK index would have to be "recalculated" to point to the new CIDX.

    But that's what I'm doing...



    If you need to work better, try working less...

  • PiMané (12/17/2014)


    Jason-299789 (12/17/2014)


    The key question is How long is the window in which you have to do this and how big is the table?

    having done something similar, it was a case of

    1) write a script to drop and recreate all the FK references to the PK, this should take seconds to run, a little longer to reapply if using the CHECK operation, test it works

    2) Drop and recreate the PK in NON ClusteredMode

    3) Create the New Clustered index.

    4) Reapply the Foreign Keys,

    Unfortunately theres no way to avoid rebuilding the table when the new clustered index is applied, you can minimise the read/write contention and disk IO bottlenecks if you create a filegroup on a different disk and simply create the Clustered index on the new file group. So you effectively stream the data from one disk to the other.

    I'd change the 3 with the 2, since all the indexes "point" to the Clustered Index and the PK index would have to be "recalculated" to point to the new CIDX.

    But that's what I'm doing...

    I concur. Get the new clustered index built first.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • You have to drop the PK anyway, so why not rebuild it as an NC, then when you build the new CIDX it will rebuild all the NC indexes wont it so they point to the new Leaf pages on the CIDX?

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply