Integer to Big Integer and foreign keys

  • We need to convert an integer column to bigint in the EVENT table.  Since the PK (event_id) appears in a few other tables and there are foreign keys on it, I'm assuming all tables will have to be handled in a single maintenance window?
    In other words, once the main table's primary key column is converted to bigint, the related columns will be mismatched until they are converted as well (in fact, we probably can't even alter the parent table without first dropping the foreign keys on the other tables).

    Table Name                              Row Count
    PERSISTENT_OBJECT_REFERENCE         1,372,631,714
    EVENT                               1,048,422,404
    EVENT_HISTORY                         381,677,153
    EVENT_CLIENT_UPLOAD_FILE_DESCRIPTOR   112,696,035
    USER_ACTIVITY_LOG                      59,303,699   (no FK)
    EVENT_DISPATCH                         16,697,500
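    A query along these lines (a sketch, not from the original post; it assumes everything lives in the dbo schema) will list every foreign key that references EVENT.EVENT_ID, so the dependent constraints can be scripted out before the change:

    -- List the foreign keys that reference dbo.EVENT.EVENT_ID and the tables/columns they live on.
    SELECT fk.name                          AS fk_name,
           OBJECT_NAME(fk.parent_object_id) AS referencing_table,
           pc.name                          AS referencing_column
    FROM   sys.foreign_keys fk
    JOIN   sys.foreign_key_columns fkc ON fkc.constraint_object_id = fk.object_id
    JOIN   sys.columns pc ON pc.object_id = fkc.parent_object_id
                         AND pc.column_id = fkc.parent_column_id
    JOIN   sys.columns rc ON rc.object_id = fkc.referenced_object_id
                         AND rc.column_id = fkc.referenced_column_id
    WHERE  fk.referenced_object_id = OBJECT_ID('dbo.EVENT')
      AND  rc.name = 'EVENT_ID';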
  • You will need to:

    Drop the foreign keys
    Drop the primary key
    Make the data type change
    Recreate the primary key
    Recreate the foreign keys

    With a billion rows, that's not going to be fast.
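    A minimal T-SQL sketch of that sequence, shown for one child table (PK_EVENT is a hypothetical constraint name; EVENT_EVENT_DISPATCH_FK1 is the FK named later in the thread):

    -- 1. Drop the foreign keys that reference EVENT (repeat per child table)
    ALTER TABLE dbo.EVENT_DISPATCH DROP CONSTRAINT EVENT_EVENT_DISPATCH_FK1;

    -- 2. Drop the primary key on EVENT
    ALTER TABLE dbo.EVENT DROP CONSTRAINT PK_EVENT;

    -- 3. Widen the columns (this is the slow, size-of-data operation)
    ALTER TABLE dbo.EVENT          ALTER COLUMN EVENT_ID bigint NOT NULL;
    ALTER TABLE dbo.EVENT_DISPATCH ALTER COLUMN EVENT_ID bigint NOT NULL;   -- repeat per child table

    -- 4. Recreate the primary key
    ALTER TABLE dbo.EVENT ADD CONSTRAINT PK_EVENT PRIMARY KEY CLUSTERED (EVENT_ID);

    -- 5. Recreate the foreign keys (repeat per child table)
    ALTER TABLE dbo.EVENT_DISPATCH
        ADD CONSTRAINT EVENT_EVENT_DISPATCH_FK1
        FOREIGN KEY (EVENT_ID) REFERENCES dbo.EVENT (EVENT_ID);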

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes, I had to do that on the first table in the list last January, but in that case the identity integer column being upsized didn't exist anywhere else.  For EVENT, the primary key/identity integer column (event_id) does exist in the other tables, with foreign keys based on that column.

    A test this morning on the related EVENT_DISPATCH table actually went surprisingly fast, right up until the point of putting the FK back on (I hadn't modified EVENT yet).  This is what leads me to think I'll have to do all of the tables in one maintenance window:
    Column 'dbo.EVENT.EVENT_ID' is not the same data type as referencing column 'EVENT_DISPATCH.EVENT_ID' in foreign key 'EVENT_EVENT_DISPATCH_FK1'.
    Msg 1750, Level 16, State 0, Line 46
    Could not create constraint.

    For the largest tables, like the one in January, I created a new empty copy of the table with the bigint type and then used the data export (SSIS) process to populate it from the base table.  It took 4+ hours but at least didn't fill up the log or tempdb.
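    That create-and-reload pattern looks roughly like the following (a sketch only; the actual load in my case went through the SSIS export wizard, and EVENT_NEW plus the batch size are illustrative):

    -- New empty copy of the table with the widened identity column.
    CREATE TABLE dbo.EVENT_NEW (
        EVENT_ID bigint IDENTITY(1,1) NOT NULL,
        -- ...remaining columns exactly as on dbo.EVENT...
        CONSTRAINT PK_EVENT_NEW PRIMARY KEY CLUSTERED (EVENT_ID)
    );

    -- Load in key ranges so no single transaction gets huge (helps keep log growth down).
    DECLARE @lo bigint = 0, @step bigint = 1000000, @max bigint;
    SELECT @max = MAX(EVENT_ID) FROM dbo.EVENT;

    WHILE @lo <= @max
    BEGIN
        SET IDENTITY_INSERT dbo.EVENT_NEW ON;

        INSERT INTO dbo.EVENT_NEW (EVENT_ID /*, remaining columns */)
        SELECT EVENT_ID /*, remaining columns */
        FROM   dbo.EVENT
        WHERE  EVENT_ID > @lo AND EVENT_ID <= @lo + @step;

        SET IDENTITY_INSERT dbo.EVENT_NEW OFF;
        SET @lo = @lo + @step;
    END;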

  • Indianrock - Tuesday, July 4, 2017 2:35 PM

    This is what leads me to think I'll have to do all of the tables in one maintenance window.

    Yes, as I said, you'll have to drop all the foreign keys, drop the primary key on the events table, make the data type changes, and only then can you recreate the primary key and foreign keys.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • If storage allows, I'd SELECT INTO a new table with the column data type updated, then drop the old table and rename the new table to the old name.
    In my experience, changing data types in place on large tables can take quite a while, during which time you can't do anything with the table.
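    A rough sketch of that approach (EVENT_BIGINT is an illustrative name; the CAST is what forces the new column to bigint, and it also drops the identity property, which has to be dealt with separately):

    -- Copy into a new table with the widened column; SELECT INTO is minimally
    -- logged under the simple or bulk-logged recovery model.
    SELECT CAST(EVENT_ID AS bigint) AS EVENT_ID
           /*, remaining columns */
    INTO   dbo.EVENT_BIGINT
    FROM   dbo.EVENT;

    -- After verifying the copy and dropping the FKs that point at the old table:
    EXEC sp_rename 'dbo.EVENT', 'EVENT_OLD';
    EXEC sp_rename 'dbo.EVENT_BIGINT', 'EVENT';
    -- Then recreate the primary key, indexes, and foreign keys on the renamed table.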

    Wes
    (A solid design is always preferable to a creative workaround)

  • That's more or less what I did in January on a 1.2 billion record table that had no other tables related on the identity column.  I actually used the SSMS data export, which runs in batches and caused virtually no log growth.

    It did take 4 hours.   This time the table isn't quite as large but has several related tables on that identity column, so we're looking at most of a weekend with customer downtime.  

    Right now I'm trying to sell the idea of re-seeding down to negative 2.1 billion.  Quick, no impact on other tables or foreign keys --- it's all a question of whether our code will choke on that.  It took us many years to max out the integer, so this should give us plenty of time to purge older data, re-design the database, or just figure out how to change to big integer with less downtime.
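    A sketch of that reseed (the negative 2.1 billion target is from the post; DBCC CHECKIDENT with RESEED is the mechanism, and the exact value below is illustrative):

    -- Restart the identity near the bottom of the int range; new rows then count
    -- upward from there, well away from the existing positive values.
    DBCC CHECKIDENT ('dbo.EVENT', RESEED, -2147483648);
    -- The next insert receives -2147483647, and so on.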

    Another approach is to take the 4-hour hit on the main table, but a week or two earlier add a bigint column to the child tables and begin loading that column from the original column.   Then on conversion day you have much less data to "true up."
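    For one child table, that staging step might look something like this (the column name and batch size are illustrative):

    -- Stage a bigint copy of the key ahead of time; on cutover day only the rows
    -- added since the last backfill pass still need to be trued up.
    ALTER TABLE dbo.EVENT_DISPATCH ADD EVENT_ID_BIG bigint NULL;

    -- Backfill in batches so no single transaction gets too large.
    WHILE 1 = 1
    BEGIN
        UPDATE TOP (500000) dbo.EVENT_DISPATCH
        SET    EVENT_ID_BIG = EVENT_ID
        WHERE  EVENT_ID_BIG IS NULL;

        IF @@ROWCOUNT = 0 BREAK;
    END;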

  • Depending on how stable your data is, the following approach may result in very little downtime compared to a full copy taken at a point in time.

    It does require that you have space in the database to hold two copies of each table involved, which can be reclaimed at a later stage.

    Setup

    For each big table involved, create another table with exactly the same structure and indexes, but with the required columns changed from int to bigint.
    If any of the tables are partitioned, create the new tables partitioned as well, with exactly the same partition function/scheme.
    Do not create the FKs on these tables.

    In advance of the implementation date, copy all the data from the main tables into the new tables.
    (From a load point of view, do it using SSIS / C# (bulk load) / bcp, following standard ETL practice: drop indexes before the load, recreate them afterwards.)

    On implementation day, do as follows (a T-SQL sketch of steps 3-8 follows the list):
    1 - prevent any access to the tables from users
    2 - determine any inserts/deletes/updates required to bring the new tables up to date and apply them
    3 - drop the FKs on the main tables
    4 - if any table has an identity column, record the next identity value so you can reseed it at a later stage
    5 - truncate the main tables
    6 - issue the ALTER commands on the main tables to convert the columns from int to bigint
    7 - issue a partition switch from the new tables into the old tables
    8 - recreate the FKs
    9 - reseed the table identities if applicable
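    A hedged sketch of steps 3-8 for one table (EVENT_STAGE and PK_EVENT are illustrative names; the switch requires the two tables to have identical columns, indexes, and filegroup placement):

    -- 3: drop the FKs that reference the main table (repeat per referencing table)
    ALTER TABLE dbo.EVENT_DISPATCH DROP CONSTRAINT EVENT_EVENT_DISPATCH_FK1;

    -- 4: note the current identity value so it can be reseeded later
    DBCC CHECKIDENT ('dbo.EVENT', NORESEED);

    -- 5/6: empty the main table, then widen the column. ALTER COLUMN cannot run
    -- while the column is part of the PK, but on an empty table dropping and
    -- recreating the PK is effectively instant.
    TRUNCATE TABLE dbo.EVENT;
    ALTER TABLE dbo.EVENT DROP CONSTRAINT PK_EVENT;
    ALTER TABLE dbo.EVENT ALTER COLUMN EVENT_ID bigint NOT NULL;
    ALTER TABLE dbo.EVENT ADD CONSTRAINT PK_EVENT PRIMARY KEY CLUSTERED (EVENT_ID);

    -- 7: metadata-only switch of the preloaded copy into the (now empty) main table
    ALTER TABLE dbo.EVENT_STAGE SWITCH TO dbo.EVENT;

    -- 8: recreate the FKs; WITH CHECK validates the existing rows, which is the slow part
    ALTER TABLE dbo.EVENT_DISPATCH WITH CHECK
        ADD CONSTRAINT EVENT_EVENT_DISPATCH_FK1
        FOREIGN KEY (EVENT_ID) REFERENCES dbo.EVENT (EVENT_ID);

    -- 9: reseed with the value recorded in step 4, e.g. DBCC CHECKIDENT ('dbo.EVENT', RESEED, <recorded value>);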

    Possible issues
    A - if the volume of updates is very high, there may be significant downtime to apply them
    B - if space is an issue and you need to shrink the data files, there will be a bit of downtime to shrink, plus the need to rebuild indexes to remove the fragmentation

    Benefits

    Apart from step 2 and point B above, the impact on the t-log is minimal.
    The switch command is metadata-only, so it takes no time at all.

    What can take some time is the recreation of the FKs - I assume here that appropriate indexes are present to support them.

    If you aren't familiar with partition switching, you can have a look at https://www.sqlservercentral.com/Forums/FindPost1880513.aspx
    If any of your tables are partitioned, the switch command is slightly different, but just as easy as the example.

    For shrinking the database you can have a look at the script I have at https://www.sqlservercentral.com/Forums/1866405/dbcc-shrinkfile#bm1866482 - do not use this one as is unless you ensure that "fileproperty(name, 'SpaceUsed')" returns the correct space - I have to post a better script to replace this bit.
    Using this approach does reduce the time required to shrink a database data file significantly.

    Edit: New version of the shrink script posted so the one there can be used.
