Maximum Identity Value Reached

  • All,

    We have a large table in our data warehouse that has reached the maximum identity value for an integer data type (2,147,483,647) and now no additional rows can be inserted into the table. As a quick fix I reseeded the column to start at -2,147,483,647 (incrementing by one). This got our data loads working again. However, I still need to alter the column to a bigint.
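For reference, a reseed like the one described would be done along these lines (the table name comes from later in the thread; the schema is an assumption):

```sql
-- Restart the identity below zero so inserts can resume;
-- the next inserted row gets -2,147,483,647 + 1 increment.
DBCC CHECKIDENT ('dbo.FactInventory', RESEED, -2147483647);
```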

    This is a fairly wide table housing inventory data. The column in question, FactInventoryId, is the primary key column of the table. There are no foreign keys pointing to this column. The only index on this column is the one that enforces the PK and it is a non-clustered index (the clustered index is on another column). There is also a view with schemabinding against this table.

In order to alter the column to the BIGINT data type, I will have to drop the view and drop the primary key constraint, then alter the column and reapply the PK (and recreate the view). My major concern is that reapplying the PK (and its enforcing index) will cause a huge amount of transaction log growth, since the index create must be done as a single transaction.

Is there any way to get a rough estimate of how much T-log space will be required? Is there a better way to accomplish this task?

    Also, I'm assuming that the table will be locked during this operation and all query access will be blocked. Is this correct?

    Any thoughts would be most appreciated!

    Thanks,

    Peter

  • Sorry, don't have much time but here's a quick overview of one method to consider:

1) Create another table with the new BIGINT data type to copy the existing data to. (By the way, if you're on Enterprise Edition and you're not using data compression, you should use at least row compression for the new table.) Put the necessary indexes in place up front, so that time is already included in the initial load.

2) Capture the current identity value, which you'll need later to make sure all recently inserted rows get copied to the new table.

3) Copy the data in batches from the old table to the new one using the clustering key, committing each batch immediately. A typical batch size is ~20K to 100K rows, depending on disk speed and logging capacity.

4) After all rows are copied, verify that no identity values are missing from the new table.

    Then you'll need some minor down time to be sure:

5) Drop FK constraints on the current table.

6) Rename the current table so no one can access it.

7) Do one final verification that all identity values are in the new table.

8) Rename the new table to the original table name.

9) Recreate the FK constraints, and be sure to explicitly WITH CHECK them so that they are trusted by SQL Server.
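The batched copy in step 3 might be sketched roughly like this. All names and the batch size are assumptions for illustration, and for simplicity it walks the identity column rather than the (unspecified) clustering key the step actually recommends:

```sql
-- Sketch only: dbo.FactInventory is the source,
-- dbo.FactInventory_New the parallel BIGINT table.
DECLARE @BatchSize int    = 50000;
DECLARE @LastId    bigint = -2147483648;  -- start below the reseeded minimum
DECLARE @Rows      int    = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO dbo.FactInventory_New
    SELECT TOP (@BatchSize) *
    FROM dbo.FactInventory
    WHERE FactInventoryId > @LastId
    ORDER BY FactInventoryId;

    SET @Rows = @@ROWCOUNT;  -- each loop iteration commits on its own

    -- advance the watermark to the highest id copied so far
    SELECT @LastId = MAX(FactInventoryId) FROM dbo.FactInventory_New;
END;
```

Keying each batch on an indexed, ever-increasing column keeps every pass a cheap range seek instead of rescanning rows already copied.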

SQL DBA, SQL Server MVP (07, 08, 09). A socialist is someone who will give you the shirt off *someone else's* back.

  • I would follow along the same line of thinking that Scott has laid out. One minor thing I would add to step 3 for consideration:

3) Copy the data in batches from the old table to the new one using the clustering key, committing each batch immediately. A typical batch size is ~20K to 100K rows, depending on disk speed and logging capacity. Monitor T-log %-full and take periodic T-log backups during the load to the parallel table to avoid unwanted T-log growth.
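Checking log utilization between batches might look something like this (the database name and backup path are placeholders):

```sql
-- Log size and percent used for every database
DBCC SQLPERF (LOGSPACE);

-- Or, for the current database (SQL Server 2012+)
SELECT used_log_space_in_percent
FROM sys.dm_db_log_space_usage;

-- Take a log backup between batches when utilization climbs
BACKUP LOG YourDW TO DISK = N'X:\Backups\YourDW_log.trn';
```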

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Thanks guys for your advice. The table is very large - about 250GB. I'd rather not duplicate the data if I don't have to. Would it be advisable to simply alter the table "in place":

1) Drop the primary key (after dropping the schemabound view)

ALTER TABLE FactInventory

DROP CONSTRAINT PK_FactInventoryId

2) Alter the column to BIGINT

ALTER TABLE FactInventory

ALTER COLUMN FactInventoryId BIGINT NOT NULL

3) Reapply the PK (and recreate the view)

ALTER TABLE FactInventory

ADD CONSTRAINT PK_FactInventoryId

PRIMARY KEY NONCLUSTERED (FactInventoryId)

    Thoughts?

    Peter

  • That could also work but will require more downtime. You are somewhat fortunate in this case that the PK is nonclustered.

Some other notes: check for and drop any foreign key references before you try step 1, then re-add them after step 3. Also consider whether you need to, and how you will, prevent access to the table during the changes.
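One way to keep everyone out while the change runs is to do all three steps in a single session that takes and holds an exclusive table lock (a sketch only, using the names from the thread):

```sql
BEGIN TRANSACTION;

-- Take an exclusive table lock and hold it until COMMIT,
-- so no other session can read or write the table mid-change
SELECT TOP (1) 1
FROM dbo.FactInventory WITH (TABLOCKX, HOLDLOCK);

-- ... DROP CONSTRAINT / ALTER COLUMN / ADD CONSTRAINT here ...

COMMIT;
```

The DDL statements take schema-modification locks on their own; the wrapper just closes the gaps between the steps.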


By the way, how long did it take you to exhaust the max value of INT? Just wondering what the horizon is for exhausting the negative ID values, i.e. is this a pressing need?


One final note about FKs: you will need to alter those child columns as well, i.e. a column cannot reference a parent unless the data types match.


  • Orlando - Thanks for the reply. There are no foreign keys pointing to this column, so nothing to drop & recreate and as a result no other columns need altering. It took since May 2015 to accumulate this volume of data. The rate of growth was way, way more than we anticipated, otherwise, I would have made it a bigint from the start.

    We are currently in talks with the client to determine if we can delete/archive a large portion of this data. If they agree, then this entire process will be a whole lot easier.

    Thanks!

Thanks for posting back. If you think of it, I would love to hear how things turned out down the line.


  • Orlando Colamatteo (2/14/2016)


    That could also work but will require more downtime. You are somewhat fortunate in this case that the PK is nonclustered.

Perhaps not so fortunate. If the FILL FACTOR on the clustered index is 100%, changing a fixed-length INT to a fixed-length BIGINT is probably going to fragment the dickens out of the leaf level of the clustered index. That could waste a huge amount of space, and fixing it may require a rebuild of the clustered index, which also wastes a huge amount of space, because the old CI isn't dropped until the new one is in place, even if you use the BULK_LOGGED or SIMPLE recovery model just to keep the log file from blowing up.

    A REORGANIZE will save the space in the MDF but will cause the LDF to blow sky high. MS did us no favors for index maintenance. They could have learned a whole lot from Peter Norton.
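One way to see how badly the leaf level ended up after such a change is the physical stats DMF (object name per the thread; run in the affected database):

```sql
-- Fragmentation and page density per index level;
-- 'SAMPLED' keeps the scan cheap on a 250GB table
SELECT ips.index_id,
       ips.index_level,
       ips.avg_fragmentation_in_percent,
       ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(),
         OBJECT_ID(N'dbo.FactInventory'),
         NULL, NULL, 'SAMPLED') AS ips;
```

A low avg_page_space_used_in_percent on the leaf level is the wasted space being described here.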

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • AZ Pete (2/14/2016)


    The table is very large - about 250GB.

    AZ Pete (2/14/2016)


    We are currently in talks with the client to determine if we can delete/archive a large portion of this data. If they agree, then this entire process will be a whole lot easier.

This might be the perfect opportunity to set up a little temporally based partitioning, whether it be a Partitioned View (my favorite) or a Partitioned Table (not my favorite for many tactical reasons, and it requires Enterprise Edition).
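A minimal read-only sketch of the partitioned-view idea, assuming yearly member tables and a date column (all names and columns here are hypothetical):

```sql
-- CHECK constraints on the partitioning column let the optimizer
-- skip member tables that can't match a query's date predicate
CREATE TABLE dbo.FactInventory_2015
(
    FactInventoryId bigint NOT NULL PRIMARY KEY NONCLUSTERED,
    InventoryDate   date   NOT NULL
        CHECK (InventoryDate >= '20150101' AND InventoryDate < '20160101')
    -- ... remaining fact columns ...
);

CREATE TABLE dbo.FactInventory_2016
(
    FactInventoryId bigint NOT NULL PRIMARY KEY NONCLUSTERED,
    InventoryDate   date   NOT NULL
        CHECK (InventoryDate >= '20160101' AND InventoryDate < '20170101')
    -- ... remaining fact columns ...
);
GO

CREATE VIEW dbo.vFactInventory
AS
SELECT FactInventoryId, InventoryDate FROM dbo.FactInventory_2015
UNION ALL
SELECT FactInventoryId, InventoryDate FROM dbo.FactInventory_2016;
```

Archiving a year then becomes dropping one member table and recreating the view, rather than deleting rows from one huge table. (For the view to be updatable, the partitioning column would also have to be part of each member's primary key.)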

--Jeff Moden

  • Jeff Moden (2/14/2016)


    Orlando Colamatteo (2/14/2016)


    That could also work but will require more downtime. You are somewhat fortunate in this case that the PK is nonclustered.

Perhaps not so fortunate. If the FILL FACTOR on the clustered index is 100%, changing a fixed-length INT to a fixed-length BIGINT is probably going to fragment the dickens out of the leaf level of the clustered index. That could waste a huge amount of space, and fixing it may require a rebuild of the clustered index, which also wastes a huge amount of space, because the old CI isn't dropped until the new one is in place, even if you use the BULK_LOGGED or SIMPLE recovery model just to keep the log file from blowing up.

    A REORGANIZE will save the space in the MDF but will cause the LDF to blow sky high. MS did us no favors for index maintenance. They could have learned a whole lot from Peter Norton.

Are you sure that is how the CI would behave in this scenario, i.e. that every page in the entire table would immediately need to be rewritten with wider space on every record for the BIGINT?

I will need to test. I seem to remember a scenario where it did not behave that way, but my observations were from widening a CHAR column, and I did not have time to dig in and confirm that what I saw was what I thought I saw.

    At any rate, my comment was related to not having to drop the clustered index so the column could be altered, thereby reverting the table to a heap, only to have to turn around and re-add the clustered index.


  • Nope. I'm not 100% sure (and I think the mechanism has changed over the years). I'd do just like you said. I'd have to rig up a test on a smaller table.

    I do know from experience that you can add a column to a table with no immediate effect but, once you start adding data to it, the CI goes on a splitting rampage. It was a real shocker for the folks that declared the change control as "complete" and started letting users back in.

--Jeff Moden

  • I do know from experience that you can add a column to a table with no immediate effect but,

True, provided the column is nullable; my understanding is also that it's a simple metadata change and is almost instant regardless of the data type.

    once you start adding data to it, the CI goes on a splitting rampage. It was a real shocker for the folks that declared the change control as "complete" and started letting users back in.

    Sounds like a bit of a rude awakening there!

