Changing Data Type in Azure DB on a table with many rows.

  • Hello,

    I'm looking for some advice on the best way to change the data type of a column from INT to BIGINT on a table with 220 million rows, as the column has reached the upper limit of the INT data type.

    The database is an Azure SQL DB, and the table has a clustered index with a primary key constraint, a foreign key constraint, and a non-clustered index. The table the foreign key comes from also has three non-clustered indexes that include the key.

    What is the fastest/most efficient way to achieve this? Ideally in such a way as to minimize the length of time the table will be inaccessible.

    My plan is to test two approaches: dropping the constraints and indexes, altering the column, and recreating them; and copying the data into a new table and creating the appropriate constraints and indexes there. Before I do, I wanted to see whether anyone could offer some advice. I've seen a few ideas online, such as moving the data in batches, but I'm not sure what will work best.
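    For reference, this is roughly the shape of the in-place test I have in mind. All of the object names below are placeholders for my real ones, and I'm aware the ALTER COLUMN step rewrites every row, so on 220 million rows I expect it to lock the table for quite a while:

        -- Placeholder names: dbo.BigTable(Id) is the table that has hit the INT limit,
        -- dbo.ChildTable(BigTableId) is the table the foreign key comes from.

        -- 1. Drop the referencing foreign key and the non-clustered index up front.
        ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_BigTable;
        DROP INDEX IX_BigTable_SomeCol ON dbo.BigTable;

        -- 2. Drop the primary key constraint (and with it the clustered index).
        ALTER TABLE dbo.BigTable DROP CONSTRAINT PK_BigTable;

        -- 3. Widen the column; this rewrites the data, so it is the slow part.
        ALTER TABLE dbo.BigTable ALTER COLUMN Id BIGINT NOT NULL;

        -- The referencing column has to be widened too, which means the three
        -- non-clustered indexes on dbo.ChildTable that include it have to be
        -- dropped before this statement and recreated afterwards (not shown here).
        ALTER TABLE dbo.ChildTable ALTER COLUMN BigTableId BIGINT NOT NULL;

        -- 4. Recreate everything.
        ALTER TABLE dbo.BigTable ADD CONSTRAINT PK_BigTable PRIMARY KEY CLUSTERED (Id);
        CREATE NONCLUSTERED INDEX IX_BigTable_SomeCol ON dbo.BigTable (SomeCol);
        ALTER TABLE dbo.ChildTable ADD CONSTRAINT FK_ChildTable_BigTable
            FOREIGN KEY (BigTableId) REFERENCES dbo.BigTable (Id);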

    Thanks in advance.

  • Replicant21 (12/20/2016)


    I'm looking for some advice on the best way to change the data type of a column from INT to BIGINT on a table with 220 million rows ...

    First, I have no knowledge of Azure specifically, but since the data isn't moving into or out of that environment, the only things you really have to worry about are disk space and the speed of whatever machine you're assigned to do the work on. It shouldn't be all that different from what you'd do if the environment were local to you.

    I'd probably create a whole new table, sans the indexes, copy everything into it from the existing table, and do some experimenting to see how long a given number of records takes. That would at least give you some idea of the overall time cost. Then you rename the existing table to something else, rename the new table to the original table's name, and rebuild all the indexes and such. The hard part is knowing what kind of time cost you'll end up with.

    If you have any weekend windows where large chunks of the work could run with minimal or zero impact on production, maybe you can shoot for multiple windows. But if you have to break up the work, you'll need a tracking table to record which rows are done, and a trigger to reset that value for any already-copied row that gets updated later.
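    To make that a bit more concrete, the copy-and-swap part might look something like the sketch below. Every object name in it is made up (dbo.BigTable, dbo.ChildTable, and so on), so substitute your own, and you'd want to experiment with the batch size against whatever service tier you're on.

        -- 1. New table with the BIGINT key and only the clustered primary key;
        --    hold off on the non-clustered index until the copy is finished.
        CREATE TABLE dbo.BigTable_New
        (
            Id      BIGINT      NOT NULL CONSTRAINT PK_BigTable_New PRIMARY KEY CLUSTERED,
            SomeCol VARCHAR(50) NOT NULL
            -- ...the rest of the columns, same as the original table
        );

        -- 2. Copy in batches keyed off the clustered key, so each transaction stays small
        --    and the copy can be stopped and resumed between maintenance windows.
        DECLARE @BatchSize INT    = 100000;
        DECLARE @LastId    BIGINT = 0;
        DECLARE @Rows      INT    = 1;

        WHILE @Rows > 0
        BEGIN
            INSERT INTO dbo.BigTable_New (Id, SomeCol)
            SELECT TOP (@BatchSize) Id, SomeCol
            FROM dbo.BigTable
            WHERE Id > @LastId
            ORDER BY Id;

            SET @Rows = @@ROWCOUNT;

            IF @Rows > 0
                SELECT @LastId = MAX(Id) FROM dbo.BigTable_New;
        END

        -- 3. In the cutover window: stop writes, copy any stragglers or changed rows
        --    (this is where the tracking table and trigger come in), then swap the names.
        ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_BigTable; -- still points at the old table
        EXEC sp_rename 'dbo.BigTable',     'BigTable_Old';
        EXEC sp_rename 'dbo.BigTable_New', 'BigTable';

        -- 4. Rebuild the non-clustered index and re-point the foreign key. The child
        --    column has to be widened to BIGINT as well, and if the key is an IDENTITY
        --    column you'd also need SET IDENTITY_INSERT during the copy plus a
        --    DBCC CHECKIDENT reseed here.
        CREATE NONCLUSTERED INDEX IX_BigTable_SomeCol ON dbo.BigTable (SomeCol);
        ALTER TABLE dbo.ChildTable ADD CONSTRAINT FK_ChildTable_BigTable
            FOREIGN KEY (BigTableId) REFERENCES dbo.BigTable (Id);

    The batch loop is also the piece you'd pair with the tracking table if the copy ends up spread across several weekends.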

    That's my two cents...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thanks for the suggestion. I'm going to spend a bit more time on this today and try to get a better idea of how long it will take.
