SQLServerCentral Article

Change Integer to BigInt on a Very Large Table


Recently we faced an issue in our UAT environment. In our application, one table has more than 50 million records. It has an identity column with an int datatype. The table has ten default constraints and one clustered index. This table is heavily used by our application, and frequent DML operations are performed against it. Every night, when a user performs a calculation operation, a bulk set of rows gets deleted. As we know, the identity does not reset when we perform a delete operation, so during the daytime, as new data was added, the identity value kept increasing. Finally, it reached the maximum allowable value for the int data type, and insert operations started failing.
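
To see how much headroom is left on an identity column before it hits the int limit of 2,147,483,647, a quick check against sys.identity_columns is enough. This is a minimal sketch using the demo table name from the scripts below; adapt it to your own schema.

SELECT
    OBJECT_NAME(ic.object_id)                  AS TableName,
    ic.name                                    AS ColumnName,
    CAST(ic.last_value AS BIGINT)              AS LastIdentityValue,
    2147483647 - CAST(ic.last_value AS BIGINT) AS ValuesRemaining
FROM sys.identity_columns AS ic
WHERE ic.object_id = OBJECT_ID('dbo.DemoTable_Source')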

Considering the amount of data in the table and the allowed downtime, we found two approaches suitable for our environment to change the column from an int to a bigint.

  1. Perform an insert operation from a source table to a destination table in multiple batches, issue frequent checkpoints so that the log does not fill, and complete a manual switch of the tables.
  2. Use an SSIS package to perform a bulk insert and complete a manual switch of the tables.

Approach One

Step 1: Change the database recovery model from full to simple so that the log file does not fill up.
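
For example (the database name is taken from the demo scripts below; remember to switch the recovery model back to full and take a full backup once the migration is complete):

USE [master]
GO
ALTER DATABASE [DemoDatabase] SET RECOVERY SIMPLE
GO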

Step 2: Script out the definition of the existing table. (Below is the table definition of the source table.)

USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Source](
[ID] [int] IDENTITY(1,1) NOT NULL,
[AuditEventID] [int] NOT NULL,
[TimeRecorded] [datetime] NOT NULL,
[UserName] [varchar](50) NOT NULL

) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD  CONSTRAINT [DF_DemoTable_Source_AuditEventID]  DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD  CONSTRAINT [DF_DemoTable_Source_TimeRecorded]  DEFAULT (getdate()) FOR [TimeRecorded]
GO                  

Step 3: Make changes in the extracted script: rename the indexes and constraints, and update the datatype of the identity column from int to bigint.

USE [DemoDatabase]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DemoTable_Destination](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[AuditEventID] [int] NOT NULL,
[TimeRecorded] [datetime] NOT NULL,
[UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD  CONSTRAINT [DF_DemoTable_Destination_AuditEventID]  DEFAULT ((0)) FOR [AuditEventID]
GO
ALTER TABLE [dbo].[DemoTable_Destination] ADD  CONSTRAINT [DF_DemoTable_Destination_TimeRecorded]  DEFAULT (getdate()) FOR [TimeRecorded]
GO
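
The real table also carries a clustered index (and more default constraints than shown in this trimmed-down definition), and those need to be recreated on the destination table as well. The sketch below assumes the clustered index is on the ID column and uses a hypothetical name, since the original index definition is not shown here.

CREATE UNIQUE CLUSTERED INDEX [CIX_DemoTable_Destination_ID]   -- hypothetical name and key column
ON [dbo].[DemoTable_Destination] ([ID] ASC)
GO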

Step 4: Insert rows from the source table into the destination table in batches and run frequent checkpoints.

SET IDENTITY_INSERT dbo.DemoTable_Destination ON

DECLARE @x BIGINT
DECLARE @MaxID BIGINT
DECLARE @BatchSize INT
DECLARE @Message VARCHAR(500)

SET @x = 0
SET @BatchSize = 100000
SELECT @MaxID = MAX(ID) FROM dbo.DemoTable_Source

WHILE @x < @MaxID
BEGIN
    -- Copy one ID range per iteration; a range predicate does not skip or
    -- duplicate rows when the identity values have gaps from earlier deletes
    INSERT INTO dbo.DemoTable_Destination (ID, AuditEventID, TimeRecorded, UserName)
    SELECT ID, AuditEventID, TimeRecorded, UserName
    FROM dbo.DemoTable_Source WITH (NOLOCK)
    WHERE ID > @x AND ID <= @x + @BatchSize

    SET @x = @x + @BatchSize
    SET @Message = CAST(@x AS VARCHAR(20))
    RAISERROR (@Message, 10, 1) WITH NOWAIT

    -- With the simple recovery model, a checkpoint allows log space to be reused
    CHECKPOINT
END

SET IDENTITY_INSERT dbo.DemoTable_Destination OFF
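
Before switching the tables, it is worth confirming that every row made it across. A simple check, assuming the names used above: the row counts and the highest ID should match between the two tables.

SELECT COUNT_BIG(*) AS SourceRows, MAX(ID) AS SourceMaxID FROM dbo.DemoTable_Source
SELECT COUNT_BIG(*) AS DestinationRows, MAX(ID) AS DestinationMaxID FROM dbo.DemoTable_Destination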

Step 5: Rename the tables and constraints, then drop the old table and its constraints.

/* Rename tables */
EXEC sys.sp_rename N'DemoTable_Source', N'DemoTable_Source_Old'
EXEC sys.sp_rename N'DemoTable_Destination', N'DemoTable_Source'

/* Rename constraints */
EXEC sys.sp_rename N'DF_DemoTable_Source_AuditEventID', N'DF_DemoTable_Source_AuditEventID_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_AuditEventID', N'DF_DemoTable_Source_AuditEventID'
EXEC sys.sp_rename N'DF_DemoTable_Source_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded'

/* Drop the old table and its constraints */
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_AuditEventID_Old
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_TimeRecorded_Old
DROP TABLE DemoTable_Source_Old

Approach Two

Step 1: Script out the definition of the existing table. (Below is the table definition of the source table.)

USE [DemoDatabase] 
GO 
SET ANSI_NULLS ON 
GO 
SET QUOTED_IDENTIFIER ON 
GO 
CREATE TABLE [dbo].[DemoTable_Source]
( 
[ID] [int] IDENTITY(1,1) NOT NULL, 
[AuditEventID] [int] NOT NULL, 
[TimeRecorded] [datetime] NOT NULL, 
[UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_AuditEventID] DEFAULT ((0)) FOR [AuditEventID] 
GO 
ALTER TABLE [dbo].[DemoTable_Source] ADD CONSTRAINT [DF_DemoTable_Source_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded] 
GO

Step 2: Make changes in the extracted script: rename the indexes and constraints, and update the datatype of the identity column from int to bigint.

USE [DemoDatabase] 
GO 
SET ANSI_NULLS ON
GO 
SET QUOTED_IDENTIFIER ON 
GO 
CREATE TABLE [dbo].[DemoTable_Destination](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[AuditEventID] [int] NOT NULL,
[TimeRecorded] [datetime] NOT NULL,
[UserName] [varchar](50) NOT NULL
) ON [PRIMARY]
GO 
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_AuditEventID] DEFAULT ((0)) FOR [AuditEventID] 
GO 
ALTER TABLE [dbo].[DemoTable_Destination] ADD CONSTRAINT [DF_DemoTable_Destination_TimeRecorded] DEFAULT (getdate()) FOR [TimeRecorded] 
GO

Step 3: Create an SSIS package to perform the bulk insert. Below is a screenshot of the data flow task of the SSIS package.

Below is a description of the data flow and the tasks it performs.

  1. The source is an ADO.NET source that extracts data from the "DemoTable_Source" table.
  2. An int is a four-byte signed integer and a bigint is an eight-byte signed integer, so we have to add a Data Conversion step to convert the ID column (DT_I4 to DT_I8 in SSIS terms).
  3. The converted data from DemoTable_Source is transferred to the DemoTable_Destination table.

Step 4: Rename the tables and constraints, then drop the old table and its constraints.

/* Rename tables */
EXEC sys.sp_rename N'DemoTable_Source', N'DemoTable_Source_Old'
EXEC sys.sp_rename N'DemoTable_Destination', N'DemoTable_Source'

/* Rename constraints */
EXEC sys.sp_rename N'DF_DemoTable_Source_AuditEventID', N'DF_DemoTable_Source_AuditEventID_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_AuditEventID', N'DF_DemoTable_Source_AuditEventID'
EXEC sys.sp_rename N'DF_DemoTable_Source_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded_Old'
EXEC sys.sp_rename N'DF_DemoTable_Destination_TimeRecorded', N'DF_DemoTable_Source_TimeRecorded'

/* Drop the old table and its constraints */
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_AuditEventID_Old
ALTER TABLE DemoTable_Source_Old DROP CONSTRAINT DF_DemoTable_Source_TimeRecorded_Old
DROP TABLE DemoTable_Source_Old
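
After the switch, the new table has to continue numbering from where the old one stopped. When explicit values are inserted into an identity column (as both approaches above do), SQL Server normally updates the current identity value, but it is cheap to verify and reseed if necessary. A minimal check on the renamed table:

DBCC CHECKIDENT ('dbo.DemoTable_Source', NORESEED)   -- reports the current identity value
-- If it is lower than MAX(ID), reseed explicitly:
-- DBCC CHECKIDENT ('dbo.DemoTable_Source', RESEED, <current max ID>)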

Conclusion

With the first approach, because we had dropped the indexes before the bulk insert, the data was not inserted in the correct order. Ensuring the rows went in properly would have required a sort operation, which was not advisable with 50 million records. We therefore decided to use the second approach, and it worked well: even though we did not drop the indexes, the entire operation completed in 5 hours.
