Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Changing a Non-IDENTITY column to IDENTITY and vice versa

By Thomas Pieries, (first published: 2008/02/21)

Changing from Non-IDENTITY to IDENTITY and vice versa

In my career I have come across many situations where it was required to include/remove the Identity Property to/from a column at a latter stage of the development cycle due to some flaw.

If a sequence number is manually maintained, it would require the application to have a separate logic. The change-over will require extra effort and additional over heads. The obvious workaround for this would be to introduce an Identity Column to the table which will not require any additional effort or cost. The main disadvantage in it is that it cannot be updated.

Before SQL Server 2005 was introduced there were only two methods to add an Identity Column to a table through TSQL. There is no DDL statement that assists with altering a column to an Identity and vice versa. The two methods are as follows.

I will be using the following table structure in the explanation:

CREATE TABLE SOURCE_TB
(
  
ID INT NOT NULL 
  ,
DATE DATETIME 
  
,COST MONEY )

100,000 records were inserted to the table SOURCE_TB when testing was carried out

Method 1 - Adding a new Identity column to the existing table.

The thinking behind this is to introduce a new column (SOURCE_TB.ID_II) to the SOURCE_TB table with the Identity and remove the column SOURCE_TB.ID.

Following are the steps that could be followed:

  • Add a new identity field SOURCE_TB.ID_II

  • Drop all related objects ex. Indexes, Foreign Keys, Primary Keys that have been created on SOURCE_TB.ID

  • Drop the field SOURCE_TB.ID

  • Change the name of SOURCE_TB.ID_II to SOURCE_TB.ID

  • Recreate all Foreign Keys, Primary Keys and other constraints that were dropped in step 2

The Code

ALTER TABLE SOURCE_TB ADD ID_II INT IDENTITY(1,1)
ALTER TABLE SOURCE_TB DROP COLUMN ID
EXEC sp_rename 'SOURCE_TB.ID_II' ,'ID','COLUMN'

The following counters were gathered from the profiler:

  Duration CPU Reads Writes
Adding the column Source.ID_II 7,957,414 7,422 38,657 451
Dropping the Column Source.ID 20,089 0 28 2
Rename the Column Source.ID_II to Source.ID 296,194 0 194 0

Disadvantages

  • Depending on the size of the row there could be a possibility of a page split

  • The columns will not be in the original order when SOURCE_TB is created.
    Originally the order of the columns would be ID, Date and Cost and after including the Identity column the order will change to Date, Cost and ID. Assume that the application code didn’t explicitly specify the column name when retrieving data from the Data Base through a record set. As the fields are not retrieved in the same order as before the application may generate an error.

  • If there are any gaps in SOURCE_TB.ID field, this could create a referential integrity problem if it is a referenced key
    The work around for this would be a cumbersome task:

    • Before the SOURCE_TB.ID is dropped you need to identify existing gaps

    • After the SOURCE_TB.ID_II is renamed to SOURCE_TB.ID an update needs to be done on all the referencing tables to avoid any integrity issues.

  • If all the fields do not reside in a single page the operation would fail.

  • If the field SOURCE_TB.ID was a clustered Index and you plan to continue with it, it will take a long time to create the Index

  • There will be a down time in the Production Environment when implementing this.

Advantages

  • Only a column change will take place

  • System dependencies will still be in tack as no changes have taken place on the SOURCE_TB

  • It will not require a large amount of additional hard disk space to implement this compared to the next option.

Method 2 - Create the destination table with the same schema as the source table

The thinking behind this is to transfer all the data from SOURCE_TB to DESTINATION_TB and rename the DESTINATION_TB to SOURCE_TB.

Following are the steps that could be followed:

  • Create DESTINATION_TB table with the same schema as SOURCE_TB

  • Transfer the data from SOURCE_TB to DESTINATION_TB

  • Drop all the Foreign Keys referencing to SOURCE_TB if any

  • Alter all the objects with schemabinding if any

  • Drop the table SOURCE_TB

  • Rename the table DESTINATION_TB to SOURCE_TB

  • Recreate all the Foreign Keys to refer back to SOURCE_TB

  • Recreate all the Indexes on the SOURCE_TB

  • Recreate functions and views with schemabinding

  • Create all other constraints

The same steps could be followed if the change is done by the Enterprise Manager Tool.

The Code

CREATE TABLE DESTINATION_TB
(   ID INT IDENTITY(125006,1) NOT NULL
  ,
DATE DATETIME
  
,COST MONEY
)
INSERT INTO DESTINATION_TB (IDDATECOST) SELECT ID ,DATECOST FROM SOURCE_TB DROP TABLE SOURCE_TB EXEC sp_rename 'DESTINATION_TB' ,'SOURCE_TB'

Following counters were gathered from the profiler

  Duration CPU Reads Writes
Creation of the Destination table 855 0 64 17
Insert to Destination Table from Source 3,856,785 719 115,957 396
Drop the Source table 6809 0 279 17
Renaming the Destination table to Source 362,363 0 171 0

Disadvantages

  • If the SOURCE_TB is a large table you have to be concern of the disk space availability.
    As we are going to copy all the data from one table to another, you need to be aware of the temporary disk space requirement. So you will need to know the exact space required for the table.

  • As the data is physically moved to the Destination_TB it may require just more than a simple “Insert”
    You may need a batch wise “Insert” to the Destination_TB. This will avoid any locks from taking place and daily operation can proceed without any effect.

  • There will be a down time in the Production Environment

Advantages

  • All the columns will be of the same sequence as the original Source_TB

In SQL Server 2005 you can do this in a much efficient manner.

It is expected, that you have the basic understanding of partitioning. Have a look at the following links:

http://www.databasejournal.com/features/mssql/article.php/3640891

http://www.sqlskills.com/resources/Whitepapers/Partitioning%20in%20SQL%20Server%202005%20Beta%20II.htm

The most important and interesting aspect of SQL Server 2005, is that a table belongs to a partition by default even though the table is not physically partitioned.

You can verify this by going through the sys.partitions table

SELECT FROM sys.partitions
WHERE [object_id] OBJECT_ID('SOURCE_TB')

Following are the steps:

  • Drop all the Foreign Keys referencing to SOURCE_TB if any

  • Alter all the objects with schemabinding if any

  • Drop all the Indexes from SOURCE_TB

  • Create the new table DESTINATION_TB with same schema as the SOURCE_TB

  • SWITCH the data between the tables

  • Drop the table Source_TB

  • Rename the table Destination_TB to Sourse_TB

  • Recreate all the Foreign Keys to refer to SOURCE_TB

  • Recreate all the Indexes on SOURCE_TB

  • Recreate functions, views with SCHEMABINDING

  • Create all other constraints

The code

CREATE TABLE DESTINATION_TB
(
  
ID INT IDENTITY(125006,1) NOT NULL
  ,
DATE DATETIME
  
,COST MONEY
) ALTER TABLE SOURCE_TB SWITCH TO DESTINATION_TB DROP TABLE SOURCE_TB EXEC sp_rename 'DESTINATION_TB' ,'SOURCE_TB'

Following counters were gathered from the profiler:

 
  Duration CPU Reads Writes
Creation of the Destination table 0 0 64 17
Do the switching of the partitions 2 0 54 0
Drop the Source table 1 16 100 0
Renaming the Destination table to Source 259 140 519 2

What you need to know

  • At the time of making the switch, both source and destination tables should be having the same schema (This does not apply to the identity column).

  • No data will be physically moved from one location to another but the pointers will change, which sorts a lot of problem, locks from taking place.

  • Each time you SWITCH data from one to another partition, the data from the source partition will be zero.

Disadvantages

  • It’s only available in Enterprise Edition and Developer Edition.

  • Sysdependencies will be effected as the object ID’s will be swapped.

  • All objects with schemabinding will have to be altered.

  • All Foreign Keys will have to be created to repoint to SOURCE_TB again.

Advantages

  • Requires the least down time in the Production Environment compared to the previous options

Conclusion

Taking the resource usage and cost into consideration, using table partitioning will be the safest and best option.

Total article views: 32642 | Views in the last 30 days: 38
 
Related Articles
FORUM

two source columns to single destination

two source columns to single destination

FORUM

Sequence number column in destination table not in source table

Sequence number column in destination table not in source table

FORUM

two source table columns to one destination table

two source table columns to one destination table

FORUM

identity column

identity column

FORUM

Dynamic Destination Field/Column Mapping

Mapping Destination Columns

Tags
__briefcase    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones