Adding a Primary Key to a Prepopulated Table using a SQL Server Integration Services Package

By:   |   Comments   |   Related: More > Integration Services Development


Problem

A SQL Server table was created long ago without a primary key probably as a result of an import. It began to get populated over time. As a SQL Server Developer, I have been tasked to automate the process of adding a primary key to this table (without removing duplicate rows) in such a way that the Primary Key column becomes the first column of the table and there should be no data loss during this operation.

Solution

This solution is based on the idea of creating a new table based on existing table with an additional auto increment id column set as the primary key and then copying the data from the existing table to the new table which has a primary key.

The script can be created and run manually, but we can automate the process by using Integration Services.

This tip is focused on creating a backup table with primary key followed by creating and running an Integration Services package to achieve the goal.

Adding SQL Server Primary Key to a Table using an SSIS Package

We are ready to add a primary key to a table which does not have a primary key.

Prerequisites

This tip assumes that SQL Server Data Tools (SSDT) with the latest version of Visual Studio has already been installed and the readers are familiar with SQL Server Data Tools (SSDT).

Steps Summary

The steps can be summarized as follows:

  1. Creating a sample database
  2. Creating a table without primary key in the sample database
  3. Populating the table
  4. Renaming the table as old table
  5. Creating a new table based on old table
  6. Adding primary key with identity column to the new table
  7. Creating an SSIS Package to copy data from old table to new table with primary key
  8. Automating the process through SSIS Package by running the SSIS Package

This is illustrated as follows:

development flow

Sample Database Setup

Open SQL Server Object Explorer (SSOX) and Add New Database called “University” or alternatively use the following script to create the sample database:

-- Creating Sample Database University
CREATE DATABASE University

Let us use University database created earlier on to create the Course table.

Right click University and click New Query…:

add new query to create table

Writing the following code and execute to create the Course table:

-- Creating course table
CREATE TABLE [dbo].[Course](
    [Name]     VARCHAR(30)  NOT NULL,
[Detail] VARCHAR(200) NULL,
);
GO
create table in ssdt

Populate the table by using the following script:

-- Populating course table without primary key
INSERT INTO [dbo].[Course]([Name], [Detail]) VALUES(N'DevOps for Databases', N'This is about DevOps for Databases')
INSERT INTO [dbo].[Course]([Name], [Detail]) VALUES(N'Power BI Fundamentals', N'This is about Power BI Fundamentals')
INSERT INTO [dbo].[Course]([Name], [Detail]) VALUES(N'T-SQL Programming', N'About T-SQL Programming')
INSERT INTO [dbo].[Course]([Name], [Detail]) VALUES(N'Tabular Data Modeling', N' This is about Tabular Data Modeling')
INSERT INTO [dbo].[Course]([Name], [Detail]) VALUES(N'Analysis Services Fundamentals', N'This is about Analysis Services')

View the data:

populate table with t-sql

Renaming Course Table

Right click on the Course table and click New Query and write and execute the following script to rename the existing table Course as Course_Old:

-- Renaming Course table as Course_Old
EXEC sp_rename "Course","Course_Old"
rename table with sp rename

Creating New Course Table with a Primary Key

Create a new table using the Query window and execute the following script against the University database:

-- Creating new Course table with primary key and auto increment identity column
CREATE TABLE [dbo].[Course](
   [CourseId] INT IDENTITY(1,1),
   [Name]     VARCHAR(30)  NOT NULL,
   [Detail] VARCHAR(200) NULL,
   CONSTRAINT [PK_Course] PRIMARY KEY([CourseID])
);
GO
table with primary key

Create a New SSIS Project

Create a new Integration Services Project and call it PrimaryKeyProject:

new ssis package

Adding Data Flow Task

Add a Data Flow Task and drag and drop the OLE DB Source and OLE DB Destination:

ssis package flow

Configure Source and Destination

Configure database connection to point to the University database:

connection manager settings

Select Course_Old table as the Source:

ole db source editor

Join the OLE DB Source with the OLE DB Destination to create a pipeline:

data task flow

Configure the OLE DB Destination component to point to a newly created Course table which contains the primary key:

ole db destination editor

Check the column mappings by clicking Mappings on the left side and please note that CourseId in the mappings will be ignored since it is only present in the destination table:

mapping of columns

Running the SSIS Package

Run the SSIS Package to start the data transfer from old table to new table:

run ssis package

Viewing Data

Go to SQL Server Object Explorer (SSOX) and right click Course and click View Data:

view data

Congratulations! You have successfully added a primary key column (in the first column) to the table without losing any data.

Once you are certain all rows have been successfully copied to the new table you can delete the old table.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Haroon Ashraf Haroon Ashraf's interests are Database-Centric Architectures and his expertise includes development, testing, implementation and migration along with Database Life Cycle Management (DLM).

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips



Comments For This Article

















get free sql tips
agree to terms