You may have data in a database that was inserted into a table by mistake, or you may have data in your tables that is no longer of value. In either case, when you have unwanted data in a table you need a way to remove it. The DELETE statement can be used to eliminate data in a table that is no longer needed. In this article I will be showing you different ways to use the DELETE statement to identify and remove unwanted data from your SQL Server tables.
Creating Sample Data
In order to show you how to delete records using the DELETE statement, I first need to create a couple of sample tables using the code in Listing 1.
SET NOCOUNT ON; USE tempdb; GO CREATE TABLE dbo.Product ( ID int identity not null, ProductName varchar(25) not null, Price decimal(6,2) not null ); INSERT INTO dbo.Product VALUES ('Widget',25.99), ('WingDing',18.87), ('DingDong',1.99), ('Doo-Dads',87.34), ('Doohickey',19.56), ('Thingamagigs',239.10), (' Whatchamacallit',3.47); CREATE TABLE dbo.DiscontinuedProduct ( ProductName varchar(25) not null); INSERT INTO dbo.DiscontinuedProduct VALUES ('Widget');
Listing 1: Script to create sample data
The code in Listing 1 creates and populates to two tables: Product and DiscontinuedProduct.
Deleting all the Rows in a Table
It is extremely easy to delete all the rows in a table. All you need to do is provide a DELETE statement that is not constrained. The code in Listing 2 deletes all the records in my Product table.
USE tempdb; GO DELETE FROM dbo.Product;
Listing 2: Deleting all rows in a table
You only need to be granted DELETE permissions on a table in order to use the DELETE statement to remove rows from a table.
Another way to delete all the rows in a table is to use the TRUNCATE tablE statement, which is much more efficient in those cases where you want to remove ALL the rows from a table.. Permission to run TRUNCATE TABLE cannot be granted and only the table owner and members of the sysadmin fixed server role, as well as the db_owner or db_ddladmin fixed database role can execute the TRUNCATE TABLE statement. A complete discussion of the TRUNCATE TABLE statement is outside the scope of this article. If you would like more information on this statement refer to the Books Online documentation.
Since the code in Listing 2 deleted all the rows in my Product table I need run the code in Listing 3 to repopulate this table again so it can be used for some of my other examples.
SET NOCOUNT ON; USE tempdb; GO INSERT INTO dbo.Product VALUES ('Widget',25.99), ('WingDing',18.87), ('DingDong',1.99), ('Doo-Dads',87.34), ('Doohickey',19.56), ('Thingamagigs',239.10), (' Whatchamacallit',3.47);
Listing 3: Deleting all rows in a table
Deleting Rows Based on a WHERE Condition
Most of your delete operations in your applications will probably not need to delete all the rows in a table. Instead you will probably be deleting only a subset of rows based on some WHERE condition. You may want to delete a single row, or multiple rows with one or more different conditions. The code in Listing 4 uses the WHERE clause to identify the specific rows I want to delete.
USE tempdb; GO DELETE FROM dbo.Product WHERE ProductName = 'Thingamagigs';
Listing 4: Deleting a Single Row
When I run this code only the rows that have a ProductName of “Thingamagigs” will be deleted. In my sample Product table only one row meets this criteria so only a single row was deleted.
You may find you need to delete several different rows, with each rowing have one of a list of specific values. You can do this by specifying a WHERE condition that matches multiple rows like I have done in listing 5.
USE tempdb; GO DELETE FROM dbo.Product WHERE ProductName in ('DingDong','Doo-Dads','GallyWhapper');
Listing 5: Use the IN clause to identify multiple rows to delete
In Listing 5, I used the IN clause to identify that I wanted to delete any Product row that had a ProductName of “DingDong”, “Doo-Dads”, or “GallyWhapper”. When I ran this code against my sample Product table 2 different rows were deleted, those rows that had a ProductName of “DingDong” and “Doo-Dads”. Since no rows matched the “GallyWhapper” value no rows where deleted for this value. If none of the rows in my table match any of my WHERE criteria then no rows would be deleted. You could add as many WHERE conditions as needed to delete the rows you need.
Deleting a Specific Number of Rows
You may want to only delete a specific number of rows in a table, or only a subset of the rows that meet a particular search criteria. To handle deleting the first N rows SQL Server allows you to use the TOP clause. The code in Listing 6 demonstrates using the TOP clause to delete only the first row based on my search criteria.
USE tempdb; GO SELECT * FROM dbo.Product WHERE ID > 8; DELETE TOP (1) FROM dbo.Product WHERE ID > 8; SELECT * FROM dbo.Product WHERE ID > 8;
Listing 6: Deleting a single row using the TOP clause
In this example I first run a SELECT statement that shows the number of rows that have with an ID greater than 8, which for my example is only 2 records. Then I run a DELETE statement that will delete the first row returned where the ID value is greater than 8 by specifying the TOP clause. I then run a second SELECT statement to verify that only 1 row was deleted. Keep in mind since the rows that meet my criteria are not returned in any specific sorted order, my DELETE statement only deletes the first row returned, whichever row it might be.
Identifying Rows to Delete Based on another Table
There are times when you want to delete rows in one table, based on values from another table. Listing 7 shows you how to use an INNER JOIN clause between two tables to identify those rows that need to be deleted.
USE tempdb; GO DELETE FROM dbo.Product FROM dbo.Product P INNER JOIN dbo.DiscontinuedProduct D ON P.ProductName = D.ProductName;
Listing 7: Deleting rows based on a JOIN operation
In Listing 7 every row in the Product table that meets the join criteria with the DiscontinuedProduct table will get deleted from the Product table.
Another way to delete rows in one table based on another table is to use a sub-query, as demonstrated in Listing 8.(If you ran the query in listing 7, this one will not delete any additional rows.)
DELETE FROM dbo.Product WHERE ProductName in (SELECT ProductName FROM dbo.DiscontinuedProduct);
Listing 8: Deleting rows based on a JOIN operation
In listing 8 I identified the rows that needed to be deleted by using an IN clause within a WHERE condition, in which the candidate values for my IN clause were identified by using a sub-query.
Deleting Duplicate Rows using Common Table Expression
The definition of duplicate row may vary, but for this discussion a duplicate row is a row that has the same column value for each and every column as another row in a table. How do you go about deleting duplicate rows? There are a number of options for doing that. Each method needs a way to uniquely identify each duplicate row so you can delete all but one of the duplicate rows. For my example on how to delete duplicate rows I will be showing you how to uniquely identify and delete duplicate rows using a common table expression (CTE).
The code in Listing 9 creates some sample data that I will be using to demonstrate how to delete duplicate rows.
SET NOCOUNT ON; USE tempdb; GO CREATE TABLE dbo.Dups (Id int, Name varchar(10)); INSERT INTO Dups VALUES (1,'Red'); INSERT INTO Dups VALUES (2,'White'); INSERT INTO Dups VALUES (2,'White'); INSERT INTO Dups VALUES (3,'Blue'); INSERT INTO Dups VALUES (3,'Blue'); INSERT INTO Dups VALUES (3,'Blue');
Listing 9: Creating sample data
CTE’s are an advanced topic and a complete discussion of what they are and how to use them are outside the scope of this article. For this article just think of a CTE as a pseudo table, where the rows in the pseudo table are created from the T-SQL code contained in the CTE. In Listing 10 you will find my example that uses a CTE to delete the duplicate rows from my sample data table Dups.
USE tempdb; GO WITH DupRecords (Id,Name, DuplicateCount) AS ( SELECT Id, Name, ROW_NUMBER() OVER(PARTITION BY Id, Name ORDER BY Id,Name) AS DuplicateCount FROM dbo.Dups ) SELECT * FROM DupRecords;
Listing 10: Displaying duplicate rows using a CTE
The code in Listing 10 first defines the CTE named DupColor This CTE returns a result set that contains all the rows in the my dbo.Dups table, and then also includes a column named DuplicateCount . The DuplicateCount column is created using the ROW_NUMBER() function, which sequentially numbers each row based on the Name column values. The results of ROW_NUMBER() function numbers each duplicate Name value row with a different number starting with 1 for the first value, 2 for the second row with the same Name value, 3 for third duplicate value and so on. The code in Listing 10 then takes the defined CTE and uses it in a SELECT statement. By running this code you will be able to see how this CTE creates a record set that contains a DuplicateCount column value. By looking at the output of this CTE you can see any time the row is a duplicate value the duplicate rows are numbered sequentially, and the sequential number is the DuplicateCount column.
The code in Listing 11 defines the same CTE as in Listing 10, but this time uses it in a DELETE statement.
USE tempdb; GO WITH DupRecords (Id,Name, DuplicateCount) AS ( SELECT Id, Name, ROW_NUMBER() OVER(PARTITION BY Id, Name ORDER BY Id,Name) AS DuplicateCount FROM dbo.Dups ) DELETE FROM DupRecords WHERE DuplicateCount > 1 GO SELECT * FROM dbo.Dups;
Listing 11: Deleting duplicate records using a CTE
The DELETE statement uses the CTE to remove any rows that have a DuplicateCount > 1, which essentially deletes the rows with duplicate color values from the dbo.Dups table. Note a CTEs can only be used in the statement immediately following the CTE declaration, this is why I defined the CTE a second time.
Using the Output Clause
Have you ever wondered what you just deleted with a DELETE statement? If you have then there is help for you by using the OUTPUT clause. When you use the OUTPUT clause on your DELETE statement information about the deleted row is stored in a pseudo-table named DELETED.
Let me go through a couple of examples to show you how this works. But first let me create and populate a new table that will be used in my follow-on DELETE statements.
SET NOCOUNT ON; USE tempdb; GO CREATE TABLE dbo.MyVehicles (ID int, Name varchar(10), Color varchar(10)); INSERT INTO dbo.MyVehicles values (1,'Volvo','Green'); INSERT INTO dbo.MyVehicles values (2,'Kia','Black'); INSERT INTO dbo.MyVehicles values (3,'Chevrolet','Gold'); INSERT INTO dbo.MyVehicles values (4,'Nissan','Red'); INSERT INTO dbo.MyVehicles values (5,'Honda','Gold');
Listing 12: Create sample table
Suppose I want to remove all my vechicles that are of ”Gold” color, and return the column values for every row deleted to my application, then all I would need to do is run the code in Listng 13
USE tempdb; GO DELETE FROM dbo.MyVehicles OUTPUT DELETED.* WHERE Color = 'Gold';
Listing 13: Outputting deleted column/row values for deleted rows.
The code in Listing 13 returns all the column values for each row deleted. It does this because in the OUTPUT clause I specified “DELETED.*”, which means to return all deleted column values. When I run the code in Listing 13 from a Query Window I get the output found in Report 1.
ID Name Color
----------- ---------- ----------
3 Chevrolet Gold
5 Honda Gold
Report 1: Output from when I run the code in Listing 13.
By reviewing the output you can tell 2 vehicles were deleted that had a color of “Gold”.
You don’t have to specify that every column of the deleted rows needs to be outputted. Additionally you don’t have to return the deleted row information to your application. Instead you can write the output for the deleted rows to a table. To show you how this works, I have selectively outputted columns to a table in my DELETE statement using the code in Listing 14.
USE tempdb; GO DECLARE @DeletedValues TABLE (Name varchar(10), Color varchar(10)); DELETE FROM dbo.MyVehicles OUTPUT DELETED.Name, DELETED.Color INTO @DeletedValues WHERE Color in ('Black','Red'); SELECT * FROM @DeletedValues;
Listing 14: Outputting only the Name and Color column to a table variable
The code in Listing 14 identified only two columns to be outputted, Name and Color. I also include the INTO clause which caused the deleted column values for each deleted row to be inserted into my table variable @DeletedValues. By doing this I was able to run a SELECT statement right after the DELETE statement that displayed the column values for the deleted rows that were inserted into my table variable.
Building Blocks for Deleting Rows
Deleting data from a table is needed when you mistakenly enter data into a table, or the data is no longer of any value. Being able to delete all the data in a table or individual rows is something that most applications will occasionally require. Care should be taken when deleting data to make sure you don’t delete too much or not enough data. Hopefully this article provided you the building blocks necessary to allow you to build your own DELETE statements to remove the rows you no longer need in your database tables.