Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Cursor-Killing: Retrieving Recently Modified Data

By Edward Pollack,

Problem

A common use case for cursors or WHILE loops is when we have recently inserted data and want to get back the identity value immediately.  This is typically facilitated by executing our INSERT, and then using SCOPE_IDENTITY() in order to retrieve our identity and pass it into whatever TSQL or code will require it.

For a few rows, this is an acceptable solution, but for a potentially large data set, this can be a major performance hog.  Inserting one row at a time, grabbing the identity, and then repeating this process over and over again means that we need to access our data over and over again.

Consider a simple cursor-based example such as this one:

USE AdventureWorks
GO

DECLARE @product_id INT
DECLARE @new_id INT

DECLARE product_cursor CURSOR FOR
SELECT
       ProductID
FROM Production.Product
WHERE MakeFlag = 1
AND ListPrice > 2000
AND Name NOT LIKE '%New%'

OPEN product_cursor

FETCH NEXT FROM product_cursor
INTO @product_id

WHILE @@FETCH_STATUS = 0
BEGIN
       INSERT INTO Production.Product
         (    Name ,
              ProductNumber ,
              MakeFlag ,
              FinishedGoodsFlag ,
              Color ,
              SafetyStockLevel ,
              ReorderPoint ,
              StandardCost ,
              ListPrice ,
              Size ,
              SizeUnitMeasureCode ,
              WeightUnitMeasureCode ,
              Weight ,
              DaysToManufacture ,
              ProductLine ,
              Class ,
              Style ,
              ProductSubcategoryID ,
              ProductModelID ,
              SellStartDate ,
              SellEndDate ,
              DiscontinuedDate ,
              rowguid ,
              ModifiedDate)
       SELECT
              Name + 'New',
              ProductNumber + '-N',
              MakeFlag ,
              FinishedGoodsFlag ,
              Color ,
              SafetyStockLevel ,
              ReorderPoint ,
              StandardCost ,
              ListPrice ,
              Size ,
              SizeUnitMeasureCode ,
              WeightUnitMeasureCode ,
              Weight ,
              DaysToManufacture ,
              ProductLine ,
              Class ,
              Style ,
              ProductSubcategoryID ,
              ProductModelID ,
              SellStartDate ,
              SellEndDate ,
              DiscontinuedDate ,
              (SELECT NEWID()) ,
              ModifiedDate
       FROM Production.Product
       WHERE ProductID = @product_id

       SELECT @new_id = SCOPE_IDENTITY()

       FETCH NEXT FROM product_cursor
       INTO @product_id
END
CLOSE product_cursor
DEALLOCATE product_cursor

Above, we create a cursor for the rows we wish to insert, loop through each row in order to collect our new ProductID, and then do it all over again.  A look at the execution plan shows operation after operation: a long string of inserts, each with an identical execution plan.  Clearly, this isn’t scalable.

Solution

Starting in SQL Server 2005, we were given the ability to use the OUTPUT clause in order to retrieve modified data on-the-fly, with no need for iteration.  Using this, we can place an OUTPUT INSERTED or OUTPUT DELETED in the DML statement in question and get ALL of the changed data that we want back, whether it is an identity or not.

To solve the dilemma above, we can use OUTPUT INSERTED to pull back the new ProductID as well as the Name:

DECLARE @temp_products TABLE
(productID INT, Name NVARCHAR(50))

INSERT INTO Production.Product
  (    Name ,
       ProductNumber ,
       MakeFlag ,
       FinishedGoodsFlag ,
       Color ,
       SafetyStockLevel ,
       ReorderPoint ,
       StandardCost ,
       ListPrice ,
       Size ,
       SizeUnitMeasureCode ,
       WeightUnitMeasureCode ,
       Weight ,
       DaysToManufacture ,
       ProductLine ,
       Class ,
       Style ,
       ProductSubcategoryID ,
       ProductModelID ,
       SellStartDate ,
       SellEndDate ,
       DiscontinuedDate ,
       rowguid ,
       ModifiedDate)
OUTPUT INSERTED.ProductID, INSERTED.Name
INTO @temp_products
SELECT
       Name + 'New',
       ProductNumber + '-N',
       MakeFlag ,
       FinishedGoodsFlag ,
       Color ,
       SafetyStockLevel ,
       ReorderPoint ,
       StandardCost ,
       ListPrice ,
       Size ,
       SizeUnitMeasureCode ,
       WeightUnitMeasureCode ,
       Weight ,
       DaysToManufacture ,
       ProductLine ,
       Class ,
       Style ,
       ProductSubcategoryID ,
       ProductModelID ,
       SellStartDate ,
       SellEndDate ,
       DiscontinuedDate ,
       (SELECT NEWID()) ,
       ModifiedDate
FROM Production.Product
WHERE MakeFlag = 1
AND ListPrice > 2000

SELECT
       *
FROM @temp_products

Note that the data we collect from our OUTPUT can be stored in a temp table or table variable, as shown above, or returned to the application by leaving out the INTO clause altogether.  In addition, we can use OUTPUT on a scalar value insert as well, if you wanted to INSERT INTO table(…)VALUES(…).

In terms of performance, we have replaced a huge batch of individual statements with a single SQL statement, greatly reducing our overhead, not to mention the sheer complexity of our execution plan.

Solution #2

There is an additional use case not covered above that can pose an additional challenge to us.  What if we want to return data beyond what is contained in INSERTED or DELETED?  What if we also wanted to return additional columns from other tables that we joined as part of a query?  The syntax for OUTPUT does not allow us to include data from other tables (though we can include scalar variables in here if we wish).  How do we mix & match a variety of data all in one SQL statement?

Here is what we might try to do if we started down this route with no knowledge of the limitations of OUTPUT:

DECLARE @temp_products TABLE
(productID INT, Name NVARCHAR(50), ProductModelName NVARCHAR(50))

INSERT INTO Production.Product
  (    Name ,
       ProductNumber ,
       MakeFlag ,
       FinishedGoodsFlag ,
       Color ,
       SafetyStockLevel ,
       ReorderPoint ,
       StandardCost ,
       ListPrice ,
       Size ,
       SizeUnitMeasureCode ,
       WeightUnitMeasureCode ,
       Weight ,
       DaysToManufacture ,
       ProductLine ,
       Class ,
       Style ,
       ProductSubcategoryID ,
       ProductModelID ,
       SellStartDate ,
       SellEndDate ,
       DiscontinuedDate ,
       rowguid ,
       ModifiedDate)
OUTPUT INSERTED.ProductID, INSERTED.Name, ProductModel.Name
INTO @temp_products
SELECT
       Product.Name + 'New',
       ProductNumber + '-N',
       MakeFlag ,
       FinishedGoodsFlag ,
       Color ,
       SafetyStockLevel ,
       ReorderPoint ,
       StandardCost ,
       ListPrice ,
       Size ,
       SizeUnitMeasureCode ,
       WeightUnitMeasureCode ,
       Weight ,
       DaysToManufacture ,
       ProductLine ,
       Class ,
       Style ,
       ProductSubcategoryID ,
       Product.ProductModelID ,
       SellStartDate ,
       SellEndDate ,
       DiscontinuedDate ,
       (SELECT NEWID()) ,
       Product.ModifiedDate
FROM Production.Product
LEFT JOIN Production.ProductModel
ON ProductModel.ProductModelID = Product.ProductModelID
WHERE MakeFlag = 1
AND ListPrice > 2000

SELECT
       *
FROM @temp_products

In my attempt to return the product model name, I throw an unlikely error:

The multi-part identifier "ProductModel.Name" could not be bound.

It turns out that the OUTPUT clause doesn’t allow us to pull data from additional tables as we may have thought it could.  OUTPUT only has access to INSERTED and DELETED, but not to any other tables that are joined to below.

The answer is the MERGE statement, which was added in SQL Server 2008.  MERGE allows us to update, insert, and delete data all at once, and is quite efficient at doing so.  Let’s take our failed attempt to get this data and rewrite it using MERGE:

DECLARE @temp_products TABLE
(productID INT, Name NVARCHAR(50), ProductModelName NVARCHAR(50))

MERGE INTO Production.Product AS target
USING (SELECT Product.*, ProductModel.Name AS ModelName FROM Production.Product
LEFT JOIN Production.ProductModel
ON ProductModel.ProductModelID = Product.ProductModelID
WHERE MakeFlag = 1
AND ListPrice > 2000) AS source
ON (1 = 2)
WHEN NOT MATCHED THEN
  INSERT (Name ,
          ProductNumber ,
          MakeFlag ,
          FinishedGoodsFlag ,
          Color ,
          SafetyStockLevel ,
          ReorderPoint ,
          StandardCost ,
          ListPrice ,
          Size ,
          SizeUnitMeasureCode ,
          WeightUnitMeasureCode ,
          Weight ,
          DaysToManufacture ,
          ProductLine ,
          Class ,
          Style ,
          ProductSubcategoryID ,
          ProductModelID ,
          SellStartDate ,
          SellEndDate ,
          DiscontinuedDate ,
          rowguid ,
          ModifiedDate)
       VALUES (
                     source.Name + 'New',
                     source.ProductNumber + '-N',
                     source.MakeFlag ,
                     source.FinishedGoodsFlag ,
                     source.Color ,
                     source.SafetyStockLevel ,
                     source.ReorderPoint ,
                     source.StandardCost ,
                     source.ListPrice ,
                     source.Size ,
                     source.SizeUnitMeasureCode ,
                     source.WeightUnitMeasureCode ,
                     source.Weight ,
                     source.DaysToManufacture ,
                     source.ProductLine ,
                     source.Class ,
                     source.Style ,
                     source.ProductSubcategoryID ,
                     source.ProductModelID ,
                     source.SellStartDate ,
                     source.SellEndDate ,
                     source.DiscontinuedDate ,
                     (SELECT NEWID()) ,
                     source.ModifiedDate)
       OUTPUT INSERTED.ProductID,
              INSERTED.Name,
              source.ModelName
       INTO @temp_products;

In this example, we can use OUTPUT to get back any columns from the source defined above.  This includes the table we are inserting to or any tables we are joining into.  The “ON (1 = 2)” is used as a placeholder as we are not actually comparing our source and target.  If we wanted to perform a conditional insert, we could change that to any comparison and act on it when MATCHED or NOT MATCHED.  Regardless of the complexity of the MERGE statement, the functionality to output any columns from DELETED, INSERTED, or SOURCE is preserved, not to mention very useful.

Conclusion

The ability to retrieve, store, and return data as soon as it is modified is critical to many software applications.  For a very small data set, an iterative solution may be adequate, but when we are working with larger row counts, performance will become a significant problem.  Using OUTPUT as part of your UPDATE, INSERT, or DELETE statements can allow you make changes and return those changes in a single set-based step, greatly improving efficiency.  In addition, using MERGE can allow you to craft even more flexible TSQL by defining a source that can also be OUTPUT from.  Using either of these features can eliminate the need for WHILE loops or cursors while greatly improving the performance and scalability of your code!

Total article views: 6289 | Views in the last 30 days: 7
 
Related Articles
FORUM

Cursor

cursor

FORUM

INSERT OUTPUT INTO (With an extra column) and SELECT statement

I need to select a column for the OUTPUT INTO from the SELECT table for the INSERT...

FORUM

Inserting rows Using Cursors in Stored Procedure

Inserting rows Using Cursors in Stored Procedure

FORUM

how do i supress cursor select set results

how do i supress cursor select set results

FORUM

select multirecs from single recs without cursor

select multirecs from single recs without cursor

Tags
cursor    
deleted    
identity    
inserted    
merge    
performance    
scope_identity    
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones