Cursors Be Gone!

Question

Cursors Be Gone!

GabyYYZ

SSCertifiable

Points: 7913
More actions
December 23, 2008 at 10:29 pm

#199392

Comments posted to this topic are about the item Cursors Be Gone!
Gaby________________________________________________________________"In theory, theory and practice are the same. In practice, they are not." - Albert Einstein

Viewing 15 posts - 1 through 15 (of 272 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 1

Good idea, Gaby, and good article...

Just a couple of tips... if you simply add Forward_Only and Read_Only to the cursor declaration to make it a "fire hose" cursor, you've basically done the same thing with a lot less work.

Also, don't trust the % of batch in any execution plan... it lies... sometimes, it lies a lot! For example... in the following code, both the estimated and actual execution plans say that the first query in the following code will take 0% of the batch and the 2nd query will take 100% of the batch even though they do the same thing. AND, when you run the code just the opposite is true if you look at the output in the message tab... the 2nd query blows the doors off the first query.

SET NOCOUNT ON

--=======================================================================================

-- Recursive method shown by (Name with-held)

--=======================================================================================

PRINT '========== Recursive method =========='

--===== Turn on some performance counters ===============================================

SET STATISTICS IO ON

SET STATISTICS TIME ON

DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

--===== Execute the code being tested ===================================================

DECLARE @DateVal DATETIME

SET @DateVal = '2008-01-01'

;with mycte as

(

select @DateVal AS DateVal

union all

select DateVal + 1

from mycte

where DateVal + 1 < DATEADD(yy, 5, @DateVal)

)

select @Bitbucket = d.dateval

from mycte d

OPTION (MAXRECURSION 0)

--===== Turn off the performance counters and print a separator =========================

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

PRINT REPLICATE('=',90)

GO

--=======================================================================================

-- Tally table method by Jeff Moden

--=======================================================================================

PRINT '========== Tally table method =========='

--===== Turn on some performance counters ===============================================

SET STATISTICS IO ON

SET STATISTICS TIME ON

DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

--===== Execute the code being tested ===================================================

DECLARE @StartDate AS DATETIME

SET @StartDate = '2008-01-01'

SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,5,@StartDate)))

@Bitbucket = @StartDate-1+t.N

FROM Tally t

ORDER BY N

--===== Turn off the performance counters and print a separator =========================

SET STATISTICS TIME OFF

SET STATISTICS IO OFF

PRINT REPLICATE('=',90)

The reason why this occurs is because the first query actually contains RBAR in the form of a Recursive CTE and only the first "loop" is measured. Other anomolies also adversly affect the % of batch listings in execution plans.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003863 More actions · Answer 2

Almost forgot... don't forget that sp_MsForEachDB is nothing but a big ol' cursor. Look at the code if you don't believe me. Also, in SQL Server 2005, you don't even need a While loop for this type of stuff anymore... here's the same thing using a "pseudo-cursor" to build all of the SQL for all of the databases all at once...

DECLARE @SQL NVARCHAR(MAX)

SELECT @SQL = COALESCE(@SQL+CHAR(10),'') + 'DBCC CheckDB(' + QUOTENAME(Name) + ')'

FROM Master.sys.Databases

WHERE Database_ID > 4

PRINT @SQL

EXEC (@SQL)

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Teddy Carebears SSC Journeyman Points: 97 More actions · Answer 3

As I wrote on an email sent to Gaby you also have to be careful cause sometimes (absolutely random) the variable of type table is not being dumped from the memory by SQL Server. If you use a query in a web page to compute some amounts base on real time results and you hit refresh several times you might see that after a while the page is not updating with the good results. To get rid of this kind of problem I have used a temporary table instead of a variable of type table.

SQL Joker Ten Centuries Points: 1256 More actions · Answer 4

SQL Joker

Ten Centuries

Points: 1256

December 24, 2008 at 1:15 am

#917261

Good idea and nice article. Thank you

Jonathan AC Roberts SSCoach Points: 18263 More actions · Answer 5

A bit of an odd example to pick. All that effort to avoid using a cursor on an example that executes dbcc checkdb for each of just a few rows. Surely the time it takes time to parse a cursor in this example is totally insignificant when compared to the time taken to do a 'dbcc checkdb' for each database?

Also, when you say "Imagine scaling the cursor to tables with millions of rows". I'm imagining it! Are you sure that inserting the rows into a temporary table with no index and deleting each row, in a loop, on an individual basis, and selecting the count(*) from the table for each row would actually be quicker than using a cursor to just scan through a recordset? :crazy:

Matt Whitfield SSCrazy Eights Points: 8107 More actions · Answer 6

As you can see in the image below, while the actual allocation/deallocation of the cursor doesn't have much of an impact, the inefficient way the query is executed is what causes this to be slow. Imagine scaling the cursor to tables with millions of rows.

The thought of scaling either RBAR to millions of rows fills my heart with sorrow.

But, making me even more sad is the fact that yes, we've replaced a cursor, but actually with a method that's even more inefficient. Say you did have 1,000,000 rows, for the cursor you would have the read of the data and cursor through it. For the other way, you would have the creation of the table (be it temporary or variable), the insert into it, and 2,000,000 selects and 1,000,000 deletes.

Try the following three queries - one of which is a cursor, one of which uses this method with a temp table, and one of which uses this method with a table variable. The query works on 100,000 rows and executes reasonably meaningless code so that we're testing the performance of the iteration method, rather than the performance of what is being executed.

Note also that in the temp table / table variable methods I have added a clustering PK to increase the performance of these methods - i got bored of waiting for the queries to finish without!

On my server, the cursor takes 12.562 seconds to execute, the temp table method takes 33.093 seconds to execute, and the table variable method takes 32.218 seconds to exeucte.

Cursor method:

SET NOCOUNT ON

declare @query varchar(100), @id int

declare BadCursor Cursor for

select top 100000 ID from [tblLogTransactions]

open BadCursor

fetch next from BadCursor into @id

while @@fetch_status = 0

begin

select @query = 'declare @i int; set @i = ' + convert(varchar,@id)

exec(@query)

fetch next from BadCursor into @id

end

close BadCursor

deallocate BadCursor

go

Temp Table method:

SET NOCOUNT ON

declare @query varchar(100), @id int

CREATE TABLE #muchWorseProblem (ID int primary key clustered)

INSERT INTO #muchWorseProblem (ID)

select top 100000 ID from [tblLogTransactions]

while (select count(*) from #muchWorseProblem) > 0

begin

select top 1 @id = ID from #muchWorseProblem

select @query = 'declare @i int; set @i = ' + convert(varchar,@id)

exec(@query)

delete from #muchWorseProblem where id = @id

end

Table Variable method:

SET NOCOUNT ON

declare @query varchar(100), @id int

DECLARE @muchWorseProblem TABLE (ID int primary key clustered)

INSERT INTO @muchWorseProblem (ID)

select top 100000 ID from [tblLogTransactions]

while (select count(*) from @muchWorseProblem) > 0

begin

select top 1 @id = ID from @muchWorseProblem

select @query = 'declare @i int; set @i = ' + convert(varchar,@id)

exec(@query)

delete from @muchWorseProblem where id = @id

end

:blink:

Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone

darren.flynn SSC Journeyman Points: 90 More actions · Answer 7

We've largely got rid of cursors, and have been using table variables for ages.

Only thing in the article that I'd change would be to ditch the Select Count and the delete and use a variable with @@rowcount (by doing a select x from table variable just before setting the variable)and a counter variable to control the loop, and to then select the record based on an IDENTITY column in your table variable (i.e. select x from table variable where ID = counter variable). Do what you need to do and then increment the counter so in effect you get a move next.

In practice this means you have a copy of the data and aren't getting rid of any, just in case you need to do something else with it later.

In our stored procedures working with hundreds, thousands and sometimes millions of records we've found this to be extremely efficient and fast. In one instance a stored procedure dropped from taking several minutes to run down to less than 10 seconds.

Jonathan AC Roberts SSCoach Points: 18263 More actions · Answer 8

Jeff Moden (12/24/2008)
Good idea, Gaby, and good article...

Could you list one thing that's a good idea in this article, spelling mistakes aside?

Eralper SSCarpal Tunnel Points: 4438 More actions · Answer 9

Hello all,

I agree with Matt since a select count(*) is not quickly handled operation if you do not have indexes on a huge table. So if the source table we aim to use cursors on is really big, I believe Cursor method will handle the situation better.

I aggree with Darren also because he eliminates the select count(*).

This method can be used I believe by making a few modifications as I have copied down the sql codes.

But I could not escape from deleting rows using column values

declare @i int;

set @i = 0;

while 1=1

begin

select top 1 @id2 = ID from @muchWorseProblem

if @id2 is null

break;

select @query = 'declare @i int; set @i = ' + convert(varchar,@id2)

exec(@query)

delete from @muchWorseProblem where id = @id2

set @id2 = null

end

Eralper
SQL Server and T-SQL Tutorials and Articles
Microsoft Certification and Certification Exams

Matt Whitfield SSCrazy Eights Points: 8107 More actions · Answer 10

Having modified that script a bit, and run it on my server - the time was 23.078 secs. Improved, certainly, but not better.

SET NOCOUNT ON

declare @query varchar(100), @id int

DECLARE @muchWorseProblem TABLE (ID int primary key clustered)

INSERT INTO @muchWorseProblem (ID)

select top 100000 ID from [tblLogTransactions]

declare @id2 int;

declare @i int;

set @i = 0;

while 1=1

begin

select top 1 @id2 = ID from @muchWorseProblem

if @id2 is null

break;

select @query = 'declare @i int; set @i = ' + convert(varchar,@id2)

exec(@query)

delete from @muchWorseProblem where id = @id2

set @id2 = null

end

The quickest I could get it to go was the following, which does come in slightly quicker than a cursor, at 10.703 secs:

SET NOCOUNT ON

declare @query varchar(100), @id int, @rowNum int

DECLARE @muchWorseProblem TABLE (RowNum int IDENTITY(1, 1) primary key clustered, ID int)

INSERT INTO @muchWorseProblem (ID)

select top 100000 ID from [tblLogTransactions]

declare @maxRowNum int

SELECT @maxRowNum = MAX(RowNum) FROM @muchWorseProblem

declare @i int;

set @i = 1;

while @i <= @maxRowNum

begin

select @id = ID from @muchWorseProblem WHERE rowNum = @i

select @query = 'declare @i int; set @i = ' + convert(varchar,@id)

exec(@query)

set @i = @i + 1

end

Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone

hfxDBA SSC Veteran Points: 295 More actions · Answer 11

I've always resisted responding to articles with obvious errors simply not wanting to get into arguments about which way to skin the cat. I hope people who read the original articles here also read all the comments and replies. I hear and see too many junior (and some not so junior) read articles then start using the techniques as THE way to do something.

I can spot several things I would change with a lot of the reply scripts. I won't list those right now because I think the most important thing for readers who visit here to know is that they MUST try different techniques and record the results so that they know which technique solves their problem best for them. They must decide on what trade-offs they can risk and which they can't.

Whenever I write an SP, USP, script, or even a TRIGGER, I comment exactly what I did and also comment exactly what alternate methods I used and why I chose the solution I did (for instance, why I chose a table variable over a temp table (or vice versa) or why I chose a cursor over a while loop). If you have several developers or DBAs who could be maintaining the code, this will help them understand your methods and may save them some time down the road.

Matt Whitfield SSCrazy Eights Points: 8107 More actions · Answer 12

If you can spot several things you would change, then please let us know. It's not about having an argument, it's about learning, and if you have a good method, then please do share it! 😀

Atlantis Interactive - SQL Server Tools
My blog[/url]
Why I wrote a sql query analyzer clone

Phil Factor SSC-Insane Points: 20244 More actions · Answer 13

I'm a bit puzzled by this article. Surely, in order to illustrate the dangers of using cursors or 'WHILE' loops, you would expect to see an example where there is an obvious performance or resource advantage in avoiding cursors. This routine is so trivial that it could be, and has been, written any number of ways without affecting its execution time at all. The trick, surely, is to take a problem that seems impossible to do without cursors and show how easy it actually is. Give the performance figures for a whole range of data sizes etc.

I'm perfectly happy to use WHILE loops for a job like the one described in this article, especially if it makes the routine more understandable. Am I wrong to do so?

Best wishes,
Phil Factor

Lawrence-865109 Grasshopper Points: 24 More actions · Answer 14

I am very happy that this article was written. However, the speed increases were compared to each other by using percentage of time relative to the batch, which is misleading. For example, 69% of 1 second is a faster process than 31% of 4 seconds.