• The problem is not really in the cursor itself; it is that the operations are not set-oriented.  Doing 1 million individual inserts is not at all the same as doing a single insert based on a SELECT that returns 1 million rows and inserts them in one operation.
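
    As a minimal sketch of that difference (dbo.SourceRows and dbo.TargetRows are hypothetical names, not from any real batch):

        -- Row by row: one INSERT statement executed once per source row
        DECLARE @Id int, @Payload varchar(200);
        DECLARE row_cur CURSOR FAST_FORWARD FOR
            SELECT Id, Payload FROM dbo.SourceRows;
        OPEN row_cur;
        FETCH NEXT FROM row_cur INTO @Id, @Payload;
        WHILE @@FETCH_STATUS = 0
        BEGIN
            INSERT INTO dbo.TargetRows (Id, Payload) VALUES (@Id, @Payload);
            FETCH NEXT FROM row_cur INTO @Id, @Payload;
        END;
        CLOSE row_cur;
        DEALLOCATE row_cur;

        -- Set-oriented: the same work expressed as a single insert
        INSERT INTO dbo.TargetRows (Id, Payload)
        SELECT Id, Payload FROM dbo.SourceRows;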

     

    I recently converted a Transact-SQL batch that was doing about 8 million inserts; its execution time was about 8 hours. By building one big query that used derived tables and CASE expressions to apply the conversion/extraction logic, and UNION ALL to bring together rows produced under disjoint (mutually exclusive) conditions, the run time dropped from 8 hours to 30 minutes (a 16 to 1 improvement). Writing it as a single query also made it possible to reduce logging by doing a SELECT INTO.
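
    A sketch of the general shape of such a query (the table and column names here are invented for illustration, not taken from the original batch):

        SELECT s.AccountId,
               CASE WHEN s.Amount < 0 THEN 'CREDIT' ELSE 'DEBIT' END AS EntryType,  -- conversion logic done in the query
               ABS(s.Amount) AS Amount
        INTO   dbo.ConvertedEntries                    -- SELECT INTO helps keep logging down
        FROM  (SELECT AccountId, Amount FROM dbo.DomesticSales WHERE Status = 'OK'
               UNION ALL                               -- disjoint sources combined in one pass
               SELECT AccountId, Amount FROM dbo.ExportSales   WHERE Status = 'OK') AS s;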

     

    Even without doing a SELECT INTO, SQL Server optimizes log write operations.  There seems to be far less logging required for one massive insert than for the same work done row by row.
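
    One rough way to see that difference, on a throwaway test database only, is to count log records with the undocumented fn_dblog function (purely illustrative, reusing the hypothetical tables from above):

        CHECKPOINT;   -- in SIMPLE recovery this clears most of what fn_dblog will report
        INSERT INTO dbo.TargetRows (Id, Payload)
        SELECT Id, Payload FROM dbo.SourceRows;        -- the set-based insert
        SELECT COUNT(*) AS LogRecords FROM fn_dblog(NULL, NULL);
        -- Repeat the measurement around the cursor loop and compare the two counts.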

     

    Cursors are an easy way to describe a job, but the net result is that it ends up as row-by-row reads and writes. For example, a single-row read will almost always use a suitable index, if one exists, which may seem good. But it is very bad if you end up reading every row of the table that way. When a lot of rows are matched through a join, SQL Server may decide that using the index is not worthwhile and do the match with a joining algorithm better suited to that kind of job. There is another optimization gain to be had there.
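
    A sketch of that point (dbo.Orders and dbo.Prices are again hypothetical names):

        DECLARE @ProductId int, @UnitPrice money;
        SET @ProductId = 42;

        -- Per-row pattern (what a cursor loop repeats for every order): one index seek per call
        SELECT @UnitPrice = p.UnitPrice
        FROM   dbo.Prices AS p
        WHERE  p.ProductId = @ProductId;

        -- Set-oriented pattern: one join, and the optimizer may pick a hash or merge join
        UPDATE o
        SET    o.UnitPrice = p.UnitPrice
        FROM   dbo.Orders AS o
        JOIN   dbo.Prices AS p ON p.ProductId = o.ProductId;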

     

    The only problem with this approach is that you have to work harder to define the big “do it all” query.