Large table cleanup with minimal locks

  • Thanks for the comments it has been a learning experience to see there is such a vibrant tech community. I am using the technique in a 7*24 shop with a very small maint window to remove obsolete rows - it is very good at minimizing contention and keeping the boss happy. If you adjusted the size of the inner table you'll find you can adjust the contention to fit your shop - it will take some testing to fine the optimal value but well worth it. Ed.

  • Nice feature to delete and archive in one step, thanks. In this article no need to archive as these are obsolete rows.

  • Chris - I like the approach of testing and sharing the findings, thanks. I'll modify the code in the article and publish the findings between where in, exists and inner join. In this case the deletes ran very fast anyway and on a practical point they weren't the performance issue - the obsolete rows were.

  • Thanks for the comments it has been interesting reading them. As to the dangers of nolock perhaps I should have added (if the editor would allow) that in this case they don't exists since there are no changes made to obsolete rows in these tables. Best to know your application and where data integrity can be jeopardized when using nolock.

  • Please see Jeff Moden's comments on page 5 concerning where in, exists and inner join, interesting reading. In practice most of the elapsed time is on building the outer temp table due to the obsolete rows, the deletes ran quick. Time allowing I will code an exists and inner join version and time the result - should be interesting.:-)

  • Thanks for the comments Ole. I use maxdop because this process has to run secondary to production jobs and I do that by limiting its processes. This job is intended to be a behind the scenes low impacting process. Without maxdop it will use all available processors and that will cause waits on subsequent production jobs. The jobs runs on a 16 processor server with maxdop=8.

  • Please read Jeff Moden's comments on pg 5 of this thread. Seems there is a difference of opinion on where in, exists and inner join. I will recode using join and exist - time all 3 and publish the results. At least for this case we will have an answer.

  • Edward.Polley 76944 (3/8/2012)


    Please read Jeff Moden's comments on pg 5 of this thread. Seems there is a difference of opinion on where in, exists and inner join. I will recode using join and exist - time all 3 and publish the results. At least for this case we will have an answer.

    Page 5 is relative depending on how many posts per page you display. I have mine set to 50.

  • Jeff Moden (3/6/2012)Like I said, the supposed performance problems about WHERE IN are mostly myths.

    Yep, you are absulutely right. I seem to have succumb to that myth.

    I even tried it on SQL 2000, thinking maybe there was a change in the db engine or the optimizer in the newer versions, but also in SQL 2000 WHERE..IN... and INNER JOIN get the absolute identical execution plan.

    And performance wise approx. 7 out of 10 times where...in... is faster.

  • I reran the process 10x as where in, inner join and exists. Overall the where in and inner join were statistically even with a slight edge to where in, elapsed was a bit slower. Metric used was elapsed time in seconds. Based upon this test I would recommend either where in or inner join.

    Where in Exists Inner Join

    Avg Elapsed 10 runs 267.2 283.1275.8

    Test ran on a benchmarking server no other activity.

Viewing 10 posts - 46 through 54 (of 54 total)

You must be logged in to reply to this topic. Login to reply