Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

delete top(8000) takes 1 second, delete top(9000) takes 42 seconds !! Expand / Collapse
Author
Message
Posted Tuesday, December 11, 2012 4:55 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, June 06, 2013 1:14 PM
Points: 72, Visits: 347

Hi,

Delete top(8000) takes 1 second, but delete top(9000) takes 42 seconds, any ideas why?

In management studio, if I do the following:

--delete any rows older than 48hrs in batches of 8000 rows
DECLARE @MaxBuffer DateTime
SELECT @MaxBuffer = MAX(DateTime) FROM dbo.mytable

DELETE
top(8000)
FROM dbo.mytable
FROM dbo.mytable
WHERE DATEDIFF(hour, [DateTime], @MaxBuffer) > 48

it takes 1 second.


If I do the following, making the batch 9000:

--delete any rows older than 48hrs in batches of 9000 rows
DECLARE @MaxBuffer DateTime
SELECT @MaxBuffer = MAX(DateTime) FROM dbo.mytable

DELETE
top(9000)
FROM dbo.mytable
FROM dbo.mytable
WHERE DATEDIFF(hour, [DateTime], @MaxBuffer) > 48

it takes 42 seconds.

In this case I notice the execution plans (estimated and actual) contain a parallelism node - it is only there if top(9000) is used, not when top(8000) is used.


If I then do the following, in the 9000 case adding "option (maxdop 1)"

--delete any rows older than 48hrs in batches of 9000 rows
DECLARE @MaxBuffer DateTime
SELECT @MaxBuffer = MAX(DateTime) FROM dbo.mytable

DELETE
top(9000)
FROM dbo.mytable
FROM dbo.mytable
WHERE DATEDIFF(hour, [DateTime], @MaxBuffer) > 48
option (maxdop 1)

it takes 1 second.

Any idea what causes the parallelism node to be added? It's affect seems to be to read all 13 million rows in the table in the non-clustered index scan using all 8 processors, rather than just read 9000 rows with 1 processor in the maxdop 1 case...


Any hints or tips appreciated.

Regards,
Dave
Post #1395051
Posted Tuesday, December 11, 2012 4:58 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Wednesday, June 05, 2013 2:40 AM
Points: 5,075, Visits: 4,833
The cost threshold must have breached the value which you have it set at. Typically this is set to 5 by default, so top 8000 must have a cost threshold < 5 and top 9000 > 5 which is why it creates a parallel plan.

If you cant change the maxdop setting using sp_configure, then the only thing is as you have done and use the OPTION (MAXDOP 1) clause if your query.




Want an answer fast? Try here
How to post data/code for the best help - Jeff Moden
Need a string splitter, try this - Jeff Moden
How to post performance problems - Gail Shaw
CrossTabs-Part1 & Part2 - Jeff Moden
SQL Server Backup, Integrity Check, and Index and Statistics Maintenance - Ola Hallengren
Managing Transaction Logs - Gail Shaw
Troubleshooting SQL Server: A Guide for the Accidental DBA - Jonathan Kehayias and Ted Krueger

Post #1395055
Posted Wednesday, December 12, 2012 6:30 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 3:22 PM
Points: 3,678, Visits: 5,177
I wonder if the plan is EXACTLY the same between the parallel and non-parallel executions. I would imagine that one is doing an index seek and bookmark lookup for the delete and the other is doing a scan to get the rows to delete. Simple costing algorythm that in this case gets you a plan that "costs" less but runs slower.

I will add that you should NEVER wrap a column around a function in a WHERE clause if you can help it. And in this case you can easily rewrite your WHERE clause to avoid that function. That can allow the optimizer to get MUCH better statistics out of the WHERE clause and makes it SARGable, which can lead to DRAMATICALLY better plans in many cases.


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1395633
Posted Wednesday, December 12, 2012 2:15 PM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, June 13, 2013 5:09 PM
Points: 254, Visits: 1,032
This is not a precise answer to your question but it might be worth considering.

I do a lot of reading and research on SQL Server, et al. I remember reading that there is a threshold for row locking that will escalate to a table lock. I remember reading that it was at about 5000 row locks. I don't know if that is correct, only that I remember reading it.

When I perform a massive DELETE operation, I always perform it in a loop that deletes the TOP 4000 rows at a time. This technique has proved to be low impact in a high volume, production, OLTP environment. I usually accompany each iteration of the loop with a small delay (WAITFOR).

Jeff Moden once posted a different technique. He wrote his SQL code to obtain a table lock, then deleted a large number of rows at a time. I asked him about it and he said he thought it was more efficient than obtaining several thousand row locks.

You might want to experiment with both techniques.
Post #1395890
Posted Wednesday, December 12, 2012 4:11 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, June 06, 2013 1:14 PM
Points: 72, Visits: 347

Many thanks guys for all the replies and suggestions.

I have forced the serial plan for this one query using maxdop1.

I did also out of interest temporarily try upping the cost threshold from 5 to 6, and that also gave a non-parallel plan.

Now that I have solved the focing it to serial, for this particular case I guess I was also interested as to why a parallel operation should take longer than a serial one.

To Gail, thanks for the loop of 4000. Yes I use that one all the time. This question came out of my very experiments of upping the top(4000) to top(100,000) and seeing the various results of that...

To Kevin Boles, thanks for the WHERE clause re-write tip. Can you give me a simple re-write example so I can make sure I understand you properly? Do you mean something like this:

WHERE [DateTime] < @MaxBuffer_plus48

(The code is running in production - which I inherited. But I will be able to change it if it makes it run better.)

Many thanks again.

Dave.







Post #1395917
Posted Wednesday, December 12, 2012 8:15 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 3:22 PM
Points: 3,678, Visits: 5,177
WHERE [DateTime] < DATEADD(hour, 48, @MaxBuffer), or something similar to that. Could be >, didn't dig too deeply into it.

Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1395961
Posted Wednesday, December 12, 2012 10:42 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, June 13, 2013 7:03 AM
Points: 2,562, Visits: 3,453
david.moule (12/12/2012)
The code is running in production - which I inherited. But I will be able to change it if it makes it run better.
three things you need to keep in mind whenever you have this kind of stituation.

1) DONT use any function on columns involve in filter like WHERE , group by , order by etc (LIKE where substring ( CurrentDate, 1 , Len(Currentdate) = 23 )
2) USe batch approach while working on high volumed DML operations like

CREATE TABLE tab1
( col1 INT
, col2 CHAR(1)
)

INSERT INTO tab1 VALUES (1, 'A')
INSERT INTO tab1 VALUES (1, 'B')
INSERT INTO tab1 VALUES (1, 'C')
INSERT INTO tab1 VALUES (1, 'D')
INSERT INTO tab1 VALUES (1, 'E')
INSERT INTO tab1 VALUES (8, 'F')
INSERT INTO tab1 VALUES (9, 'G')

----------------------------

BEGIN

DECLARE @N INT -- Number of rows to delete per batch
DECLARE @cnt decimal(18,2) -- Total count of rows matching specified criterion
DECLARE @loops INT -- Number of times the DELETE statement should loop to delete all relevent records

SET @N = 2

SELECT @cnt = COUNT(*) FROM tab1 WHERE col1 = 1

SET @loops = CEILING(@cnt/@N)

WHILE @loops > 0
BEGIN
DELETE TOP (@N) FROM tab1 WHERE col1 = 1
SET @loops = @loops - 1
END

END

-----------------

SELECT * FROM tab1

DROP TABLE tab1

3) Last ..Always try to do these kind of task on off-peak hours


-------Bhuvnesh----------
While 1 = 1 (Learning SQL....)
Click to get fast response of your post
Post #1395988
Posted Thursday, December 13, 2012 5:02 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, June 06, 2013 1:14 PM
Points: 72, Visits: 347
Great, thanks, job done.

Sarging the WHERE clause to:

WHERE [DateTime] < DATEADD(hour, 48, @MaxBuffer)

did the trick and gave an index seek rather than scan, right upto "top(90,000)" (and no parallelism node appearing).

Only when I try "top(900,000)" does it go back to index scan. But still does not give a parallelism node

(I wonder how many more cases are lurking in my inherited system.....)

Regards,
Dave


Post #1396126
Posted Thursday, December 13, 2012 5:14 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, June 13, 2013 7:03 AM
Points: 2,562, Visits: 3,453
david.moule (12/13/2012)
Only when I try "top(900,000)" does it go back to index scan. But still does not give a parallelism node
NOt always parallelism is bad , sometimes you urself choose intentionally (MAXDOP) so that multiple cpu(threads) execute the query
see the link http://www.sqlservercentral.com/blogs/microsoft-business-intelligence-and-data-warehousing/2012/04/15/massively-parallel-processing-and-the-parallel-data-warehouse/
and
http://www.sqlservercentral.com/blogs/the-smiling-dba/2012/06/18/maxdop-and-cost-threshold-for-parallelism-an-example-for-a-parallel-query/


-------Bhuvnesh----------
While 1 = 1 (Learning SQL....)
Click to get fast response of your post
Post #1396128
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse