Performance tuning a self Join

  • I have this query where a delete operation is performed on a table.

    Delete from table where ID in (SELECT ID+1 from table where somecolumn='something')

    How can I optimize this query for better performance ?

  • You could use the EXISTS statement:

    DELETE FROM TableA WHERE EXISTS (SELECT 1 FROM TableB WHERE TableA.Key = TableB.Key)

  • also put rowlock on the table from which data is deleted

    put nolock on the tableB (referenced table)

    Pramod
    SQL Server DBA | MCSE SQL Server 2012/2014

    in.linkedin.com/in/pramodsingla/
    http://pramodsingla.wordpress.com/

  • Why do you need a self join? Assuming the sub query is the same table as the one you're deleting from (as per your example) you just need a straight forward DELETE...WHERE:

    Delete from table where somecolumn='something'

  • dkschill (1/12/2011)


    You could use the EXISTS statement:

    DELETE FROM TableA WHERE EXISTS (SELECT 1 FROM TableB WHERE TableA.Key = TableB.Key)

    This is just plain wrong. If TableA and TableB are the same tables, you might end up deleting all the rows.

    Furthermore, the constraint somecolumn='something' isn't present in your query.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I was trying to demonstrate the EXISTS syntax as a replacement for IN.

  • dkschill (1/13/2011)


    I was trying to demonstrate the EXISTS syntax as a replacement for IN.

    That will only work if you are looking for the existence of rows that have a particular ID (returns true or false), instead of when you are looking for the actual IDs (what IN does in this case).

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Koen (da-zero) (1/13/2011)


    dkschill (1/12/2011)


    You could use the EXISTS statement:

    DELETE FROM TableA WHERE EXISTS (SELECT 1 FROM TableB WHERE TableA.Key = TableB.Key)

    This is just plain wrong. If TableA and TableB are the same tables, you might end up deleting all the rows.

    Furthermore, the constraint somecolumn='something' isn't present in your query.

    Sorry guys for the confusion. The query is given below.

    Delete from Table123 where id IN

    (Select ID+1 from Table123 where somecolumn='something')

  • This is just plain wrong. If TableA and TableB are the same tables, you might end up deleting all the rows.

    Furthermore, the constraint somecolumn='something' isn't present in your query.[/quote]

    I think the above poster is right. I dont think exists will work. Can somebody post the right query for the best performance.

  • Your query can't get any simpler for better performance. But HowardW has a point, why use the subselect at all?

    ______________________________________________________________________________
    How I want a drink, alcoholic of course, after the heavy lectures involving quantum mechanics.

  • If you want to keep the first record, keep the query as it is; if you want to use the last one in the set of dupes, use MAX(). The FieldWithDupes fields are for you to include any fields that you would need to determine if the record is a duplicate. Obviously, you need an identity-type column for this to work.

    DELETE FROM

    t1

    FROM

    MyTable t1

    INNER JOIN

    (

    SELECT

    MIN(FieldID) AS FieldID,

    FieldWithDupes1,

    FieldWithDupes2,

    FieldWithDupes3...

    FROM

    MyTable

    GROUP BY

    FieldWithDupes1,

    FieldWithDupes2,

    FieldWithDupes3...

    HAVING

    COUNT(*) > 1

    ) t2

    ON(

    t1.FieldWithDupes1 = t2.FieldWithDupes1

    AND t1.FieldWithDupes2 = t2.FieldWithDupes2

    AND t1.FieldWithDupes3 = t2.FieldWithDupes3...

    AND t1.FieldID <> t2.FieldID

    )

  • What is the script suppose to accomplish? From the way I read it you are trying to delete all rows that exists directly after (ID+1) a row that contains 'something'. Is that correct?

    Delete from Table123 where id IN

    (Select ID+1 from Table123 where somecolumn='something')

  • Keep in mind your "ID+1" solution will only work if you don't have gaps in your ID sequence.

    If you are using the identity property for this column, you may have gaps in your sequence because of rollbacks that occurred or others having manually deleted rows.

    Strange you don't have secondary common column values to determine you are actually dropping related rows.

    another alternative for writing your current delete query :

    set statistics io, time on;

    /* alternative */

    delete from Table123 T

    where exists (Select 1

    from Table123 T1

    where T1.somecolumn='something' and T1.ID+1 = T.ID)

    /* alternative */

    Delete T

    from Table123 T

    inner join Table123 T1

    on T1.ID+1 = T.ID

    where T1.somecolumn='something'

    TEST IT __ TEST IT

    compare the different solutions for performance

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply