Delete Records

  • Delete records from one table which do have a matching field in another table:

    delete from Tableone where field1 in

    (select field2 from TableTwo)

    But script looks for only matching of Field1 and deletes the records..

    But .. i want to write a script .. like .. for multplie feilds

    Can some send a sample please

  • Look at the article link in my signature, you need to post some sample data that would give ready data to test and come up with the solution.

    how can we suggest the solution without knowing your table structure and your requirement (conditions).


    Bru Medishetty

    Blog -- LearnSQLWithBru

    Join on Facebook Page Facebook.comLearnSQLWithBru

    Twitter -- BruMedishetty

  • Hi ,

    I want to delete the records from table2 which are in the table1

    there is a PK - Family ID.. and i have totally 30 Columns in each table

    2 tables are same ..

    Here is the Script i wrote

    DELETE FROM table2

    WHERE ([Family ID] IN

    (SELECT [Family ID]FROM table1))

    untill now working fine ...

    But .. The Script i am looking for is ..

    I want to delete the records from table2 which are in table1 ..mean matching all the columns

    So .. i updated script like

    Delete from table2

    where exists (select * from table1)

    Just want to know whether it is Correct or not ..please correct me if this is wrong Please ...

  • John Paul-702936 (2/11/2010)


    ...

    But .. The Script i am looking for is ..

    I want to delete the records from table2 which are in table1 ..mean matching all the columns

    So .. i updated script like

    Delete from table2

    where exists (select * from table1)

    Just want to know whether it is Correct or not ..please correct me if this is wrong Please ...

    Hi John Paul,

    No, the example you give will not do what you expect it to do. The delete-statement you have written is "if there are any rows in table1, delete all rows in table 2". You need to add some conditions, linking the rows in table1 to the rows in table2:

    delete from table2 t2

    where exists (select top 1 1 from table1 t1 where t1.col1 = t2.col1 and t1.col2 = t2.col2 [...])

    Note: Allways use TOP together with EXISTS to prevent the database engine to fetch all rows in table1 - which can be a lot of rows - when you only need to know that there are at least one...

    /Markus

  • Hi Mark .. thanks for reply ..

    Why TOP 1 is used .. can you explain .. please

  • Hi John,

    It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

    I you write:

    select *

    from OneTable

    where EXISTS (select * from SecondTable where OneCol = AnotherCol)

    The database engine will first get ALL rows in the SecondTable, then it will check if the result has any rows.

    If you instead specify TOP 1:

    select *

    from OneTable

    where EXISTS (select TOP 1 * from SecondTable where OneCol = AnotherCol)

    The database engine will only fetch ONE row from the SecondTable, then it will check if the result has any rows.

    The same thing is applicable for TSQL statements like:

    IF ((select count(*) from myTable) > 0)

    ...

    If you change this to:

    IF (EXISTS (select top 1 * from myTable))

    The database engine doesn´t have to count the rows, when you aren´t even interrested in how many there are...

    If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

    /Markus

  • Hi hunterwoord,

    Are you sure thats the way SQLServer works.?

    Cause i was thinking that when one was testing with IF EXISTS this would just be a logical question and that Sqlsvr would be smart enough to stop looking for more records once he had found one hence he knows the answer and the logical question could already be answered.

    Please confirmation from an expert on this, would be really appreciated

    .

    Cause if this is true i have a lot of sql procedures to reconsider.

    On the other hand i have statements including this logical question

    like :

    Select x, y, z= case when exists( select Key from table where a=b) then 'true' else 'false' end

    from table1 where ...

    And this logical question referencing a 100k+ table without any penalty, runs in 0.003 seconds

    I will on the other hand always make sure that for the IF EXISTS i always select 1 field only and mostly this would be the primary key (indexed)

    What i on the other hand always do is using SELECT TOP 1 when i want to fill up a variable

    Select TOP 1 @MyVar=Field1 from table where..

    for witch i know it could be dangerous without the use of select top1

    Wkr,

    Eddy

  • Hunterwood (2/16/2010)


    It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

    Got an example that proves that?

    I you write:

    select *

    from OneTable

    where EXISTS (select * from SecondTable where OneCol = AnotherCol)

    The database engine will first get ALL rows in the SecondTable, then it will check if the result has any rows.

    No it will not.

    Using an example in my testing DB... (250000 rows in BigTable, 3819 in SmallerTable)

    SELECT * FROM SmallerTable --(ALL the rows in SmallerTable)

    Table 'SmallerTable'. Scan count 1, logical reads 14, physical reads 0

    Hence I can conclude that a read of ALL the rows in SmallerTable does 14 logical reads

    SELECT *

    FROM dbo.BigTable

    WHERE EXISTS (SELECT * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

    Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0

    Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

    Only 9 logical reads this time, so it has not read the entire of SmallerTable

    If you instead specify TOP 1:

    select *

    from OneTable

    where EXISTS (select TOP 1 * from SecondTable where OneCol = AnotherCol)

    The database engine will only fetch ONE row from the SecondTable, then it will check if the result has any rows.

    SELECT *

    FROM dbo.BigTable

    WHERE EXISTS (SELECT TOP(1) * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

    Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0.

    Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

    Identical number of logical reads to the case where I didn't have the TOP 1.

    The same thing is applicable for TSQL statements like:

    IF ((select count(*) from myTable) > 0)

    ...

    If you change this to:

    IF (EXISTS (select top 1 * from myTable))

    The database engine doesn´t have to count the rows, when you aren´t even interrested in how many there are...

    The optimiser is smart enough to realise that when you say IF ((select count(*) from myTable) > 0) it doesn't have to count the rows anyway.

    SELECT COUNT(*) FROM dbo.BigTable

    Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

    Hence we can conclude that counting all the rows in BigTable requires 438 logical reads

    IF (SELECT COUNT(*) FROM dbo.BigTable) > 0

    PRINT 'Rows!'

    Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

    With only 3 logical reads, that did not count all the rows in the table

    IF EXISTS (SELECT * FROM dbo.BigTable)

    PRINT 'Rows!'

    Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

    Exactly the same number of reads as for the IF... COUNT

    Now, if you count the rows, assign that to a variable and then check if the variable is > 0, there is a performance difference as, in that case, SQL does have to count the rows as it can't tell what else you may do with the variable later.

    DECLARE @i INT

    SELECT @i = COUNT(*) FROM dbo.BigTable

    IF @i > 0

    PRINT 'Rows!'

    This construct does count all the rows in the table, and the IO stats proves that

    Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

    If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

    A good habit is to test alternatives and see exactly what the performance difference really is, if there's one in the first place.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • eddy-644184 (2/20/2010)


    I will on the other hand always make sure that for the IF EXISTS i always select 1 field only and mostly this would be the primary key (indexed)

    EXISTS doesn't care about the columns. The column references in the SELECT inside an EXISTS are ignored by SQL, it knows that all you want is to tell if the row exists or not, and column values aren't important there.

    EXISTS (SELECT * FROM Table2 ...)

    EXISTS (SELECT pk_field FROM Table2 ...)

    EXISTS (SELECT 1 FROM Table2 ...)

    are all treated identically.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Tnx Gila,

    If exist ( Select confirmation from expert)

    Do not rewrite you procedurs

    else

    AI AI AI

    You made my day

    Didnt know that about the select * vs the select Key

    But i try to minimize the use of select *, so its kinda a habit for just typing what i really need 😀

    Could you just confirm my statement on the use of select TOP 1 @PARAM = Value or am i wrong there .?

    Wkr

    Eddy

  • eddy-644184 (2/20/2010)


    Didnt know that about the select * vs the select Key

    But i try to minimize the use of select *, so its kinda a habit for just typing what i really need 😀

    I tend to write EXISTS (SELECT 1 FROM SomeTable ...), to make it clear that no columns are returned or necessary

    Could you just confirm my statement on the use of select TOP 1 @PARAM = Value or am i wrong there .?

    It'll work without the TOP 1, if the query returns more than 1 row, you'll get one of them, it isn't guaranteed which one. But then, if you use TOP 1 without an order by, it isn't guaranteed which one you'll get either.

    SET @var = (SELECT value FROM ...) will throw an error if the query returns more than one row.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • John Paul -

    SQL Server allows you to use joins in your DELETE statements, so perhaps having EXISTS is not necessary.

    Assuming the primary key in both tables is Pkey, you could use something like:

    DELETE Table1

    from

    Table1 inner join

    (Select * from Table1

    INTERSECT

    SELECT * from Table2

    ) DupeList on Table1.pkey=DupeList.Pkey

    Note: you want to use EXPLICIT column lists instead of * in the INTERSECT statement, just to make sure that the columns are appropriately lined up with each other.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Hi Gail,

    GilaMonster (2/20/2010)


    Hunterwood (2/16/2010)


    It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

    Got an example that proves that?

    No. I don't. But in some situations when trying to optimize queries, I have seen better performace when changing from "exists (select * from..." to "exists (select TOP 1 1 from ...". Don't have any specific example, though.

    My conclusion is also based on the article "Time Bomb Coding" by David Poole: http://www.sqlservercentral.com/articles/Performance+Tuning/69337/, wich addresses this topic (among others).

    If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

    A good habit is to test alternatives and see exactly what the performance difference really is, if there's one in the first place.

    Your'e right. Never knows what works best in a specific case without testing alternatives.

    I obviously was too fast to make my conclusion that it's allways better. Thanks for the good analyze and example!

    What is your opinion on the article?

    Have you ever seen a performance problem using the code bellow (copied from the article)?

    IF ( SELECT COUNT(*) FROM .... WHERE ...) > 0

    BEGIN

    ...etc

    END

    /Markus

  • Hunterwood (2/23/2010)


    Have you ever seen a performance problem using the code bellow (copied from the article)?

    IF ( SELECT COUNT(*) FROM .... WHERE ...) > 0

    BEGIN

    ...etc

    END

    Not on SQL 2005/2008. It may be that there's a difference in earlier versions that has since changed. In 2005/2008, the parser/optimiser is smart enough to realise that the above construct is an EXISTS, phrased differently.

    If the count and the IF are separate statements, then there is an obvious and blatant performance difference

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply