Delete Records

Question

Delete Records

John Paul-702936

SSCrazy

Points: 2182
More actions
February 10, 2010 at 7:31 pm

#229932

Delete records from one table which do have a matching field in another table:
delete from Tableone where field1 in
(select field2 from TableTwo)
But script looks for only matching of Field1 and deletes the records..
But .. i want to write a script .. like .. for multplie feilds
Can some send a sample please

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Bru Medishetty SSC-Insane Points: 23606 More actions · Answer 1

Look at the article link in my signature, you need to post some sample data that would give ready data to test and come up with the solution.

how can we suggest the solution without knowing your table structure and your requirement (conditions).

Bru Medishetty

Blog -- LearnSQLWithBru

Join on Facebook Page Facebook.comLearnSQLWithBru

Twitter -- BruMedishetty

John Paul-702936 SSCrazy Points: 2182 More actions · Answer 2

Hi ,

I want to delete the records from table2 which are in the table1

there is a PK - Family ID.. and i have totally 30 Columns in each table

2 tables are same ..

Here is the Script i wrote

DELETE FROM table2

WHERE ([Family ID] IN

(SELECT [Family ID]FROM table1))

untill now working fine ...

But .. The Script i am looking for is ..

I want to delete the records from table2 which are in table1 ..mean matching all the columns

So .. i updated script like

Delete from table2

where exists (select * from table1)

Just want to know whether it is Correct or not ..please correct me if this is wrong Please ...

Hunterwood SSCrazy Points: 2074 More actions · Answer 3

John Paul-702936 (2/11/2010)
...
But .. The Script i am looking for is ..
I want to delete the records from table2 which are in table1 ..mean matching all the columns
So .. i updated script like
Delete from table2
where exists (select * from table1)
Just want to know whether it is Correct or not ..please correct me if this is wrong Please ...

Hi John Paul,

No, the example you give will not do what you expect it to do. The delete-statement you have written is "if there are any rows in table1, delete all rows in table 2". You need to add some conditions, linking the rows in table1 to the rows in table2:

delete from table2 t2

where exists (select top 1 1 from table1 t1 where t1.col1 = t2.col1 and t1.col2 = t2.col2 [...])

Note: Allways use TOP together with EXISTS to prevent the database engine to fetch all rows in table1 - which can be a lot of rows - when you only need to know that there are at least one...

/Markus

John Paul-702936 SSCrazy Points: 2182 More actions · Answer 4

Hi Mark .. thanks for reply ..

Why TOP 1 is used .. can you explain .. please

Hunterwood SSCrazy Points: 2074 More actions · Answer 5

Hi John,

It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

I you write:

select *

from OneTable

where EXISTS (select * from SecondTable where OneCol = AnotherCol)

The database engine will first get ALL rows in the SecondTable, then it will check if the result has any rows.

If you instead specify TOP 1:

select *

from OneTable

where EXISTS (select TOP 1 * from SecondTable where OneCol = AnotherCol)

The database engine will only fetch ONE row from the SecondTable, then it will check if the result has any rows.

The same thing is applicable for TSQL statements like:

IF ((select count(*) from myTable) > 0)

...

If you change this to:

IF (EXISTS (select top 1 * from myTable))

The database engine doesn´t have to count the rows, when you aren´t even interrested in how many there are...

If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

/Markus

Van Heghe Eddy SSCarpal Tunnel Points: 4449 More actions · Answer 6

Hi hunterwoord,

Are you sure thats the way SQLServer works.?

Cause i was thinking that when one was testing with IF EXISTS this would just be a logical question and that Sqlsvr would be smart enough to stop looking for more records once he had found one hence he knows the answer and the logical question could already be answered.

Please confirmation from an expert on this, would be really appreciated

.

Cause if this is true i have a lot of sql procedures to reconsider.

On the other hand i have statements including this logical question

like :

Select x, y, z= case when exists( select Key from table where a=b) then 'true' else 'false' end

from table1 where ...

And this logical question referencing a 100k+ table without any penalty, runs in 0.003 seconds

I will on the other hand always make sure that for the IF EXISTS i always select 1 field only and mostly this would be the primary key (indexed)

What i on the other hand always do is using SELECT TOP 1 when i want to fill up a variable

Select TOP 1 @MyVar=Field1 from table where..

for witch i know it could be dangerous without the use of select top1

Wkr,

Eddy

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

Hunterwood (2/16/2010)
It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.

Got an example that proves that?

I you write:
select *
from OneTable
where EXISTS (select * from SecondTable where OneCol = AnotherCol)
The database engine will first get ALL rows in the SecondTable, then it will check if the result has any rows.

No it will not.

Using an example in my testing DB... (250000 rows in BigTable, 3819 in SmallerTable)

SELECT * FROM SmallerTable --(ALL the rows in SmallerTable)

Table 'SmallerTable'. Scan count 1, logical reads 14, physical reads 0

Hence I can conclude that a read of ALL the rows in SmallerTable does 14 logical reads

SELECT *

FROM dbo.BigTable

WHERE EXISTS (SELECT * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0

Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

Only 9 logical reads this time, so it has not read the entire of SmallerTable

If you instead specify TOP 1:
select *
from OneTable
where EXISTS (select TOP 1 * from SecondTable where OneCol = AnotherCol)
The database engine will only fetch ONE row from the SecondTable, then it will check if the result has any rows.

SELECT *

FROM dbo.BigTable

WHERE EXISTS (SELECT TOP(1) * FROM dbo.SmallerTable WHERE dbo.BigTable.SomeColumn = dbo.SmallerTable.LookupColumn)

Table 'BigTable'. Scan count 1, logical reads 3639, physical reads 0.

Table 'SmallerTable'. Scan count 1, logical reads 9, physical reads 0

Identical number of logical reads to the case where I didn't have the TOP 1.

The same thing is applicable for TSQL statements like:
IF ((select count(*) from myTable) > 0)
...
If you change this to:
IF (EXISTS (select top 1 * from myTable))
The database engine doesn´t have to count the rows, when you aren´t even interrested in how many there are...

The optimiser is smart enough to realise that when you say IF ((select count(*) from myTable) > 0) it doesn't have to count the rows anyway.

SELECT COUNT(*) FROM dbo.BigTable

Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

Hence we can conclude that counting all the rows in BigTable requires 438 logical reads

IF (SELECT COUNT(*) FROM dbo.BigTable) > 0

PRINT 'Rows!'

Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

With only 3 logical reads, that did not count all the rows in the table

IF EXISTS (SELECT * FROM dbo.BigTable)

PRINT 'Rows!'

Table 'BigTable'. Scan count 1, logical reads 3, physical reads 0

Exactly the same number of reads as for the IF... COUNT

Now, if you count the rows, assign that to a variable and then check if the variable is > 0, there is a performance difference as, in that case, SQL does have to count the rows as it can't tell what else you may do with the variable later.

DECLARE @i INT

SELECT @i = COUNT(*) FROM dbo.BigTable

IF @i > 0

PRINT 'Rows!'

This construct does count all the rows in the table, and the IO stats proves that

Table 'BigTable'. Scan count 1, logical reads 438, physical reads 0

If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂

A good habit is to test alternatives and see exactly what the performance difference really is, if there's one in the first place.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

eddy-644184 (2/20/2010)
I will on the other hand always make sure that for the IF EXISTS i always select 1 field only and mostly this would be the primary key (indexed)

EXISTS doesn't care about the columns. The column references in the SELECT inside an EXISTS are ignored by SQL, it knows that all you want is to tell if the row exists or not, and column values aren't important there.

EXISTS (SELECT * FROM Table2 ...)

EXISTS (SELECT pk_field FROM Table2 ...)

EXISTS (SELECT 1 FROM Table2 ...)

are all treated identically.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Van Heghe Eddy SSCarpal Tunnel Points: 4449 More actions · Answer 9

Tnx Gila,

If exist ( Select confirmation from expert)

Do not rewrite you procedurs

else

AI AI AI

You made my day

Didnt know that about the select * vs the select Key

But i try to minimize the use of select *, so its kinda a habit for just typing what i really need 😀

Could you just confirm my statement on the use of select TOP 1 @PARAM = Value or am i wrong there .?

Wkr

Eddy

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

eddy-644184 (2/20/2010)
Didnt know that about the select * vs the select Key
But i try to minimize the use of select *, so its kinda a habit for just typing what i really need 😀

I tend to write EXISTS (SELECT 1 FROM SomeTable ...), to make it clear that no columns are returned or necessary

Could you just confirm my statement on the use of select TOP 1 @PARAM = Value or am i wrong there .?

It'll work without the TOP 1, if the query returns more than 1 row, you'll get one of them, it isn't guaranteed which one. But then, if you use TOP 1 without an order by, it isn't guaranteed which one you'll get either.

SET @var = (SELECT value FROM ...) will throw an error if the query returns more than one row.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 11

John Paul -

SQL Server allows you to use joins in your DELETE statements, so perhaps having EXISTS is not necessary.

Assuming the primary key in both tables is Pkey, you could use something like:

DELETE Table1

from

Table1 inner join

(Select * from Table1

INTERSECT

SELECT * from Table2

) DupeList on Table1.pkey=DupeList.Pkey

Note: you want to use EXPLICIT column lists instead of * in the INTERSECT statement, just to make sure that the columns are appropriately lined up with each other.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Hunterwood SSCrazy Points: 2074 More actions · Answer 12

Hi Gail,

GilaMonster (2/20/2010)
Hunterwood (2/16/2010)
It´s allways a good rule of thumb to use TOP 1 together with EXISTS, because it prevents the database engine from doing unneccesary work.
Got an example that proves that?

No. I don't. But in some situations when trying to optimize queries, I have seen better performace when changing from "exists (select * from..." to "exists (select TOP 1 1 from ...". Don't have any specific example, though.

My conclusion is also based on the article "Time Bomb Coding" by David Poole: http://www.sqlservercentral.com/articles/Performance+Tuning/69337/, wich addresses this topic (among others).

If the databases are small the performance difference might not be so huge, but it´s allways a good habbit. 🙂
A good habit is to test alternatives and see exactly what the performance difference really is, if there's one in the first place.

Your'e right. Never knows what works best in a specific case without testing alternatives.

I obviously was too fast to make my conclusion that it's allways better. Thanks for the good analyze and example!

What is your opinion on the article?

Have you ever seen a performance problem using the code bellow (copied from the article)?

IF ( SELECT COUNT(*) FROM .... WHERE ...) > 0

BEGIN

...etc

END

/Markus

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 13

Hunterwood (2/23/2010)
Have you ever seen a performance problem using the code bellow (copied from the article)?
IF ( SELECT COUNT(*) FROM .... WHERE ...) > 0
BEGIN
...etc
END

Not on SQL 2005/2008. It may be that there's a difference in earlier versions that has since changed. In 2005/2008, the parser/optimiser is smart enough to realise that the above construct is an EXISTS, phrased differently.

If the count and the IF are separate statements, then there is an obvious and blatant performance difference

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass