June 11, 2013 at 8:16 am
I've taken a look at the scripts and forum posts on the site regarding finding duplicate rows, but I haven't been able to adapt them to my situation. I have data being pulled from Oracle and placed in a table in my sql server db. The table consists of a primary key identity field, a field containing the oracle ID, and 4 fields containing data. I need to find all duplicates, with a duplicate being defined as a match across all 4 data fields.
I've tried queries along the following lines, but they don't work.
SELECTMAX(id), MAX(oracle_id), field1, field2, field3, field4
FROMoracleRecords
GROUP BYfield1, field2, field3, field4
HAVING COUNT(*) > 1
If I understand how this is supposed to work, only duplicate rows should be returned. Is this the case? I've gotten it to work when comparing only one field, but when I try to compare more than one I break it. Any suggestions?
June 11, 2013 at 8:19 am
You might want to try this sort of query instead
SELECTid, oracle_id, field1, field2, field3, field4,
COUNT(*) OVER(PARTITION BY field1, field2, field3, field4) AS NumberOfDuplicates
FROMoracleRecords
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537June 12, 2013 at 11:06 am
That worked perfectly. Thanks!
June 12, 2013 at 11:49 am
One further question: Is it possible to identify the newest record of each group and delete the others automatically? One of the fields is a datestamp, but I don't understand how to isolate the newest records.
For example.
Field1 Field2 Field3 Field4
AAA AAA AAA 18 Dec 2012 <- delete
AAA AAA AAA 19 Dec 2012
BBB BBB BBB 1 Jan 2013 <- delete
BBB BBB BBB 3 Jan 2013 <- delete
BBB BBB BBB 6 Jan 2013
Is this possible through tsql?
June 13, 2013 at 1:41 am
WITH CTE AS (
SELECTid, oracle_id, field1, field2, field3, field4,
ROW_NUMBER() OVER(PARTITION BY field1, field2, field3 ORDER BY field4 DESC) AS rn
FROMoracleRecords)
DELETE FROM CTE
WHERE rn > 1;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy