
Best Way to avoid duplicate rows insertion
Posted Thursday, December 13, 2012 2:56 AM
Forum Newbie


Group: General Forum Members
Last Login: Friday, December 06, 2013 3:10 AM
Points: 2, Visits: 49
Hi,

I am using SQL Server 2008 Enterprise Edition. I need to insert a large number of rows into my table frequently. The table's primary key column holds a hash value computed from three of the row's columns. When I use bcp, the process terminates because of duplicate rows in my data. What is the best way to insert large amounts of data into the table without duplication? It should be about as fast as bcp.

Thanks in advance.
Post #1396074
Posted Thursday, December 13, 2012 3:01 AM
SSC Veteran


Group: General Forum Members
Last Login: Friday, April 11, 2014 7:45 AM
Points: 251, Visits: 538
WITH CTE (
    [ColumnName],
    [DuplicateCount]
)
AS (
    SELECT
        [ColumnName],
        ROW_NUMBER() OVER (PARTITION BY [ColumnName]
                           ORDER BY [ColumnName]) AS DuplicateCount
    FROM sample_table
)
DELETE FROM CTE
WHERE DuplicateCount > 1;



I think this is an efficient way.
Post #1396076
Posted Thursday, December 13, 2012 3:09 AM
Forum Newbie


Group: General Forum Members
Last Login: Friday, December 06, 2013 3:10 AM
Points: 2, Visits: 49
But I don't want to delete the records after insertion. Instead, I want to skip the duplicate rows while inserting a large number of rows.
Post #1396082
Posted Thursday, December 13, 2012 3:19 AM
SSC Eights!


Group: General Forum Members
Last Login: 2 days ago @ 7:37 AM
Points: 860, Visits: 2,323
If the data is already in the database then you should be able to do

WITH CTE
AS
(
    SELECT
        ROW_NUMBER() OVER (PARTITION BY UniqueColumn ORDER BY UniqueColumn) AS rn,
        <columnlist>
    FROM myTable
)
INSERT INTO DestTable (<columnlist>)
SELECT <columnlist>
FROM CTE
WHERE rn = 1;

If it's not, then you may have to preload it.
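If the data is coming from a file, a rough sketch of that preload approach (StagingTable, DestTable, the file path, and column names here are only placeholders) would be:

```sql
-- Sketch only: load the raw file into a staging table first,
-- then copy across just the rows whose key is not already present.
BULK INSERT StagingTable
FROM 'C:\import\data.dat'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

INSERT INTO DestTable (HashKey, Col1, Col2, Col3)
SELECT s.HashKey, s.Col1, s.Col2, s.Col3
FROM StagingTable AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM DestTable AS d
    WHERE d.HashKey = s.HashKey
);
```

If the file itself can contain duplicates, you would also dedupe the staging table (e.g. with ROW_NUMBER) before the final insert.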


_________________________________________________________________________
SSC Guide to Posting and Best Practices
Post #1396087
Posted Thursday, December 13, 2012 4:48 AM
SSC Veteran


Group: General Forum Members
Last Login: Friday, April 11, 2014 7:45 AM
Points: 251, Visits: 538
The CTE can be modified to select only the non-duplicate values.

Please refer to:
http://msdn.microsoft.com/en-us/library/ms190766(v=sql.105).aspx
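For example (illustrative names), a CTE that numbers the rows in each key group and keeps only the first of each:

```sql
WITH CTE AS (
    SELECT
        [ColumnName],
        ROW_NUMBER() OVER (PARTITION BY [ColumnName]
                           ORDER BY [ColumnName]) AS rn
    FROM sample_table
)
SELECT [ColumnName]
FROM CTE
WHERE rn = 1;  -- one row per key, duplicates filtered out
```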
Post #1396124
Posted Thursday, December 13, 2012 8:28 PM


SSC-Dedicated


Group: General Forum Members
Last Login: Yesterday @ 11:47 PM
Points: 35,959, Visits: 30,252
saravanakumar-485245 (12/13/2012)
But I don't want to delete the records after insertion. Instead, I want to skip the duplicate rows while inserting a large number of rows.


You could change your PK to "Ignore Dupes" (the IGNORE_DUP_KEY index option).

The real key is that I would never, ever import data directly to its final resting spot. I ALWAYS use a staging table because I ALWAYS validate the data first. If there are dupes present, it's good to know what they are and how many there are. You might also find out that what you're calling a duplicate really isn't, and that you may have selected an incorrect set of columns for the PK.
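As a sketch of the "Ignore Dupes" idea (table and column names here are made up), the IGNORE_DUP_KEY option on the primary key makes SQL Server discard duplicate-key rows with a warning instead of failing the whole insert:

```sql
-- Illustrative only: with IGNORE_DUP_KEY = ON, duplicate-key rows
-- are dropped with "Duplicate key was ignored." instead of
-- aborting the batch.
CREATE TABLE DestTable (
    HashKey VARBINARY(32) NOT NULL,
    Col1    INT NULL,
    CONSTRAINT PK_DestTable
        PRIMARY KEY (HashKey) WITH (IGNORE_DUP_KEY = ON)
);
```

Note that the warning doesn't tell you which rows were discarded, which is one more reason a staging table is useful for auditing.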


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1396462