Sql Duplicate Values

Question

Sql Duplicate Values

kanna.brindavan

Grasshopper

Points: 13
More actions
June 26, 2016 at 3:05 am

#327663

Hu gurus,
I am inserting values through xml. The following is the stored procedure i am using.
i do not want to insert a duplicate id or name in the db table. (assume the table is empty)
when i try to insert the duplicate id or name, then it is inserting the values. how to avoid inserting duplicate values in table.
Note : please try this with empty rows in table.
CREATE TABLE [Employee](
[Id] int primary key,
[FirstName] nvarchar(100)
)
procedure:
--------------
CREATE PROCEDURE SP_Insert_MultipleRows
@xml XML,
@retValue nvarchar(100) OUTPUT
As
BEGIN
SET @retValue='Failed'
INSERT INTO [Employee](
[Id],
[FirstName]
)
SELECT ID,
FirstName
FROM
(
SELECT DISTINCT
COALESCE([Table].[Column].value('ID[1]', 'int'),0) as 'ID',
[Table].[Column].value('FirstName[1]', 'Nvarchar(100)') as 'FirstName'
FROM @xml.nodes('/Customers/customer') as [Table]([Column])
)t
WHERE not EXISTS ( SELECT 1
FROM Employee
WHERE ID = t.ID OR firstName = t.FirstName)
END
sp execute
-----------
Declare @retValue1 varchar(50);
Declare @XmlStr XML;
SET @XmlStr= N'<Customers>
<customer>
<ID>1</ID>
<FirstName>fname1</FirstName>
<LastName>Lname1</LastName>
<Company>DEF</Company>
</customer>
<customer>
<ID>1</ID>
<FirstName>fname2</FirstName>
<LastName>Lname2</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>3</ID>
<FirstName>fname3</FirstName>
<LastName>Lname3</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>4</ID>
<FirstName>fname3</FirstName>
<LastName>Lname3</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>5</ID>
<FirstName>fname5</FirstName>
<LastName>Lname5</LastName>
<Company>ABC</Company>
</customer>
</Customers>';
EXEC [SP_Insert_MultipleRows] @xml=@XmlStr,@retValue=@retValue1 OUTPUT
print @retValue1
In xml node 1,2 id are duplicated, and in node 3,4 FirstName is duplicated, so only the node 5 should be inserted.
in the sstored proc, i am checking the condition with NOT exists (but still duplicate values are inserted). please help me with this.
output:
-------------
idfirstName
1fname1
1fname2
3fname3
4fname3
5fname5
required result:
-------------------
id firstName
5 fname5

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 1

Apparently, the data quality of the XML you've been provided to work with is at an all time low. It seems to indicate the company has no clue what the proper name for EmployeeID X actually is nor what the ID for some given employee name is. It seems totally arbitrary to reject any duplicates either by ID or by Name.

Would you actually want your name to be "deleted" from the mix because of this mistake?

My recommendation would be to take the XML back to the people that created the problem and tell them the extreme danger they've managed to create in the XML. You are, after all, messing with employee data.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Eirikur Eiriksson SSC Guru Points: 182881 More actions · Answer 2

Must say that I agree with Jeff here, this seems to be very fragile and error prone, ie. what if two customers have the same name?

😎

The deduplication part is really the easy part, here is a quick example

DECLARE @XmlStr XML = N'<Customers>

<customer>

<ID>1</ID>

<FirstName>fname1</FirstName>

<LastName>Lname1</LastName>

<Company>DEF</Company>

</customer>

<customer>

<ID>1</ID>

<FirstName>fname2</FirstName>

<LastName>Lname2</LastName>

<Company>ABC</Company>

</customer>

<customer>

<ID>3</ID>

<FirstName>fname3</FirstName>

<LastName>Lname3</LastName>

<Company>ABC</Company>

</customer>

<customer>

<ID>4</ID>

<FirstName>fname3</FirstName>

<LastName>Lname3</LastName>

<Company>ABC</Company>

</customer>

<customer>

<ID>5</ID>

<FirstName>fname5</FirstName>

<LastName>Lname5</LastName>

<Company>ABC</Company>

</customer>

</Customers>';

;WITH BASE_DATA AS

(

SELECT

CUSTOMER.DATA.value('(ID/text())[1]' ,'INT' ) AS ID

,CUSTOMER.DATA.value('(FirstName/text())[1]' ,'NVARCHAR(100)') AS FirstName

,CUSTOMER.DATA.value('(LastName/text())[1]' ,'NVARCHAR(100)') AS LastName

,CUSTOMER.DATA.value('(Company/text())[1]' ,'NVARCHAR(100)') AS Company

FROM @XmlStr.nodes('Customers/customer') AS CUSTOMER(DATA)

)

,DUPES_FLAGGED AS

(

SELECT

-- Add one of these for each duplication rule

-- Duplicate ID values

COUNT(*) OVER

(

PARTITION BY BD.ID

) - 1

+

-- Duplicate FirstName & LastName values

COUNT(*) OVER

(

PARTITION BY BD.FirstName

,BD.LastName

) - 1 AS IS_DUPLICATE

,BD.ID

,BD.FirstName

,BD.LastName

,BD.Company

FROM BASE_DATA BD

)

SELECT

DF.ID

,DF.FirstName

,DF.LastName

,DF.Company

FROM DUPES_FLAGGED DF

WHERE DF.IS_DUPLICATE = 0;

;

Output

ID FirstName LastName Company

--- ----------- ----------- --------

5 fname5 Lname5 ABC

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 3

Eirikur Eiriksson (6/27/2016)
Must say that I agree with Jeff here, this seems to be very fragile and error prone, ie. what if two customers have the same name?
😎
The deduplication part is really the easy part, here is a quick example
DECLARE @XmlStr XML = N'<Customers>
<customer>
<ID>1</ID>
<FirstName>fname1</FirstName>
<LastName>Lname1</LastName>
<Company>DEF</Company>
</customer>
<customer>
<ID>1</ID>
<FirstName>fname2</FirstName>
<LastName>Lname2</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>3</ID>
<FirstName>fname3</FirstName>
<LastName>Lname3</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>4</ID>
<FirstName>fname3</FirstName>
<LastName>Lname3</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>5</ID>
<FirstName>fname5</FirstName>
<LastName>Lname5</LastName>
<Company>ABC</Company>
</customer>
</Customers>';
;WITH BASE_DATA AS
(
SELECT
CUSTOMER.DATA.value('(ID/text())[1]' ,'INT' ) AS ID
,CUSTOMER.DATA.value('(FirstName/text())[1]' ,'NVARCHAR(100)') AS FirstName
,CUSTOMER.DATA.value('(LastName/text())[1]' ,'NVARCHAR(100)') AS LastName
,CUSTOMER.DATA.value('(Company/text())[1]' ,'NVARCHAR(100)') AS Company
FROM @XmlStr.nodes('Customers/customer') AS CUSTOMER(DATA)
)
,DUPES_FLAGGED AS
(
SELECT
-- Add one of these for each duplication rule
-- Duplicate ID values
COUNT(*) OVER
(
PARTITION BY BD.ID
) - 1
+
-- Duplicate FirstName & LastName values
COUNT(*) OVER
(
PARTITION BY BD.FirstName
,BD.LastName
) - 1 AS IS_DUPLICATE
,BD.ID
,BD.FirstName
,BD.LastName
,BD.Company
FROM BASE_DATA BD
)
SELECT
DF.ID
,DF.FirstName
,DF.LastName
,DF.Company
FROM DUPES_FLAGGED DF
WHERE DF.IS_DUPLICATE = 0;
;
Output
ID FirstName LastName Company
--- ----------- ----------- --------
5 fname5 Lname5 ABC

Nice code, as always, Eirikur, but I'm curious why you would help someone do something that you agree they should not do. :blink: I hope the company the OP works for doesn't suffer a problem because of it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Eirikur Eiriksson SSC Guru Points: 182881 More actions · Answer 4

Jeff Moden (6/27/2016)
Nice code, as always, Eirikur, but I'm curious why you would help someone do something that you agree they should not do. :blink: I hope the company the OP works for doesn't suffer a problem because of it.

What the OP had posted was not sufficient for enforcing even the broken/wrong business rules, chances are that OP's company would suffer even more without the assistance. At least now the OP has been warned and provided with a proper way of enforcing the logic.

😎

Jeff Moden SSC Guru Points: 1003853 More actions · Answer 5

Warned but armed.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)