Dual Duplicates redress.

Question

Post reply

Dual Duplicates redress.

jharvey6 76850

Old Hand

Points: 381
More actions
August 7, 2019 at 7:58 pm

#3669648
I have created a stored procedure that pulls personal information from several tables. The output is formatted into XML. I need to manipulate two of the elements to be. I have written a sub-query to return the duplicates within two data fields.I read the base query into
```
INSERT INTO #temp_dups SELECT [RowNumber], [Party], [EID], [BID], [Dups]
FROM (
  SELECT [RowNumber]
, [Party]
, [EID]
, [BID]
, COUNT(*) OVER (PARTITION BY [Party]) AS [Dups]
  FROM #temp) AS [D]
WHERE [D].[Dups] > 1
ORDER BY [RowNumber]
```
The data is typical of below:
RowNumber Party EID BID Dups
11 12345 LLS NP 2
12 12345 P3V TD 2
33 56789 UYL BS 2
34 56789 SFM NP 2
176 54080 D6V IY 3
177 54080 NFC NF 3
178 54080 USM NF 3

Duplicates for a Party can exist in both EID and BID. The minimum case of Dups is 2 and the maximum , so far, of 6 Dups.
But, obviously, there could be more in either EID or BID or any combination thereof.
From this data I need to
1. Return for each single Party, EID and BID as concatenations of the respective data but do not repeat any data.
2. Separate each value with a comma. no biggy.
3. Return only on set of data for each Party. Data representation below.
Sample of outputs from above.
Party EID BID
12345 LLS, P3V NP, TD
56789 UYL, SFM NP, BS
54080 D6V, NFC, USM IY, NF
I don't know what it is but I am completely brain-farting on this one. I have never manipulated duplicate data, only deleted it.

Any ideas on how to proceed?

Thanks in advance.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Site Owners SSC Guru Points: 80411 More actions · Answer 1

Thanks for posting your issue and hopefully someone will answer soon.

This is an automated bump to increase visibility of your question.

jharvey6 76850 Old Hand Points: 381 More actions · Answer 2

Thanks,

This one is difficult. I have gone down many dead end methods on my own. If I come up with it I will post solution. But I suspect it's going to be really ugly.

Jeff Moden SSC Guru Points: 1004704 More actions · Answer 3

If you would post your data in a readily consumable format, I'm sure someone would jump on this problem and provide a correct solution in about two heart beats. Please see the first link in my signature line below for one way to provide "Readily Consumable Data".

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

pietlinden SSC Guru Points: 63640 More actions · Answer 4

Looks like this works. Here's some data... this is what Jeff is talking about... readily consumable data (CREATE TABLE and INSERT SCRIPTS.)

use tempdb;
go
-- set up your table
CREATE TABLE #SomeData (
 RowNum INT,
 Party INT,
 EID CHAR(3),
 BID CHAR(2),
 Dups TINYINT
);
GO
-- insert data
INSERT INTO #SomeData VALUES 
(11,12345,'LLS','NP',2)
,(12,12345,'P3V','TD',2)
,(33,56789,'UYL','BS',2)
,(34,56789,'SFM','NP',2)
,(176,54080,'D6V','IY',3)
,(177,54080,'NFC','NF',3)
,(178,54080,'USM','NF',3);

Now folks can run that and recreate your problem... and solve it. I did it because I'm trying to get my head around STUFF() and FOR XML.

-- now someone can just copy & paste this
-- and run it to get your setup.
SELECT Party 
 , EIDs = TRIM(STUFF((
            SELECT ', ' + EID
            FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
            FOR XML PATH('')
            ), 1, 1, ''))
 , BIDs = TRIM(STUFF((
   SELECT ', ' + BID
   FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
   FOR XML PATH('')
   ), 1,1, ''))
FROM #SomeData sd2
GROUP BY Party;

This reply was modified 6 years, 7 months ago by pietlinden.
This reply was modified 6 years, 7 months ago by pietlinden.

DesNorton SSC-Insane Points: 24684 More actions · Answer 5

You can add GROUP BY to Piet's code to ensure that you get distinct data for each EID and BID. Also change the 3rd parameter of STUFF from 1 to 2, so that you don't need to TRIM the data

SELECT Party 
 , EIDs = STUFF((
                  SELECT ', ' + sd1.EID
                  FROM #SomeData sd1
                  WHERE sd1.Party = sd2.Party
                  GROUP BY sd1.EID
                  FOR XML PATH('')
                ), 1, 2, '')
 , BIDs = STUFF((
                  SELECT ', ' + sd1.BID
                  FROM #SomeData sd1
                  WHERE sd1.Party = sd2.Party
                  GROUP BY sd1.BID
                  FOR XML PATH('')
                ), 1, 2, '')
FROM #SomeData sd2
GROUP BY Party;

How to post data/code on a forum to get the best help.

jharvey6 76850 Old Hand Points: 381 More actions · Answer 6

Thanks. That works fine. But the one issue that remains is there is a duplicate value for Bids in the NF is duplicated and there should be only one. I avoided using a cursor but I am using the DUPS number as an index @max-2 value and testing through all of the values for duplicate and dropping past that.

Actually the requirement for the Bids was dropped and the Eid's look like this:

SELECT Party 
 , EIDs = LTRIM(STUFF((
            SELECT ', ' + EID
            FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
            FOR XML PATH('')
            ), 1, 1, ''))
/* , BIDs = LTRIM(STUFF((
   SELECT ', ' + BID
   FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
   FOR XML PATH('')
   ), 1,1, '')) */
FROM #SomeData sd2
GROUP BY Party;

DesNorton SSC-Insane Points: 24684 More actions · Answer 7

jharvey6 76850 wrote:

Thanks. That works fine. But the one issue that remains is there is a duplicate value for Bids in the NF is duplicated and there should be only one. I avoided using a cursor but I am using the DUPS number as an index @max-2 value and testing through all of the values for duplicate and dropping past that.
Actually the requirement for the Bids was dropped and the Eid's look like this:
SELECT Party 
 , EIDs = LTRIM(STUFF((
            SELECT ', ' + EID
            FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
            FOR XML PATH('')
            ), 1, 1, ''))
/* , BIDs = LTRIM(STUFF((
   SELECT ', ' + BID
   FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
   FOR XML PATH('')
   ), 1,1, '')) */
FROM #SomeData sd2
GROUP BY Party;

See my previous reply. Add GROUP BY to remove the duplicates

How to post data/code on a forum to get the best help.

pietlinden SSC Guru Points: 63640 More actions · Answer 8

This appears to work... but my eyesight is terrible, so test it:

SELECT Party 
 , EIDs = STUFF((
            SELECT DISTINCT ', ' + EID
            FROM #SomeData sd1
   WHERE sd1.Party = sd2.Party
            FOR XML PATH('')
            ), 1, 2, '')
, BIDs = STUFF((
SELECT DISTINCT ', ' + BID
FROM #SomeData sd1
WHERE sd1.Party = sd2.Party
FOR XML PATH('')
), 1,2, '')
FROM #SomeData sd2
GROUP BY Party;

This reply was modified 6 years, 6 months ago by pietlinden. Reason: wrong code!