Looking For Duplication

  • Hi all - I have an issue I've been fighting with and can't seem to come up with a solution. Basically here is the pseudo code:

    For each RecordID in tmpEvalRecs I need to see if any other order records in tmpOrders have duplication based on ProductID, OrderDate, ProductClass, and ProductType.

    1. Gather the recordid from tmpEvalRecs

    2. Gather ProductID, OrderDate, ProductClass, and ProductType from tmpOrders where recordid = recordid from above

    3. Look for any other orders that have the same line items based on the fields above, in the same order (by orderdate).

    4. I need the original recordid and duplicate RecordID returned

    Any assistance would be greatly appreciated. I'm trying to avoud itterating through the records one by one.

    Here is some table scripts and data:

    CREATE TABLE tmpEvalRecs

    (

    RecordID INT,

    );

    CREATE TABLE tmpOrders

    (

    RecordID INT,

    ProductID VARCHAR(10),

    OrderDate DATETIME,

    ProductClass CHAR(1),

    ProductType VARCHAR(20),

    ProductRep VARCHAR(20),

    ReferralNumber INT

    );

    INSERT INTO tmpEvalRecs

    SELECT

    1

    UNION SELECT

    3

    UNION SELECT

    4;

    INSERT INTO tmpOrders

    SELECT

    1,

    'PD101',

    '07/01/2009',

    'C',

    'WidgetA',

    'Sally',

    21

    UNION SELECT

    1,

    'PD102',

    '07/02/2009',

    'B',

    'WidgetB',

    'Sally',

    21

    UNION SELECT

    1,

    'PD103',

    '07/03/2009',

    'A',

    'WidgetC',

    'Sally',

    21

    UNION SELECT

    2,

    'PD101',

    '07/01/2009',

    'C',

    'WidgetA',

    'Sally',

    21

    UNION SELECT

    2,

    'PD102',

    '07/02/2009',

    'B',

    'WidgetB',

    'Sally',

    21

    UNION SELECT

    2,

    'PD103',

    '07/03/2009',

    'A',

    'WidgetC',

    'Sally',

    21

    UNION SELECT

    3,

    'PD101',

    '07/01/2009',

    'C',

    'WidgetA',

    'Sally',

    21

    UNION SELECT

    3,

    'PD105',

    '07/08/2009',

    'B',

    'WidgetB',

    'Sally',

    21

    UNION SELECT

    4,

    'PD101',

    '07/01/2009',

    'C',

    'WidgetA',

    'Sally',

    21

    UNION SELECT

    5,

    'PD101',

    '07/01/2009',

    'C',

    'WidgetA',

    'Sally',

    21;

  • What is your expected output?

    See if this works for you:SELECT

    DISTINCT O2.*

    FROM

    tmpOrders AS O1

    INNER JOIN

    tmpOrders AS O2

    ON O1.RecordID O2.RecordID

    AND O1.ProductID = O2.ProductID

    AND O1.OrderDate = O2.OrderDate

    AND O1.ProductClass = O2.ProductClass

    AND O1.ProductType = O2.ProductType

    LEFT OUTER JOIN

    tmpEvalRecs AS Rec

    ON O2.RecordID = Rec.RecordID

    WHERE

    Rec.RecordID IS NULL

  • ;WITH CTEs

    AS (

    SELECTO.ProductID,

    O.OrderDate,

    O.ProductClass,

    O.ProductType,

    DENSE_RANK() OVER (PARTITION BY O.ProductID,O.OrderDate,O.ProductClass,O.ProductType ORDER BY NEWID()) 'Duplicate'

    FROMtmpEvalRecs R(NOLOCK) JOIN tmpOrders O(NOLOCK)

    ON(R.recordid = O.recordid)

    )

    SELECT * FROM CTEs WHERE Duplicate > 1

    Note:

    = 1 (Single Occurance)

    >1 (Multiple Occurance)

    Cheers

  • If the you looking for Duplicates based on productids then this should work:

    select COUNT(tmpOrders.productid) as duplicates,tmpOrders.productid,tmpOrders.productclass,tmpOrders.OrderDate,tmpOrders.ProductType

    from tmpOrders

    group by tmpOrders.productid,tmpOrders.productclass,tmpOrders.OrderDate,tmpOrders.ProductType

    having COUNT(tmpOrders.productid)>1

  • i think u needed the recordid as well,,,

    with CTE

    as

    (

    select *, ROW_NUMBER() over (PARTITION by productid,productclass,producttype,orderdate order by orderdate) as rn

    from tmpOrders

    )

    select * from CTE where rn>1;

  • First - Thank you to all that spent time working on a solution.

    Lamprey13's solution is hitting on the correct records. If you look at the tmpOrders data sorted by recordid and orderdate, recordid 2 is a match for recordid 1 based on ProductID,OrderDate,ProductClass, and ProductType. Also, recordid 5 is a match for recordid 4 based on the same fields sorted by orderdate.

    My next challenge is to include both the matched data (as the query is doing now) and the data from the recordid being evaluated.

    Ex: If recordid 2 is found as a match to recordid 1, return the order details for both record id's. Is this possible?

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply