Duplicate record count

  • I have duplicate records in table.I need to count duplicate records based upon Account number and count will be stored in a variable.i need to check whether count > 0 or not in stored procedure.I have used below query.It is not working please help

    SELECT @_Stat_Count= count(*),L1.AcctNo,L1.ReceivedFileID from Legacy L1,Legacy L2,ReceivedFiles where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo group by L1.AcctNo,L1.ReceivedFileID having Count(*)> 0

    IF (@_Stat_Count >0)

    BEGIN

    SELECT @status = status_cd from status-table where status_id = 10

    END

  • Your SELECT statement could return more than one row (group) so it doesn't make to much sense to store the count into a variable because you do not know which one are you getting from the pull. Also, if there is no group matching the filter then the value of the variable will be the value before entering the execution of the aggregated query, which could be the NULL mark.

    You could use the EXISTS operator in this case.

    IF EXISTS (

    SELECT 1 AS dc

    FROM Legacy L1,Legacy L2,ReceivedFiles

    WHERE L1.ReceivedFileID = ReceivedFiles.ReceivedFileID AND L1.AcctNo=L2.AcctNo

    GROUP by L1.AcctNo, L1.ReceivedFileID

    HAVING COUNT(*)> 0

    )

    SELECT @status = status_cd

    FROM status-table

    WHERE status_id = 10;

  • SELECT @_Stat_Count = COUNT(*) -- count the dupesets

    FROM (

    SELECT n = 1 -- any output will do here: a row corresponds to a dupeset

    FROM Legacy L1

    INNER JOIN Legacy L2

    ON L1.AcctNo=L2.AcctNo

    INNER JOIN ReceivedFiles

    ON L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    GROUP BY L1.AcctNo, L1.ReceivedFileID

    HAVING COUNT(*) > 1 -- more than one row in the aggregate

    ) d

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • i could not understand below query.when i execute below query i am getting errors.

    SELECT COUNT(*) -- count the dupesets

    FROM (

    SELECT n = 1 -- i dont understand this statement

    FROM Legacy_crfcard L1

    INNER JOIN Legacy L2 ON L1.AcctNo=L2.AcctNo

    INNER JOIN Legacy L2 ON L1.ValidFrom=L2.ValidFrom

    INNER JOIN Legacy L2 ON L1.ValidTo=L2.ValidTo

    INNER JOIN ReceivedFiles ON L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    where L1.stat_cd = 100

    and L2.stat_cd <>181

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    GROUP BY L1.AcctNo, L1.ReceivedFileID

    HAVING COUNT(*) > 1 -- more than one row in the aggregate

    )

    i need to put count of duplicate record in a variable.Please help as i am new to SP

  • pilla.sree85 (7/17/2014)


    i could not understand below query.when i execute below query i am getting errors.

    SELECT COUNT(*) -- count the dupesets

    FROM (

    SELECT n = 1 -- i dont understand this statement

    FROM Legacy_crfcard L1

    INNER JOIN Legacy L2 ON L1.AcctNo=L2.AcctNo

    INNER JOIN Legacy L2 ON L1.ValidFrom=L2.ValidFrom

    INNER JOIN Legacy L2 ON L1.ValidTo=L2.ValidTo

    INNER JOIN ReceivedFiles ON L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    where L1.stat_cd = 100

    and L2.stat_cd <>181

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    GROUP BY L1.AcctNo, L1.ReceivedFileID

    HAVING COUNT(*) > 1 -- more than one row in the aggregate

    )

    i need to put count of duplicate record in a variable.Please help as i am new to SP

    I can't understand it either - you've joined the Legacy table three times (the error is from using the same table alias for each).

    Are the dupes in a single table or are they a product of the first query you posted?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • My Query is i want to pull duplicate records from the table and get the count of each duplicate record from the table.i wrote below query

    IF EXISTS(

    SELECT L1.AcctNo,L1.ReceivedFileID from Legacy_crfcard L1,Legacy_crfcard L2,ReceivedFiles

    where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    group by L1.AcctNo,L1.ReceivedFileID having Count(*)>0

    )

    BEGIN

    SELECT @status =status from status_table where status_Id = 102

    END

    its giving null result eventhough table having duplicate records.

  • pilla.sree85 (7/17/2014)


    My Query is i want to pull duplicate records from the table and get the count of each duplicate record from the table.i wrote below query

    IF EXISTS(

    SELECT L1.AcctNo,L1.ReceivedFileID from Legacy_crfcard L1,Legacy_crfcard L2,ReceivedFiles

    where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    group by L1.AcctNo,L1.ReceivedFileID having Count(*)>0

    )

    BEGIN

    SELECT @status =status from status_table where status_Id = 102

    END

    its giving null result eventhough table having duplicate records.

    Where you are having NULL ?

    1- EXISTS statement is on the whole result, which mean any kind of duplication is found this IF statement will be executed.

    2- What is the reference of @status variable? as far as i know you want to get the Duplicate rowcount.

    Please share sample record and your desire output so that we can help you in this regard as this is getting very confusing.

  • pilla.sree85 (7/17/2014)


    My Query is i want to pull duplicate records from the table and get the count of each duplicate record from the table.i wrote below query

    IF EXISTS(

    SELECT L1.AcctNo,L1.ReceivedFileID from Legacy_crfcard L1,Legacy_crfcard L2,ReceivedFiles

    where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    group by L1.AcctNo,L1.ReceivedFileID having Count(*)>0

    )

    BEGIN

    SELECT @status =status from status_table where status_Id = 102

    END

    its giving null result eventhough table having duplicate records.

    I understand that you are attempting to get a count of dupes and that the query isn't working for you. That part is easy enough to fix. The problem is - we don't understand exactly what it is you are counting because, by joining tables together in your query, you may be introducing cardinality changes which will affect the dupe count.

    Is there any reason why you shouldn't count the dupes using just table Legacy_crfcard?

    SELECT AcctNo, COUNT(*)

    FROM Legacy_crfcard

    GROUP BY AcctNo

    HAVING COUNT(*) > 1

    This will output the accounts in table Legacy_crfcard which have more than one row per account.

    This modification to the query will count the rows for you:

    SELECT COUNT(*)

    FROM (

    SELECT AcctNo, COUNT(*)

    FROM Legacy_crfcard

    GROUP BY AcctNo

    HAVING COUNT(*) > 1

    ) d

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • SELECT

    L1.AcctNo,

    L1.ReceivedFileID

    FROM Legacy_crfcard L1

    -- Why this? If you have any dupes, it will double the number of rows output.

    -- That's ALL it will do - slow up your query.

    INNER JOIN Legacy_crfcard L2

    ON L1.AcctNo = L2.AcctNo

    INNER JOIN ReceivedFiles r

    ON L1.ReceivedFileID = r.ReceivedFileID

    WHERE L1.MarketCode NOT IN (

    SELECT m.MarketCode FROM Markets m WHERE m.AllowDupes IN ('1', 'y', 'Y')

    )

    GROUP BY L1.AcctNo, L1.ReceivedFileID

    -- HAVING COUNT(*) > 0 will output ALL rows.

    -- Don't you only want dupes on L1.AcctNo/L1.ReceivedFileID?

    -- You filter for dupes with HAVING COUNT(*) > 1

    HAVING COUNT(*) > 0

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (7/17/2014)


    I understand that you are attempting to get a count of dupes and that the query isn't working for you. That part is easy enough to fix. The problem is - we don't understand exactly what it is you are counting because, by joining tables together in your query, you may be introducing cardinality changes which will affect the dupe count.

    Is there any reason why you shouldn't count the dupes using just table Legacy_crfcard?

    +1

  • SELECT count(L1.AcctNo) totalcount,L1.AcctNo,L1.ReceivedFileID from Legacy L1,Legacy L2,ReceivedFiles

    where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    group by L1.AcctNo,L1.ReceivedFileID having Count(*)>0

    when i execute above query it is showing duplicate records count as total count for each account number based upon recieved file id.

    i need to put total count in a variable for futher checking .if totalcount is >0 i need to execute some query.

    My requirement is

    if any new record before inserting in a database i need to check the condition whether duplicate record is present or not in DB .

  • If duplicate record count >0

    i need to execute below staement

    SELECT @status =status from status_table where status_Id = 102

    desired output would be @status is Duplicate

  • pilla.sree85 (7/17/2014)


    SELECT count(L1.AcctNo) totalcount,L1.AcctNo,L1.ReceivedFileID from Legacy L1,Legacy L2,ReceivedFiles

    where L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    and L1.AcctNo=L2.AcctNo

    and L1.MarketCode NOT IN (SELECT MarketCode from Markets where (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    group by L1.AcctNo,L1.ReceivedFileID having Count(*)>0

    when i execute above query it is showing duplicate records count as total count for each account number based upon recieved file id.

    i need to put total count in a variable for futher checking .if totalcount is >0 i need to execute some query.

    My requirement is

    if any new record before inserting in a database i need to check the condition whether duplicate record is present or not in DB .

    The above query is broken. Before you can put a total count in a variable for further checking or whatever, you need to fix the above query so it returns the correct result. Yes? If yes, then we'll help you correct the broken query. Then we can help you to process the dupe count.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • yes need to fix the baove query

  • pilla.sree85 (7/17/2014)


    yes need to fix the baove query

    -- This is the BASELINE QUERY. It's your starting point.

    -- I've formatted it and added a new column [RowType] to help you to understand what's happening.

    -- Run it and investigate the output.

    -- Comment out the HAVING COUNT(*) > 0 line with two hyphens, same as this line,

    -- and run it again. Investigate the output. It should be the same as the last run.

    -- Uncomment HAVING COUNT(*) > 0 and change it to HAVING COUNT(*) > 1.

    -- run it and investigate the output. Report back what you see.

    SELECT

    COUNT(L1.AcctNo) totalcount,

    [RowType] = CASE

    WHEN COUNT(*) = 1 THEN 'Not a dupe'

    WHEN COUNT(*) > 1 THEN 'AcctNo duplicate'

    ELSE 'Nonsense' END,

    L1.AcctNo,

    L1.ReceivedFileID

    FROM Legacy L1,

    Legacy L2,

    ReceivedFiles

    WHERE L1.ReceivedFileID = ReceivedFiles.ReceivedFileID

    AND L1.AcctNo=L2.AcctNo

    AND L1.MarketCode NOT IN (SELECT MarketCode FROM Markets WHERE (AllowDupes = '1' OR UPPER(AllowDupes) = 'Y' ))

    GROUP BY L1.AcctNo, L1.ReceivedFileID

    HAVING COUNT(*) > 0

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply