Merge count by grouping variables back into source table

Question

Merge count by grouping variables back into source table

Scott In Sydney

Hall of Fame

Points: 3331
More actions
July 15, 2014 at 11:00 pm

#397062

Summary: I need to identify which records are violating a unique index to get the source file fixed.
Details: Apologies, I know this should be simple, and I've Googled before posting. I'm used to a different version of SQL where this approach would work:
select t2.cnt, t1.*
from #t1 t1
left join
(
select caseid, claimnum, linenum, invoicenum, batchnum, count(*) as cnt
from #t1
group by caseid, claimnum, linenum, invoicenum, batchnum
) t2
on t1.caseid=t2.caseid
and t1.claimnum=t2.claimnum
and t1.linenum=t2.linenum
and t1.invoicenum=t2.invoicenum
and t1.batchnum=t2.batchnum
having t2.cnt > 1
I need a count by the grouping (index) variables, then re-join that number into the source data, then filter for count > 1. If my source file is 10K records, the output should be 10K records. I need all source columns in the output to analyze if it really is a duplicate record, or if another column needs to be added to the index.
The above query returns NULL for cnt for all records.
If you want a post with sample data to work with just let me know; I thought the above sufficient to explain the issue.
Thanks...

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply

Hunterwood SSCrazy Points: 2074 More actions · Answer 1

Scott In Sydney (7/15/2014)
Summary: I need to identify which records are violating a unique index to get the source file fixed.
Details: Apologies, I know this should be simple, and I've Googled before posting. I'm used to a different version of SQL where this approach would work:
select t2.cnt, t1.*
from #t1 t1
left join
(
select caseid, claimnum, linenum, invoicenum, batchnum, count(*) as cnt
from #t1
group by caseid, claimnum, linenum, invoicenum, batchnum
) t2
on t1.caseid=t2.caseid
and t1.claimnum=t2.claimnum
and t1.linenum=t2.linenum
and t1.invoicenum=t2.invoicenum
and t1.batchnum=t2.batchnum
having t2.cnt > 1
I need a count by the grouping (index) variables, then re-join that number into the source data, then filter for count > 1. If my source file is 10K records, the output should be 10K records. I need all source columns in the output to analyze if it really is a duplicate record, or if another column needs to be added to the index.
The above query returns NULL for cnt for all records.
If you want a post with sample data to work with just let me know; I thought the above sufficient to explain the issue.
Thanks...

Hi,

your idea is correct; you have just made a "typo".

You perform the grouping in the nested query and not in the main query. Thus you can't use HAVING in the main query.

So: Just change the last row of your query to use a WHERE instead and it will work properly.

Good luck!

/M