Partition by used to count records (to find dupes)

  • I'm getting an error in the following query, can anyone help with? Of course ConcatCount is not allowed in the last two rows:

    SELECT [DOMAIN],[DATE], [TOTAL TIME], COUNT([AGENT NAME] + '    ' + [TOTAL TIME]) 
    Over (PARTITION BY([AGENT NAME] + ' ' + [TOTAL TIME])) AS ConcatCount

    FROM [a2hr].[dbo].[stgOneConvoHrs] where date > '2021-04-18'
    group by ConcatCount
    having ConcatCount > 1

     

  • First step would be to tell us the error as the error likely says how to fix it.

    My GUESS is that you are getting a "invalid column name" error which makes sense as you are trying to group by a column ALAIS, not a column.

    So, to do what you want, you are going to need to either put it into a CTE OR put it into a nested select.  Something along the lines of:

    WITH cte AS (
    SELECT [DOMAIN],[DATE], [TOTAL TIME], COUNT([AGENT NAME] + ' ' + [TOTAL TIME])
    Over (PARTITION BY([AGENT NAME] + ' ' + [TOTAL TIME])) AS ConcatCount

    FROM [a2hr].[dbo].[stgOneConvoHrs] where date > '2021-04-18'
    )
    SELECT *
    FROM cte
    group by ConcatCount
    having ConcatCount > 1

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Okay this is confusing. You're using both a windowing function and GROUP BY.  I don't think that's right. If all you're doing is looking for duplicates, the simple pattern is like this:

    USE tempdb;
    go

    SELECT *
    FROM
    (
    SELECT Subjects.Person
    ,Subjects.Subj
    ,rn = ROW_NUMBER() OVER (PARTITION BY Subjects.Person, Subjects.Subj
    ORDER BY Subjects.Person, Subjects.Subj)
    FROM (VALUES ('John','Math'),('John','History'),('John','Math'),
    ('Fred','Math'),('Fred','History'),('Fred','Biology')) Subjects(Person,Subj)
    ) dupes
    WHERE dupes.rn > 1;

    The combination of Person and Total Time is strange. What are you trying to accomplish?

  • I'm getting this error from the CTE solution (which is on the right track I think): Column 'cte.DOMAIN' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

     

  • This idea worked... and is simple... thanks!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply