Group By and JOIN within SQL Server (when dealing with multiple columns)

  • I have 2 tables, df1, and df2. I would like to join the two tables and then perform a GROUP BY aggregation.  I am unsure if I am performing the groupby correctly because my values are not adding up correctly as desired.

    DATA:

    df1

    host purpose verge name    date
    123 sub hello hello1 1/1/2021
    123 sub peace peace1 1/1/2021
    123 sub love love1 1/1/2021
    123 sub hey hey1 1/1/2021
    455 dem no no1 1/1/2021
    455 dem corp corp1 1/1/2021

    df2

    pc  cap free
    123 10 2
    123 20 1
    123 10 2
    123 5 1
    455 5 1
    455 4 3


    DESIRED

    host date     pc  cap free purpose 
    123 1/1/2021 123 45 6 sub
    455 1/1/2021 455 9 4 dem

    DOING

     

    SELECT df1.host, 
    df1.date,
    df2.sum(cap),
    df2.sum(free),
    df1.purpose,
    df2.pc
    FROM df1
    JOIN df2 ON
    df1.host = df2.pc
    GROUP BY df1.purpose, df1.host, df1.date, df2.pc

    I get this error:

    column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

    However, I am not getting the desired output and my values do not add up. Any suggestion is appreciated

    • This topic was modified 3 years, 2 months ago by  lynhudson1.
  • Maybe as below.  No directly usable data so I can't test it.  This code takes advantage of the fact that df1.host is known to be equal to df2.pc.

    SELECT df1.host, 
    df1.date,
    sum(df2.cap) AS cap,
    sum(df2.free) AS free,
    df1.purpose,
    df1.host AS pc
    FROM df1
    JOIN df2 ON
    df1.host = df2.pc
    GROUP BY df1.purpose, df1.host, df1.date

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply