Find Similar rows and add totals together.

  • Hello all,

    I have a question and just cant figure it out.

    For example I have a table with 5 columns A, B, C, D and E

    I would like to find the rows that all have identical Values in A, B and C and return the SUM(D) and MIN(E)

  • Is this a homework q?

    Look at GROUP BY clause ... it will provide what you need.

    Scott Pletcher, SQL Server MVP 2008-2010

  • LoL.. this is not home work... well i guess it is since its for work, but i am at home. 🙂

    to clarify a little:

    we have a table in our DB that keeps stats of the number of transactions.

    the stats are stored in the table every hour and keep track of Who, Where, why, number/hour and the rowID, time

    these hourly stats are flooding our system and we are looking to reduce them to daily stats.

    so the ultimate plan is to take the grouping of (who, where and why) A.K.A. A, B and C

    and total the number/hour (D) and the min rowID (E) for each grouping. i can deal with the time.

    ---------------------------

    you are right i got the results needed by using the Group By clause

    , for some reason i got lost in the Over(partition by.... ) stuff

    here is what i used

    Select A, B, C, sum(D) as Total, min(E) as MinID

    from statsTable

    group by A, B, C

    now here is what i believe to be the tricky part.

    I need to take these results and save the Total - sum(E) to the row with the MinID

    and delete the rows that are not the minID for each group.

  • Looks like you were on the right track at the start - does this help?

    ;WITH RowPicker AS (

    SELECT A, B, C, D, E,

    Total = SUM(D) OVER (PARTITION BY A, B, C),

    MinID = MIN(E) OVER (PARTITION BY A, B, C)

    FROM statsTabl

    )

    SELECT *

    FROM RowPicker WHERE MinID = E

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • thanks for your iput on that..

    i am currently working with:

    Select A, B, C, sum(D) as Total, min(E) as MinID

    from statsTable

    group by A, B, C

    as this query retuns exactly what i want my final data to ook like.

    the issue now, is to make the data in the 'Statstable' end up exactly like the data the query returns.

    so i am not sure how to modify the T-SQL to end up like that.

    bassically i need to take the datafrom the above query and:

    for each row returned in the above query i need to

    update the row where E = Min(E) and set D with the value of SUM(D)

    delete all rows not equal to minID's

  • leroy-1092048 (12/1/2010)


    ...

    update the row where E = Min(E) and set D with the value of SUM(D)

    delete all rows not equal to minID's

    ...

    So...how will you identify those rows? The query I posted above provides this information.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Barring some type of "fuzzy" update, the fastest way is probably to save your results as a temporary table, then use that table to UPDATE the original data and DELETE the unneeded rows.

    For example:

    IF OBJECT_ID('tempdb.dbo.#totals') IS NOT NULL

    DROP TABLE tempdb.dbo.#totals

    SELECT A, B, C, SUM(D) AS Total, MIN(E) AS MinID, MAX(E) AS MaxID

    INTO #totals

    FROM tablename

    GROUP BY A, B, C

    UPDATE tablename

    SET D = t.Total

    FROM tablename

    INNER JOIN #totals t ON

    t.A = tablename.A AND t.B = tablename.B AND t.C = tablename.C AND

    t.MinID = tablename.E

    DELETE FROM tablename

    FROM tablename

    INNER JOIN #totals t ON

    t.A = tablename.A AND t.B = tablename.B AND t.C = tablename.C AND

    t.MinID <> tablename.E AND tablename.E <= t.MaxID

    Scott Pletcher, SQL Server MVP 2008-2010

  • scott.pletcher (12/2/2010)


    Barring some type of "fuzzy" update, the fastest way is probably to save your results as a temporary table, then use that table to UPDATE the original data and delete the unneeded rows.

    For example:

    IF OBJECT_ID('tempdb.dbo.#totals') IS NOT NULL

    DROP TABLE tempdb.dbo.#totals

    SELECT A, B, C, SUM(D) AS Total, MIN(E) AS MinID, MAX(E) AS MaxID

    INTO #totals

    FROM tablename

    GROUP BY A, B, C

    UPDATE tablename

    SET D = t.Total

    FROM tablename

    INNER JOIN #totals t ON

    t.A = tablename.A AND t.B = tablename.B AND t.C = tablename.C AND

    t.MinID = tablename.E

    DELETE FROM tablename

    FROM tablename

    INNER JOIN #totals t ON

    t.A = tablename.A AND t.B = tablename.B AND t.C = tablename.C AND

    t.MinID <> tablename.E AND tablename.E <= t.MaxID

    What happens when two rows share the same minimum value for 'E'?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I thought E was an id value and thus unique. Maybe I'm wrong there.

    Scott Pletcher, SQL Server MVP 2008-2010

  • scott.pletcher (12/2/2010)


    I thought E was an id value and thus unique. Maybe I'm wrong there.

    Maybe not - there's not enough info.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris ad scott,

    let me test what you two have posted and i will give you an update.

    E is a time stamp for each hour of the day. so there are no duplicates of E.

    so the idea here is to take each time stamp for the day (upto 24 of them) and take the sum(D), that is the transaction count for each hour and reduce it down to daily totals instead of the hourly totals... 1 entry per day. reducing the storage needs and rowcount... especially on older data.

    the reason for the Min(E) on the time stamp is because if a unique (user, location, action --- A, B, C) did not happen then there would be no entry that hour... so i could not assume that each grouping would have a 00:00:00 time stamp

    but i really appreciate your help and you have given me a lot of ammo here to continue. Thank you.

    I will keep you posted and also if all works well i will post the final code that was used.

    Leroy L

  • Thanks for the explanation, that makes sense.

    Scott Pletcher, SQL Server MVP 2008-2010

  • @leroy - thanks.

    @scott - your method will almost certainly perform better than mine, with two analytical functions.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply