Combining rows in table with 3m rows

  • 1. Move duplicates to temp tables.

    2. Delete values from base tables.

    3. Insert from temp tables into base tables and use SUM and GROUP BY to perform your roll up.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • Are you inserting all 3 million records into a temp table?

    If so, how about inserting ONLY records into a temp table having count(*) > 1 (after the grouping on the main table of course)?

    E.g. Susan, 22 from your example

    select someName, sum(age) into myTempTable

    from mainTable

    group by someName having count(*) > 1

  • I just started trying that (always think of an answer right after I ask for help) so thanks for confirming I'm on the right track.  Much appreciated!

  • Unless I'm missing something, everything about the dupe rows is duped except for the column that you want summed. If that's the case, why not just SELECT INTO a new table, summing the one column, and grouping by the rest of the columns. Unless you have a ton of things tied to this table, such as low level permissions, triggers, indexes, etc.

    To use your example:

    SELECT

      NameField

     ,Sum(NumberField)

    INTO

     NewTableName

    FROM

     OldTableName

    GROUP BY

     NameField

    You'll obviously need to replace your old table with your new one, apply any indexes, triggers, permissions that were affected, but it would do the trick quickly.

    If you go down this path, save the old table until you have fully verified the new one.

  • -- It may be a good Idea to create an index on the Grouping/joining Column

    -- create index idx_someName on SomeTame(someName)

    -- go

    select someName, sum(age) X

    into #tmp

    from mainTable

    group by someName

    having count(*) > 1

     

    delete d

    from SomeTable d

         join

         #Tmp t on d.someName = t.someName

    insert into someTable( SomeName, age )

    select somename, X

    from #Tmp

    drop table #tmp

    -- Drop the index if you created it

    -- drop index SomeTable.idx_someName

    -- go


    * Noel

Viewing 5 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply