Challenge for SQL Gurus: Grouping and outer joins

  • How about ?

    select

    group1, group2, xref, sum(value1), sum(value2)

    from

    (select x.Xref, d.group1, d.group2, d.value1, d.value2

     from @x x left join @data d on x.xref = d.xref) Subq

    group by

    group1,

    group2,

    xref

     


    * Noel

  • create table #xref ( xref int, name nvarchar(32))

    create table #data (group1 nvarchar(32),group2 nvarchar(32),xref int,value1 int,value2 int)

    insert into #xref (xref, name) values (1, 'X1')

    insert into #xref (xref, name) values (2, 'X2')

    insert into #xref (xref, name) values (3, 'X3')

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 100, 200)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 1, 101, 201)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2a', 2, 102, 202)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1a', 'Group2b', 1, 103, 203)

    insert into #data (group1, group2, xref, value1, value2) values ('Group1b', 'Group2a', 3, 104, 204)

    select Groups.group1 , Groups.Group2 , #xref.xref , #xref.name

    , COALESCE( GroupSum.value1 ,0 ) as Value1 , COALESCE( GroupSum.value2,0 ) as Value2

    from (select distinct group1, group2

    from #data

    ) as Groups (group1, group2)

    cross join #xref

    left outer join

    (select group1, group2, xref, sum(value1), sum(value2)

    from#data

    group by group1, group2, xref

    ) as GroupSum ( group1, group2, xref, value1, value2 )

    on GroupSum.group1= Groups.group1

    and GroupSum.group2= Groups.group2

    and GroupSum.xref= #xref.xref

    SQL = Scarcely Qualifies as a Language

  • OOPS!

    My previous post was wrong! Didn't read the isntructions correctly

     


    * Noel

  • WOW... and I thaught I was starting to know something about t-sql.

    Where did you learn to do that?? (I know it's just a little trick but still it's not that obvious to do).

  • Wow! I am impressed!

    So soon!

    I'll start "digesting" what you submitted.

    Would you be so kind in explaining your solution?

    I knew it was possible to do this. It only took one guru like you to do it!

    Thank you very much!

  • Ah! the grouping problems... always love them.

    I also taught I knew a lot about SQL but there is always something new to learn lurking around the corner.

  • Solution :

    Select the distinct groups

    cross join to the ref so you get all possible combinaisons

    left join that to the calculated sum of your first partial solution

    isnull() the missing columns with 0 and that's your resultset.

  • Unless I am mistaken, Carl is given credit in Joe Celko's "SQL for Smarties".

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • I really got to get that book...

    and read it eventually .

  • Do this! I certainly have learned a lot from it. And I for myself, enjoy Joe's style of writing.

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Ya the least we can say is that he has a colorfull style .

  • The real stump the SQL Server guru question is to ask him the correct syntax for a cursor...

    OK, I have to stop now, starting to sound like a broken record.

  • Hi guys,

    Try this one (simplified version):

    select x.group1,x.group2,x.xref,sum(d.value1), sum(d.value2) from

    (select distinct group1,group2,x.xref from @xref x,@data d) as X

    LEFT OUTER JOIN @data d

    on x.[group1] = d.[group1] and x.[group2]=d.group2 and x.xref = d.xref

    group by x.group1,x.group2,x.xref

    order by x.group1,x.group2,x.xref

    Logic is same as (cross join on @xref and @data and a OUTER JOIN with @data).

     


    Kindest Regards,

    Hari

  • Hi guys a quick question on the posts by Carl and Hariharan. First Carl's post uses a cross join without a where statement which will produce a Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. On a large data set this would not seem to be desirable. Hariharan's post would seem to be the better option. I have not had a chance to test either option and will not until much later in the day. (But I will have time to check back to the forum ) so my question is which option is most efficient? I apologize for asking without giving this more thought but it will drive me crazy (a short trip) until I can resolve this and I have to leave for work? I would not mind being unemployed if the pay was not so bad.

     

    Thanks

    Mike

  • I checked the execution plans for each query, the first one by Carl has a cost of 49.92% vs 50.08% for Hariharan. But without the real indexed data it's hard to say which is really the fastest.

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply