It looks like it should be a Simple Select query... But its not!

  • Dear SQL Guru's

    I'd like to do this as a single SQL statement as it needs to be efficient as possible... I have included the SQL to make it easy to try out

    Many thanks in advance

    CREATE TABLE Table1 (

    Id1 numeric (18,0) PRIMARY KEY NOT NULL,

    Reference int NULL,

    Status int NULL)

    go

    INSERT INTO Table1 Values(15,2372,-1)

    go

    INSERT INTO Table1 Values(16,2373,-1)

    go

    INSERT INTO Table1 Values(17,2374,-1)

    go

    INSERT INTO Table1 Values(18,2375,-1)

    go

    INSERT INTO Table1 Values(19,2376,-1)

    go

    INSERT INTO Table1 Values(20,2377,-1)

    go

    INSERT INTO Table1 Values(21,2378,-1)

    go

    INSERT INTO Table1 Values(22,2379,-1)

    go

    INSERT INTO Table1 Values(23,2380, 1)

    go

    INSERT INTO Table1 Values(24,2380, 1)

    go

    INSERT INTO Table1 Values(25,2380, 1)

    go

    INSERT INTO Table1 Values(26,2380, 0)

    go

    INSERT INTO Table1 Values(27,2380, 0)

    go

    INSERT INTO Table1 Values(28,2380, 0)

    go

    INSERT INTO Table1 Values(29,2380, 0)

    go

    INSERT INTO Table1 Values(30,2380, 0)

    go

    INSERT INTO Table1 Values(31,2380, 0)

    go

    INSERT INTO Table1 Values(32,2381, 0)

    go

    INSERT INTO Table1 Values(33,2382, 0)

    go

    INSERT INTO Table1 Values(34,2382, 0)

    go

    INSERT INTO Table1 Values(35,2382, 0)

    go

    INSERT INTO Table1 Values(36,2383, 0)

    go

    These numbers are arbitary and while the Id1 is unique, Reference may be repeated many times with a Status field that contains -1, 0 or >= 1

    The full table once loaded looks like this:

    1> select Id1, Reference, Status from Table1

    2> go

    Id1 Reference Status

    -------------------- ----------- -----------

    15 2372 -1

    16 2373 -1

    17 2374 -1

    18 2375 -1

    19 2376 -1

    20 2377 -1

    21 2378 -1

    22 2379 -1

    23 2380 1

    24 2380 1

    25 2380 1

    26 2380 0

    27 2380 0

    28 2380 0

    29 2380 0

    30 2380 0

    31 2380 0

    32 2381 0

    33 2382 0

    34 2382 0

    35 2382 0

    36 2383 0

    (22 rows affected)

    What I would like is to execute two queries

    1> select FIRST-QUERY

    2> go

    Id1 Reference Status

    -------------------- ----------- -----------

    32 2381 0

    33 2382 0

    34 2382 0

    35 2382 0

    36 2383 0

    Where SUM(group-by Reference / Status = 0)

    Therefore for any row where Reference has a status > 0 must be excluded

    If I use

    1> select COUNT(Reference), Reference from Table1 Group By Reference having (SUM(Status) = 0)

    2> go

    Reference

    ----------- -----------

    1 2381

    3 2382

    1 2383

    But I need the Id1 returned... if I add Id1, I get an SQL error or the wrong result

    e.g.

    1> select Id1, COUNT(Reference), Reference from Table1 Group By Reference,Id1 having (SUM(Status) = 0)

    2> go

    Id1 Reference

    -------------------- ----------- -----------

    26 1 2380

    27 1 2380

    28 1 2380

    29 1 2380

    30 1 2380

    31 1 2380

    32 1 2381

    33 1 2382

    34 1 2382

    35 1 2382

    36 1 2383

    Doesn't return the correct result.

    Second Problem... looking for Reference where any of the corresponding status fields are > 0

    1> select SECOND-QUERY

    2> go

    Id1 Reference Status

    -------------------- ----------- -----------

    23 2380 1

    24 2380 1

    25 2380 1

    26 2380 0

    27 2380 0

    28 2380 0

    29 2380 0

    30 2380 0

    31 2380 0

    ie where SUM(group-by Reference / Status > 0)

    1> select COUNT(Reference), Reference from Table1 Group By Reference having (SUM(Status) > 0)

    2> go

    Reference

    ----------- -----------

    9 2380

    But doesn't return the Id... similar problem to the above

    Many thanks for your help in advance...

    Paul

  • Hi,

    I'm a little confused.

    If you return the ID's then your count will be affect because of the group by.

    Unless what you looking to do is a partition?

    Could you give me an example of what you would like you result set to look like when you have the correct query?

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • Hey, paul, it's me, Jiangfeng. I figured out the solution already. Try below, should be working....

    select t0.* from table1 t0 where

    not exists (

    select t2.id1 from table1 t2 where

    t0.reference = t2.reference

    and t0.reference!=t2.reference

    )

    and t0.status>0;

    If the first one does not work, try the second one, but will be a bit slow

    select * from table1 t0 where

    t0.id1 not in

    (

    select t1.id1 from table1 t1 inner join table1 t2 on

    t1.reference = t2.reference

    and t1.status!=t2.status

    )

    and t0.status>0;

    Try it, man. Good luck!

  • Hi Chris,

    Thank you kindly for your interest/reply...

    To make it clearer,

    I need the query to return a set of values which includes Id1 and Reference where the Status value = 0 and a second query where Status value > 0. If Id1 and Reference were both unique I could simply say

    Select Id1,Reference from Table1 where Status=0

    OR

    Select Id1,Reference from Table1 where Status>0

    Unfortunately however the entries for Reference are not unique

    There may be multiple instances of Reference with different status values, some zero and some with positive values.

    If the table contains

    23 2380 1

    24 2380 1

    25 2380 1

    26 2380 0

    27 2380 0

    28 2380 0

    29 2380 0

    30 2380 0

    31 2380 0

    32 2381 0

    33 2382 0

    34 2382 0

    35 2382 0

    36 2383 0

    I need the first query to return

    32 2381

    33 2382

    34 2382

    35 2382

    36 2383

    Since the Status values are all zero

    And the second query to return

    23 2380

    24 2380

    25 2380

    26 2380

    27 2380

    28 2380

    29 2380

    30 2380

    31 2380

    Since some of the status values for the same reference number are > 0

    I hope this helps - sincerely & thanks... Paul

  • Hi, Paul

    The first query can be like this

    select * from table1 t0

    where t0.status=0

    and not exists (

    select t2.id1 from table1 t2

    where

    t0.reference = t2.reference

    and (t0.status+t2.status)!=0

    )

    Not sure it is the data you want or not. Please try...

  • The second query can be

    select * from table1 t0

    where t0.status>=0

    and exists (

    select t2.id1 from

    table1 t2 where

    t0.reference = t2.reference

    and (t2.status+t0.Status) >0

    )

    See how it works

    🙂

  • SELECT t1.ID1, t1.Reference

    FROM Table1 t1

    INNER JOIN (

    SELECT Reference

    FROM Table1

    GROUP BY Reference

    HAVING SUM(Status) = 0

    ) t2

    ON t1.Reference = t2.Reference

    SELECT t1.ID1, t1.Reference

    FROM Table1 t1

    INNER JOIN (

    SELECT Reference

    FROM Table1

    GROUP BY Reference

    HAVING SUM(Status) > 0

    ) t2

    ON t1.Reference = t2.Reference

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John's solution should run faster since it's using a derived table concept instead of a correlated sub-query.

    CSQ's (meaning - the sub query uses elements from the outer query to run) are slow since they have to be re-run once for each row in the outer query. The derived table concept (where the sub-query is "static", and the results are used as a static table) tends to be much faster, since it is run once only and then join operations take over the matching (which is a lot more efficient).

    Of course - on really small datasets - the difference will be so small you might not notice the difference. A healthy-sized set of data will tend to start plowing a lot faster with CSQ than derived tables.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Excellent - works perfectly! The table will expand to several thousand records and the query executed every few seconds or so therefore efficiency is very important.

    Thank you kindly to ALL for your help and detailed explanation - Your help is truly appreciated.

    Best regards, Paul.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply