Compare multiple rows in same column

  • Hi,

    I was hoping to get some thoughts on the best approach for a query. It would probably be easier to show a scaled version of my dataset. The real columns are varchar(50), but the below will suffice (feel free to replace the select *... section with any suggestions on the best way to return the result set).

    Below, I need to return all Persons with 1 and 2 but no 3. I clarify which Persons (letter in this case) should be returned in the comment section.

    --return all Persons with 1 and 2 but no 3

    /*

    a - Has all three

    b - Has 1 and 2 but no 3 (RETURN)

    c - Has 1

    d - Has 1 and 2 but no 3 (RETURN)

    e - Has 2 and 3 but no 1

    f - Has 3

    */

    with t as (

    select 1 as Comp, 'a' as Person union all

    select 2 as Comp, 'a' as Person union all

    select 3 as Comp, 'a' as Person union all

    select 1 as Comp, 'b' as Person union all

    select 2 as Comp, 'b' as Person union all

    select 1 as Comp, 'c' as Person union all

    select 1 as Comp, 'd' as Person union all

    select 2 as Comp, 'd' as Person union all

    select 2 as Comp, 'e' as Person union all

    select 3 as Comp, 'e' as Person union all

    select 3 as Comp, 'f' as Person

    )

    select *

    from t

    Does anyone have any thoughts or suggestions? The only thing I came up with was to pivot 1, 2, and 3 as columns in a subquery so I could use a simple WHERE clause. However, I feel like there has to be a more efficient way to accomplish this.

    If I need to generate a create that would create a real table with my sample data please let me know and I'll be glad to do that.

  • Check the following post: http://www.sqlservercentral.com/Forums/FindPost1275453.aspx

    You could follow the whole discussion as well.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • assuming you have a guarantee that only 1 and 3 or the other possible values and you can't have more than one of either it is simply this:

    select person

    from table

    where comp in (1,3)

    group by person

    having count(*) = 2

    Updated to change > to = ...

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • What you originally proposed isn't that complicated...

    with t as (

    select 1 as Comp, 'a' as Person union all

    select 2 as Comp, 'a' as Person union all

    select 3 as Comp, 'a' as Person union all

    select 1 as Comp, 'b' as Person union all

    select 2 as Comp, 'b' as Person union all

    select 1 as Comp, 'c' as Person union all

    select 1 as Comp, 'd' as Person union all

    select 2 as Comp, 'd' as Person union all

    select 2 as Comp, 'e' as Person union all

    select 3 as Comp, 'e' as Person union all

    select 3 as Comp, 'f' as Person

    )

    SELECT

    Person

    FROM (

    SELECT

    person,

    MAX(CASE WHEN comp = 1 THEN 'YES' END) AS One,

    MAX(CASE WHEN comp = 2 THEN 'YES' END) AS Two,

    MAX(CASE WHEN comp = 3 THEN 'YES' END) AS Three

    FROM t

    GROUP BY person

    ) x

    WHERE One = 'YES' AND Two = 'YES' AND Three IS NULL


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • SELECT

    t.Person

    FROM

    t

    GROUP BY

    t.Person

    HAVING

    SUM(t.Comp) = 3

    AND COUNT(*) = 2;

    or

    SELECT

    t.Person

    FROM

    t

    INNER JOIN t AS t2

    ON t2.Person = t.Person

    AND t2.Comp = 2

    AND t.Comp = 1

    LEFT JOIN t AS t3

    ON t3.Person = t.Person

    AND t3.Comp = 3

    WHERE

    t3.Person IS NULL;

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Y.B. (7/14/2016)


    What you originally proposed isn't that complicated...

    I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.

    select 1 as Comp, 'a' as Person INTO #t union all

    select 2 as Comp, 'a' as Person union all

    select 3 as Comp, 'a' as Person union all

    select 1 as Comp, 'b' as Person union all

    select 2 as Comp, 'b' as Person union all

    select 1 as Comp, 'c' as Person union all

    select 1 as Comp, 'd' as Person union all

    select 2 as Comp, 'd' as Person union all

    select 2 as Comp, 'e' as Person union all

    select 3 as Comp, 'e' as Person union all

    select 3 as Comp, 'f' as Person;

    --Option 1

    SELECT Person FROM #t WHERE Comp = 1

    INTERSECT

    SELECT Person FROM #t WHERE Comp = 2

    EXCEPT

    SELECT Person FROM #t WHERE Comp = 3;

    --Option 2

    SELECT Person FROM #t t1 WHERE Comp = 1

    AND EXISTS(

    SELECT Person FROM #t t2

    WHERE Comp = 2

    AND t1.Person = t2.Person

    AND NOT EXISTS(

    SELECT Person FROM #t t3

    WHERE Comp = 3

    AND t2.Person = t3.Person));

    GO

    DROP TABLE #t;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (7/14/2016)


    Y.B. (7/14/2016)


    What you originally proposed isn't that complicated...

    I suggest that you follow the link I posted. There's a similar version of your code which runs about 4 times slower than the fastests solutions.

    select 1 as Comp, 'a' as Person INTO #t union all

    select 2 as Comp, 'a' as Person union all

    select 3 as Comp, 'a' as Person union all

    select 1 as Comp, 'b' as Person union all

    select 2 as Comp, 'b' as Person union all

    select 1 as Comp, 'c' as Person union all

    select 1 as Comp, 'd' as Person union all

    select 2 as Comp, 'd' as Person union all

    select 2 as Comp, 'e' as Person union all

    select 3 as Comp, 'e' as Person union all

    select 3 as Comp, 'f' as Person;

    --Option 1

    SELECT Person FROM #t WHERE Comp = 1

    INTERSECT

    SELECT Person FROM #t WHERE Comp = 2

    EXCEPT

    SELECT Person FROM #t WHERE Comp = 3;

    --Option 2

    SELECT Person FROM #t t1 WHERE Comp = 1

    AND EXISTS(

    SELECT Person FROM #t t2

    WHERE Comp = 2

    AND t1.Person = t2.Person

    AND NOT EXISTS(

    SELECT Person FROM #t t3

    WHERE Comp = 3

    AND t2.Person = t3.Person));

    GO

    DROP TABLE #t;

    Busy post but interesting nonetheless. I chose that method primarily for the simplicity in which the logic can be changed for different scenarios. I always try to post the best performing code (to my knowledge) but experience tells me someone can/will always come out with a faster/better solution. Heck most of the great techniques I've learned were from this site. i.e. Using cross tabs over pivots. 😉


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • I found a solution that seems to perform very well given certain pre-conditions.

  • There must be an index on Person/Comp or Person that includes Comp.
  • There must be fewer than 31 (or 63) possible distinct Comp values.
  • There cannot be duplicates for any Person/Comp value.
  • I tried creating a million row test, but I had trouble getting the right distribution of values to provide a valid test (specifically getting rid of duplicates).

    The following code works with the sample, but will need to be manipulated in order to work with the live data. Specifically, a zero-based DENSE_RANK needs to be assigned to the distinct Comp values, and you may need to use a CTE/derived table to get unique combinations of Person/Comp.

    SELECT t.Person

    FROM #t t

    GROUP BY t.Person

    HAVING SUM(POWER(2, t.Comp)) = 6

    It basically creates a bitmap of the possible values and then chooses only the records that match the required bitmap.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply