need help and getting unique set

  • Hi all,

    I need a help on getting the distinct set.

    My Data look like this:

    SET 1

    SET ID Product Attribute

    1 A 10

    1 A 11

    SET 2

    SET ID Product Attribute

    2 A 10

    2 A 12

    SET 3

    SET ID Product Attribute

    3 A 10

    3 A 11

    Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this

    SET ID Product Attribute

    1 A 10

    1 A 11

    2 A 10

    2 A 12

    Thank you all in advance for your help.

  • Walton (8/19/2013)


    Hi all,

    I need a help on getting the distinct set.

    My Data look like this:

    SET 1

    SET ID Product Attribute

    1 A 10

    1 A 11

    SET 2

    SET ID Product Attribute

    2 A 10

    2 A 12

    SET 3

    SET ID Product Attribute

    3 A 10

    3 A 11

    Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this

    SET ID Product Attribute

    1 A 10

    1 A 11

    2 A 10

    2 A 12

    Thank you all in advance for your help.

    So what happens to Set 3? The product/attribute combination is both Set 1 and 3 but how to decide which Set to use?

    Maybe as simple as using MIN?

    You didn't provide anything usable so my code probably doesn't work as is.

    select min(setid), Product, Attribute

    From SomeTable

    group by Product, Attribute

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (8/19/2013)


    Walton (8/19/2013)


    Hi all,

    I need a help on getting the distinct set.

    My Data look like this:

    SET 1

    SET ID Product Attribute

    1 A 10

    1 A 11

    SET 2

    SET ID Product Attribute

    2 A 10

    2 A 12

    SET 3

    SET ID Product Attribute

    3 A 10

    3 A 11

    Since set ID 1 and 3 has the same product and attribute I just need to get distinct set having the same product and attribute. The result set will look like this

    SET ID Product Attribute

    1 A 10

    1 A 11

    2 A 10

    2 A 12

    Thank you all in advance for your help.

    So what happens to Set 3? The product/attribute combination is both Set 1 and 3 but how to decide which Set to use?

    Maybe as simple as using MIN?

    You didn't provide anything usable so my code probably doesn't work as is.

    select min(setid), Product, Attribute

    From SomeTable

    group by Product, Attribute

    Hi Sean,

    I do not care if I am getting set 1 or set 3 either one would be fine. But the result set should return the complete list for the set. The result set should look like this :

    SET ID Product Attribute

    1 A 10

    1 A 11

    2 A 10

    2 A 12

    But with the solution you provided, result will like this:

    SET_ID Product Attribute

    1 A 10

    1 A 11

    2 A 12

    Thanks for looking onto it.

    Thanks

  • Since the pseudocode doesn't work I assume that means you need or at least want some additional help? You have been around here long enough to know that the next question is...can you post ddl and sample data? If you have somehow managed to miss what it takes to post a complete question please take a few minutes and read the first link in my signature.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hi Sean,

    Here are the DDL and sample data.

    Thanks a lot for spending your time on it

    CREATE TABLE #temp

    (set_id INT

    ,product CHAR(5)

    ,attribute INT)

    --insert set 1

    INSERT INTO #temp

    (set_id, product, attribute)

    VALUES

    (1,'A',10)

    ,(1,'A',11)

    --insert set 2

    INSERT INTO #temp

    (set_id, product, attribute)

    VALUES

    (2,'A',10)

    ,(2,'A',12)

    --insert set 2

    INSERT INTO #temp

    (set_id, product, attribute)

    VALUES

    (3,'A',10)

    ,(3,'A',11)

    DROP TABLE #temp

  • Not sure this is a particularly efficient way to do this but this might work:

    SELECT set_id, product, attribute=CAST(item AS INT)

    FROM (

    SELECT set_id=MIN(set_id), product, allattributes

    FROM (

    SELECT set_id, product, attribute

    ,allattributes=STUFF((

    SELECT ',' + STR(attribute, 5)

    FROM #temp b

    WHERE a.set_id = b.set_id AND a.product = b.product

    ORDER BY attribute

    FOR XML PATH('')), 1, 1, '')

    FROM #temp a) a

    GROUP BY product, allattributes) a

    CROSS APPLY dbo.DelimitedSplit8K(allattributes, ',') b

    DelimitedSplit8K is (as the name implies) a delimited string splitter that can be found here[/url].


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • And here's another way (that I am less confident of because it requires that the SUM of attributes for a set is unique), but maybe it'll give you some additional ideas.

    SELECT set_id, product, attribute

    FROM (

    SELECT set_id, product, attribute, m, c

    ,n=ROW_NUMBER() OVER (PARTITION BY product, m ORDER BY set_id)

    FROM (

    SELECT set_id, product, attribute

    ,m=SUM(attribute) OVER (PARTITION BY product, set_id)

    ,c=COUNT(attribute) OVER (PARTITION BY product, set_id)

    FROM #temp) a) a

    WHERE n <= c

    ORDER BY set_id, attribute;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi Walton

    How many sets will you have?

    How many rows will each set have?

    Thanks.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • dwain.c (8/19/2013)


    Not sure this is a particularly efficient way to do this but this might work:

    SELECT set_id, product, attribute=CAST(item AS INT)

    FROM (

    SELECT set_id=MIN(set_id), product, allattributes

    FROM (

    SELECT set_id, product, attribute

    ,allattributes=STUFF((

    SELECT ',' + STR(attribute, 5)

    FROM #temp b

    WHERE a.set_id = b.set_id AND a.product = b.product

    ORDER BY attribute

    FOR XML PATH('')), 1, 1, '')

    FROM #temp a) a

    GROUP BY product, allattributes) a

    CROSS APPLY dbo.DelimitedSplit8K(allattributes, ',') b

    DelimitedSplit8K is (as the name implies) a delimited string splitter that can be found here[/url].

    It's about as inefficient as the brute-force method:

    SELECT s1.*

    FROM #temp s1 -- have to start somewhere.

    WHERE s1.set_id = 1

    UNION ALL

    SELECT s2.*

    FROM (SELECT * FROM #temp WHERE set_id = 2) s2

    WHERE EXISTS ( -- check for a row mismatch between set 2 and set 1.

    SELECT 1

    FROM (SELECT * FROM #temp WHERE s2.set_id = 2) s2

    FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 1) s1

    ON s1.Product = s2.Product AND s1.Attribute = s2.Attribute

    WHERE (s2.SET_ID IS NULL OR s1.SET_ID IS NULL)

    )

    UNION ALL

    SELECT s3.*

    FROM #SET3 s3

    WHERE s3.set_id = 3

    AND EXISTS ( -- check for a row mismatch between set 3 and set 1.

    SELECT 1

    FROM (SELECT * FROM #temp WHERE set_id = 3) s3

    FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 1) s1

    ON s1.Product = s3.Product AND s1.Attribute = s3.Attribute

    WHERE s3.SET_ID IS NULL OR s1.SET_ID IS NULL

    )

    AND EXISTS ( -- check for a row mismatch between set 3 and set 2.

    SELECT 1

    FROM (SELECT * FROM #temp WHERE set_id = 3) s3

    FULL OUTER JOIN (SELECT * FROM #temp WHERE set_id = 2) s2

    ON s2.Product = s3.Product AND s2.Attribute = s3.Attribute

    WHERE s3.SET_ID IS NULL OR s2.SET_ID IS NULL

    )

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi

    These will probably perform like a dog on larger sets, but here is 2 more options

    SELECT set_id, product, attribute

    FROM #temp t

    INNER JOIN

    (

    SELECT DISTINCT MIN(set_id) minid

    FROM #temp

    GROUP BY product, attribute

    ) s ON t.set_id = s.minid

    SELECT set_id, product, attribute

    FROM #temp t

    WHERE set_id in (

    SELECT MIN(set_id) minid

    FROM #temp

    GROUP BY product, attribute

    )

  • I didn't realize that brute force was allowed.

    CREATE TABLE #temp

    (set_id INT

    ,product CHAR(5)

    ,attribute INT)

    INSERT INTO #temp (set_id, product, attribute) VALUES (1,'A',10) ,(1,'A',11)

    INSERT INTO #temp (set_id, product, attribute) VALUES (2,'A',10),(2,'A',12)

    INSERT INTO #temp (set_id, product, attribute) VALUES (3,'A',10),(3,'A',11);

    INSERT INTO #temp (set_id, product, attribute) VALUES (4,'A',10),(4,'A',11);

    INSERT INTO #temp (set_id, product, attribute) VALUES (5,'A',10),(5,'A',12);

    INSERT INTO #temp (set_id, product, attribute) VALUES (6,'A',10),(6,'A',12),(6,'A',13);

    DECLARE @setid INT = 1, @maxid INT = (SELECT MAX(set_id) FROM #temp);

    -- Gotta start somewhere

    SELECT * INTO #temp2 FROM #temp WHERE set_id = @setid;

    WHILE @setid < @maxid

    BEGIN

    SELECT @setid = @setid + 1;

    INSERT INTO #temp2

    SELECT *

    FROM #temp

    WHERE set_id = @setid AND EXISTS (

    SELECT product, attribute

    FROM #temp

    WHERE set_id = @setid

    EXCEPT

    SELECT product, attribute

    FROM #temp2

    GROUP BY set_id, product, attribute

    );

    END

    SELECT * FROM #temp2 ORDER BY set_id, product, attribute;

    GO

    DROP TABLE #temp;

    DROP TABLE #temp2;


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply