Aggregate Query question

  • I'm sure I'm overlooking the obvious, and this is simplified, but given a table...

    Items

    Item, INT

    Quantity, INT

    ...with a dataset...

    Item Qty[/b]

    1000 3

    1001 5

    1002 6

    1003 5

    1004 8

    1005 6

    ...one way to return the rows where a sum of Quantity = 10 would be...

    SELECT *

    FROM Items t1

    WHERE EXISTS ( SELECT t2.quantity

    FROM Items t2

    WHERE t1.quantity + t2.quantity = 10 )

    ...returns...

    1001

    1003

    ...but is limited to the number of instances enumerated. Similarly...

    SELECT *

    FROM Items t1 ,

    Items t2 ,

    Items t3 ,

    Items t4

    WHERE (t1.quantity + t2.quantity + t3.quantity + t4.quantity) = 22

    ...is also feasible, but is still bound by the number of table instances. I'm looking for a way to say...

    SELECT *

    FROM Items t1

    WHERE SUM(t1.quantity) = <int> --(say 22)

    ...so that the set returned is not bound to the number of rows by the number of table instances.

    TIA,

    Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • I am not sure what the requirement is, but with the look of things, i sense that you need find the ID whose sum of quantity equals or greater than an predefined number. This can be acheived by HAVING clause.

  • As an example:

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

    CREATE TABLE #Temp

    (

    iD INT IDENTITY(1,1)

    ,ProductID INT

    ,Quantity SMALLINT

    );

    INSERT INTO #Temp (ProductID, Quantity)

    SELECT 1000 , 2

    UNION ALL SELECT 1000 , 3

    UNION ALL SELECT 1000 , 4

    UNION ALL SELECT 1001 , 3

    UNION ALL SELECT 1001 , 4

    UNION ALL SELECT 1002 , 1

    UNION ALL SELECT 1002 , 0

    UNION ALL SELECT 1003 , 5

    UNION ALL SELECT 1004 , 10

    UNION ALL SELECT 1005 , 4

    UNION ALL SELECT 1005 , 4

    UNION ALL SELECT 1005 , 3

    ;

    SELECT T.iD , T.ProductID , T.Quantity

    FROM #Temp T

    ;

    DECLARE @Quantity_Limit SMALLINT ;

    SET @Quantity_Limit = 10 ;

    SELECT T.ProductID , SumOfQuantity = SUM(T.Quantity)

    FROM #Temp T

    GROUP BY T.ProductID

    HAVING SUM(T.Quantity) >= @Quantity_Limit

    ;

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

  • Are you looking for ANY combination of Items where the quantity = 10 (or whatever number), or per-item?

    For example, if item 1000 has qty 6 and item 1001 has qty 2 and item 1002 has qty 2, then the sum of the quantities would be 10, but not for an item.

    I would assume, as ColdCoffee did, that you want items where the quantity for that item is the number you're looking for, but it's not clear from the question or the sample data.

    Also, would you consider it valid, if looking for a quantity per item, if the quantity comes up to 11 and you're looking for 10?

    If the specific number, and higher values aren't allowed, how about if you have 6, 4, and 5, for one item, and you want 10. Two of the three (6 and 4) would come out to 10, but the 5 takes it over the total allowed.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • >>Are you looking for ANY combination of Items where the quantity = 10 (or whatever number), or per-item?

    Any combination of Items.

    >>Also, would you consider it valid, if looking for a quantity per item, if the quantity comes up to 11 and you're looking for 10?

    No. Has to equal 10.

    >>If the specific number, and higher values aren't allowed, how about if you have 6, 4, and 5, for one item, and you want 10. Two of the three (6 and 4) would come out to 10, but the 5 takes it over the total allowed.

    I just want the 6 and 4.

    ~Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • CC - the problem is that they don't have a common ID to group on, which is why I was struggling with the HAVING clause. I've considered adding a 'constant' attribute for that purpose, but haven't tried it yet.

    ~Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • So, I tried adding a Constant, but no tuples are returned...

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

    CREATE TABLE #Temp

    (

    iD INT IDENTITY(1,1)

    ,ProductID INT

    ,Quantity SMALLINT

    ,Constant SMALLINT

    );

    INSERT INTO #Temp (ProductID, Quantity, Constant)

    SELECT 1000 , 2, 1

    UNION ALL SELECT 1000 , 3, 1

    UNION ALL SELECT 1000 , 4, 1

    UNION ALL SELECT 1001 , 3, 1

    UNION ALL SELECT 1001 , 4, 1

    UNION ALL SELECT 1002 , 1, 1

    UNION ALL SELECT 1002 , 0, 1

    UNION ALL SELECT 1003 , 5, 1

    UNION ALL SELECT 1004 , 10, 1

    UNION ALL SELECT 1005 , 4, 1

    UNION ALL SELECT 1005 , 4, 1

    UNION ALL SELECT 1005 , 3, 1

    ;

    SELECT T.iD , T.ProductID , T.Quantity, Constant

    FROM #Temp T

    ;

    DECLARE @Quantity_Limit SMALLINT ;

    SET @Quantity_Limit = 10 ;

    SELECT T.Constant, MAX(T.ProductID) , SumOfQuantity = SUM(T.Quantity)

    FROM #Temp T

    GROUP BY T.Constant

    HAVING SUM(T.Quantity) = @Quantity_Limit

    ;

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • Hi Les

    Good challenge, top work. One question: sooner or later someone will post a solution and you will have to test it. Will you use your sample data set or something a little more rigorous? How about posting some more sample data rows, allowing perhaps seven or eight rows summing to the chosen value? As with your vestigial current data set, you will want to state which row combinations meet your criteria. If you can do this, I think we can all look forward to some creative fun.

    Cheers

    ChrisM


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Okay, given the clarifications provided, this isn't something SQL is good at, so it's going to be awkward to implement in it.

    What you have to do build a recursive function and Cross Apply it. You first select every row that has a quantity less than or equal to the target value. Then you have to take each of those, and combine it with every other row that would add up to the target value or less. Continue iterating till you have all combinations that add up to exactly the value needed.

    Recursive UDFs and CTEs are about the only way to do this in T-SQL, and those are complicated and usually end up being quite slow on any significant number of rows of data.

    Procedural languages, like VB.NET or C# or even C, will do this better, simply because they have more efficient looping and recursion options.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Chris,

    OK. I have to give a bit more info while trying to keep the complexity from exploding. Essentially, what I'm looking for can be expressed against ColdCase's DDL with...

    DECLARE @Quantity_Limit SMALLINT ;

    SET @Quantity_Limit = 5 ;

    SELECT * ,

    t1.Quantity + t2.Quantity + t3.Quantity AS Total

    FROM #Temp t1 ,

    #Temp AS t2 ,

    #Temp AS t3

    WHERE t1.Quantity + t2.Quantity + t3.Quantity = @Quantity_Limit

    AND t1.ProductID NOT IN ( t2.ProductID, t3.ProductID )

    AND t2.ProductID NOT IN ( t1.ProductID, t3.ProductID )

    AND t3.ProductID NOT IN ( t1.ProductID, t2.ProductID )

    However, instead of returning a row of data, it would just return the tuples, though I then have to come up with a means of identifying to which 'set' each of the tuples belongs ....or.... use the above method, but don't limit the number of tuples each output row contains (bounded by the number of #Temp instances in this example).

    ~Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • G[]...

    I'd come to the same conclusion, but thought it might be worth a toss up here. I thumbed through all of Celko's books, but nothing jumped out. Intuitively I think there's a way to do it multidimensionally, but haven't come up with anything yet since the predicate (sum) is an abstract. What I have works...to a point. But recursing through a dozen instances sends the permutations over the hill.

    Thx,

    ~Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • Hmmmm....

    Let's say you had 50 rows, all with a quantity of 1. The target quantity was 5.

    How would you want that brought back? All the possible combinations of those rows that added to 5?

    Just the rows that could possibly be a component of the 5? You'd end up bringing them all back. What would that tell you? Would you want all the combinations grouped by some artificial designator (like a solution_number)?

    I agree with the idea that t-sql isn't the best vehicle for generating solution combinations like this. A number crunching package like SAS or SPSS would be a better bet ( or find someone who has an old APL compiler).


    And then again, I might be wrong ...
    David Webb

  • If I may, what is the business case for this? I don't understand why you would be looking for a set of unrelated items whose sum(quantity) = aconstantvalue.

  • >>All the possible combinations of those rows that added to 5?

    All the possible DISTINCT combinations.

    ~Les

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

  • Lynn,

    It's a redistricting (route) requirement, where we're looking to contain each district to roughly the same number of meters (service locations). Urban growth and all that.

    Dr. Les Cardwell, DCS-DSS
    Enterprise Data Architect
    Central Lincoln PUD

Viewing 15 posts - 1 through 15 (of 109 total)

You must be logged in to reply to this topic. Login to reply