How to group the total count of similar items?

  • We sell & ship packages that contain multiple items within them. The actual package (we call it the "parent item") is in the same table as the items within it ("child items"). If the record is a child item within a package, its "ParentId" field will contain the ItemId of the package.

    So some sample records of a complete package would look like this:

    ItemId | ParentId | Name | QtyAvailable

    ----------------------------------------

    1 | NULL | Package A | 10

    2 | 1 | Item 1 | 2

    3 | 1 | Item 2 | 3

    ItemId's 2 & 3 are items contained within the ItemId 1 package.

    Now however, the client wants us to build a report showing all packages (all items where ParentId is NULL) however, they want to see the QtyAvailable of not only the package but the items as well (a total of 15 when using the example above), all grouped into a single line. So a sample report line would look like this:

    Name | Available Qty

    --------------------------

    Package A | 15

    Package B | 100

    How can I do a SELECT statement that SUMS the "QtyAvailable" of both the parent & child items and displays them along with the package name?

    Thanks

  • I would probably do it this way:

    Package---(1,M)--Contains--(M,1)--Items

    CREATE TABLE Contents (

    PackageID INT NOT NULL,

    ItemID INT NOT NULL,

    Qty INT

    FOREIGN KEY fkItemID ItemID REFERENCES Item(ItemID),

    FOREIGN KEY fkPkgID PackageID REFERENCES Package(PackageID));

    I'm assuming that the Items are generic/non-unique. If you do it this way, getting counts is stupid easy.

  • Quick question.

    Can the packages contain packages?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • This is one query that should at least get the data as you want it.

    It's pretty inefficient, as it requires a couple table scans, so you might want to look for alternate ways to implement the logic that would perform better.

    At any rate, this should give you something to start with (I put about 20 million rows of data similar to yours in a temp table to test; without the exact DDL for your table and indexes that's about as much as I can do):

    SELECT Packages.Name, Packages.ItemID, Item.TotalQtyAvailable

    FROM #tmp Packages

    INNER JOIN (SELECT COALESCE(ParentID,ItemID) AS PackageID,

    SUM(QtyAvailable) as TotalQtyAvailable

    FROM #tmp

    GROUP BY COALESCE(ParentID,ItemID)) Item

    ON Packages.ItemID=Item.PackageID

    WHERE Packages.ParentID IS NULL

    That took a couple minutes to run against 20 million rows on my machine. Your mileage may vary, as always 🙂

    I hope this helps!

    Cheers!

    EDIT: Luis posted his important question while I was typing this up. That does indeed matter, as this solution assumes all packages have a null ParentID. If that's not true, then the solution would have to change.

  • create table tab(

    ItemId int,

    ParentId int,

    Name varchar(10),

    QtyAvailable int);

    insert into tab (

    ItemId,

    ParentId,

    Name,

    QtyAvailable)

    select

    1 as ItemId,

    null as ParentId,

    'Package A' as Name,

    10 as QtyAvailable

    union all

    select

    2,

    1,

    'Item 1',

    2

    union all

    select

    3,

    1,

    'Item 2',

    3;

    with

    subquery_cte as (

    select

    coalesce(ParentId, ItemId) as Id,

    case

    when

    ParentId is null

    then

    Name

    else

    null

    end as Name,

    sum(QtyAvailable) as QtyAvailable

    from

    tab

    group by

    coalesce(ParentId, ItemId),

    case

    when

    ParentId is null

    then

    Name

    else

    null

    end)

    select

    min(Name) as Name,

    sum(QtyAvailable) as QtyAvailable

    from

    subquery_cte

    group by

    Id;

    drop table tab;

    Assuming, you mean simple parent-child relation, this is just a bit different solution. I believe it's rather efficient because references the table only once.

  • I'd have to go with something similar to Rafal here but no need for two GROUP BYs:

    WITH SampleData AS

    (

    SELECT ItemID=1, ParentID=NULL, Name='Package A', QtyAvailable=10

    UNION ALL SELECT 2, 1, 'Item 1', 2

    UNION ALL SELECT 3, 1, 'Item 2', 3

    UNION ALL SELECT 4, NULL, 'Package B', 80

    UNION ALL SELECT 5, 4, 'Item 1', 10

    UNION ALL SELECT 6, 4, 'Item 2', 10

    )

    SELECT ItemID=ISNULL(ParentID, ItemID)

    ,QtyAvailable=SUM(QtyAvailable)

    ,Name=MIN(CASE WHEN ParentID IS NULL THEN Name END)

    FROM SampleData

    GROUP BY ISNULL(ParentID, ItemID);

    Edit: Assuming of course that the answer to Luis's question is no.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks for all of the responses. I went with your solution Dwain. Worked great!

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply