SQL Problem requires a solve ???

  •  

     

    Data table is like this, for example

     

    TABLE : Sales

     

    TransBasket           Sku                        Qty

    A100                     123456                  10

    A100                     456789                  5

    A100                     753489                  2

    B200                      456789                  9

    B200                      951235                  6

    B200                      753489                  1

    C200                     111555                  1

    C200                     222555                  4

    C200                     333555                  7

     

    The TransBasket field is the identity of a basket transaction of a customers purchases goods in a super market, a customer will buy a lot of goods ( ie Skus ) within the one transactions ( ie Basket).

     

    So the idea is to do Basket Analysis to determine which skus have a consistent relationship within customers baskets purchases. As you can see from the example data I have made bold two skus that appear in multiple baskets.

     

    I know I can do a WHERE clause that looks like this “WHERE Sku  IN ( 456789, 753489 )” to find all the skus in the sales table. But how can one do a SELECT statement with a WHERE clause that ensures that the two skus are found in the same basket as well.

     

    The only way I can figure out how to do this is to create a KEY that looks like this..

     

    KEY using SQL  like

     

    SELECT TransBasket , @Str = @Str + ‘-‘+Sku+’-‘ AS KEY, SUM(QTY)

    FROM Sales

     

     

    TransBasket           KEY                                           SumQTY

    A100                     -123456-456789-753489-            17

    B200                      -456789-951235-753489-            16

    C200                     -111555-222555-333555-            12

     

    I get the data like so I can do a LIKE clause on 456789, 753489 to get the TransBaskets numbers for analysis. However this takes a long time to run over millions of records.

     

    Any ideas? 

     

  • If I understand your requirement correctly, you're looking for a query that will return all TransBaskets that contain the specified two items.  This should do it for you:

    --Create the table

    CREATE TABLE #Sales (TransBasket CHAR(4), Sku INTQty INT)

    --Enter the data

    INSERT INTO #Sales VALUES ('A100'12345610)

    INSERT INTO #Sales VALUES ('A100'4567895)

    INSERT INTO #Sales VALUES ('A100'7534892)

    INSERT INTO #Sales VALUES ('B200'4567899)

    INSERT INTO #Sales VALUES ('B200'9512356)

    INSERT INTO #Sales VALUES ('B200'7534891)

    INSERT INTO #Sales VALUES ('C200'1115551)

    INSERT INTO #Sales VALUES ('C200'2225554)

    INSERT INTO #Sales VALUES ('C200'3335557)

    --Query

    SELECT DISTINCT s1.TransBasket

    FROM #Sales s1 

    JOIN (SELECT TransBasket

          FROM #Sales WHERE 

          Sku IN (456789753489)

         &nbsp s2

    ON s1.TransBasket s2.TransBasket

    John

  • So simple, works a treat, thanks.

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply