September 18, 2006 at 5:10 pm
Data table is like this, for example
TABLE : Sales
TransBasket Sku Qty
A100 123456 10
A100 456789 5
A100 753489 2
B200 456789 9
B200 951235 6
B200 753489 1
C200 111555 1
C200 222555 4
C200 333555 7
The TransBasket field is the identity of a basket transaction of a customers purchases goods in a super market, a customer will buy a lot of goods ( ie Skus ) within the one transactions ( ie Basket).
So the idea is to do Basket Analysis to determine which skus have a consistent relationship within customers baskets purchases. As you can see from the example data I have made bold two skus that appear in multiple baskets.
I know I can do a WHERE clause that looks like this “WHERE Sku IN ( 456789, 753489 )” to find all the skus in the sales table. But how can one do a SELECT statement with a WHERE clause that ensures that the two skus are found in the same basket as well.
The only way I can figure out how to do this is to create a KEY that looks like this..
KEY using SQL like
SELECT TransBasket , @Str = @Str + ‘-‘+Sku+’-‘ AS KEY, SUM(QTY)
FROM Sales
TransBasket KEY SumQTY
A100 -123456-456789-753489- 17
B200 -456789-951235-753489- 16
C200 -111555-222555-333555- 12
I get the data like so I can do a LIKE clause on 456789, 753489 to get the TransBaskets numbers for analysis. However this takes a long time to run over millions of records.
Any ideas?
September 19, 2006 at 4:53 am
If I understand your requirement correctly, you're looking for a query that will return all TransBaskets that contain the specified two items. This should do it for you:
--Create the table
CREATE TABLE #Sales (TransBasket CHAR(4), Sku INT, Qty INT)
--Enter the data
INSERT INTO #Sales VALUES ('A100', 123456, 10)
INSERT INTO #Sales VALUES ('A100', 456789, 5)
INSERT INTO #Sales VALUES ('A100', 753489, 2)
INSERT INTO #Sales VALUES ('B200', 456789, 9)
INSERT INTO #Sales VALUES ('B200', 951235, 6)
INSERT INTO #Sales VALUES ('B200', 753489, 1)
INSERT INTO #Sales VALUES ('C200', 111555, 1)
INSERT INTO #Sales VALUES ('C200', 222555, 4)
INSERT INTO #Sales VALUES ('C200', 333555, 7)
--Query
SELECT DISTINCT s1.TransBasket
FROM #Sales s1
JOIN (SELECT TransBasket
FROM #Sales WHERE
Sku IN (456789, 753489)
  s2
ON s1.TransBasket = s2.TransBasket
John
September 21, 2006 at 5:48 pm
So simple, works a treat, thanks.
Viewing 3 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply