Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Recursive cross join to get all available combinaisons Expand / Collapse
Author
Message
Posted Wednesday, April 7, 2010 2:40 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:50 PM
Points: 20,584, Visits: 9,623
I think the title says it all. This little step is part of a much larger procedure.

Bascially I have 3 documents : orders, shippings and invoices. All with headers and lines obviously.

The fun starts with the fact that shippings and invoices have the column orderNo, but no link between the shipping and invoice. From that little misdesign, I need to match all 3 document # in a list of invoices...

The challenge: each order can have n shippings and the major PITA is that any # of shippings (for a single order) can then be merged into a single invoice... or not.

I have cases where a single order has 5 shippings and only 3 invoices which is where I'm hitting a wall.

In the final query I need to have only 1 line per invoice, so when I have more than 1 shipping, I need to display it like so : "order #" "invoice #" "VBL-001234, VBL-001236"

The only way I can safely match all documents is to go at the details level of shipments and invoices and check that I have 100% match on all rows for sku and Qty which proves I have the correct document links.

This method works excellent if I have only 1 invoice per order even if I have n shippings.

I have adapted my query to work with a list of possible combinaisons (ship #1 + ship #3).


Now what I need to do is generate that list of combos. I need to generate a list of all distinct combinaisons of two or more shipping # having without containing all possible documents. I have the feeling a recursive cte could do it but I've never worked with that and I'm a little lost on how to make it work.

So far, the worst case scenario I have is 16 shippings for a single order hence the reason I'm not building this with a series of self joins.


Here's some sample data and required output :


DECLARE @Base TABLE (OrderNo VARCHAR(20) NOT NULL, ShippingNo VARCHAR(20) NOT NULL, GroupRowID INT NULL, PRIMARY KEY CLUSTERED (OrderNo, ShippingNO))

INSERT INTO @Base (OrderNo, ShippingNO)
VALUES
('VCB-000070', '1'),
('VCB-000070', '2'),
('VCB-000070', '3'),

('VCB-000071', '1'),
('VCB-000071', '2'),
('VCB-000071', '3'),
('VCB-000071', '4')

for 'VCB-000070' :

--1 --exclude
--2 --exclude
--3 --exclude
1,2
1,3
2,3
--1,2,3 --exclude


for 'VCB-000071'

--1 --exclude
--2 --exclude
--3 --exclude
--4 --exclude
1,2
1,3
1,4
2,3
2,4
3,4
1,2,3
1,2,4
2,3,4
--1,2,3,4 --exclude
Post #899004
Posted Wednesday, April 7, 2010 3:12 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 3:51 AM
Points: 20,807, Visits: 32,742
It may help to see the whole problem not just a piece of it. Can you put together a test suite including the expected output? I know I work better when I know the whole problem, and can see what the end result should be.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #899025
Posted Wednesday, April 7, 2010 3:45 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 3:41 PM
Points: 6,842, Visits: 13,369
I'm not sure if I fully understood the requirement...
But I would consider building a permanent table holding all variations and query that one based on the number of shippings.
Maybe the following code helps to understand what I'd do (@lvl is used to define the number of shipments):
DECLARE @lvl INT
SET @lvl=4

DECLARE @tbl TABLE
(
id INT
)
INSERT INTO @tbl
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5

;WITH cte AS
(
SELECT
t1.id AS id1,
t2.id AS id2,
t3.id AS id3,
t4.id AS id4,
CASE
WHEN t2.id=t3.id THEN 3
WHEN t2.id<t3.id AND t3.id=t4.id THEN 4
ELSE 5 END AS lvl
FROM @tbl t1
CROSS JOIN @tbl t2
CROSS JOIN @tbl t3
CROSS JOIN @tbl t4
WHERE t1.id<=t2.id
AND t2.id<=t3.id
AND t3.id<=t4.id
AND t1.id<>t2.id AND t1.id<>t3.id AND t1.id<>t4.id
), cte2 AS
(
SELECT
id1,
id2,
CASE WHEN @lvl>3 AND id2<>id3 AND id2<>id4 THEN id3 ELSE 8 END AS id3,
CASE WHEN @lvl>4 AND id3<>id4 AND id2<> id4 AND id3<>id4 THEN id4 ELSE 9 END AS id4
FROM cte
WHERE lvl<= @lvl
AND id2<= @lvl
AND id3<= @lvl
AND id4<= @lvl
)
SELECT
id1,
id2,
CASE WHEN id3<8 THEN id3 ELSE 0 END AS id3,
CASE WHEN id4<9 THEN id4 ELSE 0 END AS id4
FROM cte2
WHERE id2>id1 AND id3>id2 AND id4>id3
GROUP BY id1,id2,id3,id4





Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #899043
Posted Wednesday, April 7, 2010 4:38 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:50 PM
Points: 20,584, Visits: 9,623
Lynn Pettis (4/7/2010)
It may help to see the whole problem not just a piece of it. Can you put together a test suite including the expected output? I know I work better when I know the whole problem, and can see what the end result should be.



That post would contain about 500 lines of code and another 2000 to build the relevant objects and test data which I'm sure I can't provide because it contains personal data. And of course there's even more to it than what I said but those missing pieces are basic joins to other tables, nothing relevant to my little problem.


I have everything working except the piece to build all possible combinaisons which is where I need help. Other than just going with a loop I don't see any easy way out of this one.
Post #899087
Posted Wednesday, April 7, 2010 4:42 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:50 PM
Points: 20,584, Visits: 9,623
lmu92 (4/7/2010)
I'm not sure if I fully understood the requirement...
But I would consider building a permanent table holding all variations and query that one based on the number of shippings.
Maybe the following code helps to understand what I'd do (@lvl is used to define the number of shipments):
DECLARE @lvl INT
SET @lvl=4

DECLARE @tbl TABLE
(
id INT
)
INSERT INTO @tbl
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5

;WITH cte AS
(
SELECT
t1.id AS id1,
t2.id AS id2,
t3.id AS id3,
t4.id AS id4,
CASE
WHEN t2.id=t3.id THEN 3
WHEN t2.id<t3.id AND t3.id=t4.id THEN 4
ELSE 5 END AS lvl
FROM @tbl t1
CROSS JOIN @tbl t2
CROSS JOIN @tbl t3
CROSS JOIN @tbl t4
WHERE t1.id<=t2.id
AND t2.id<=t3.id
AND t3.id<=t4.id
AND t1.id<>t2.id AND t1.id<>t3.id AND t1.id<>t4.id
), cte2 AS
(
SELECT
id1,
id2,
CASE WHEN @lvl>3 AND id2<>id3 AND id2<>id4 THEN id3 ELSE 8 END AS id3,
CASE WHEN @lvl>4 AND id3<>id4 AND id2<> id4 AND id3<>id4 THEN id4 ELSE 9 END AS id4
FROM cte
WHERE lvl<= @lvl
AND id2<= @lvl
AND id3<= @lvl
AND id4<= @lvl
)
SELECT
id1,
id2,
CASE WHEN id3<8 THEN id3 ELSE 0 END AS id3,
CASE WHEN id4<9 THEN id4 ELSE 0 END AS id4
FROM cte2
WHERE id2>id1 AND id3>id2 AND id4>id3
GROUP BY id1,id2,id3,id4




No you're missing a small part of it. I don't know ahead of time how many levels of cross joins I'll need and it's going to be different for every order # (which a single client can have as much as 300 in a single day... and this report I'm building can display all the client's order since day 1 of the ERP).

So in essence I need to build all the possible combinaisons for all orders at run time. So that leaves me with 2 obvious answers

1 - Dynamic SQL which is really hard to implement

2 - Looping 1 level at the time until everything is resolved... which would be my next try assuming a recursing cross join is not possible is SQL.
Post #899091
Posted Wednesday, April 7, 2010 4:50 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:50 PM
Points: 20,584, Visits: 9,623
Think of it this way...

You have 4 persons in a room, how many different "couples" can do a handshake.

p1 can shake hands with p2,p3,p4

p2 can't shake hands again with p1 so only p2 with p3 and p4 remain

then only p3 can shake hands with p4.



Now that you have that done, how many 3 way hand shaking can you have (this is where recursion kicks in)?

p1,p2,p3
p1,p2,p4
p2,p3,p4


In my current solution I'm already adressing the possibilities that a single hand shake or everybody at the same time will resolve the join. Now I need everything in between.


Again I have cases with as much as 16 peoples in the room (16 shippings for 1 single order). So that's 14 levels of recursions to build all possible groups of handshakes. I wish I could give you a formula but I just can't figure that one out atm!!!


Post #899099
Posted Wednesday, April 7, 2010 4:54 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 3:51 AM
Points: 20,807, Visits: 32,742
Ninja's_RGR'us (4/7/2010)
Lynn Pettis (4/7/2010)
It may help to see the whole problem not just a piece of it. Can you put together a test suite including the expected output? I know I work better when I know the whole problem, and can see what the end result should be.



That post would contain about 500 lines of code and another 2000 to build the relevant objects and test data which I'm sure I can't provide because it contains personal data. And of course there's even more to it than what I said but those missing pieces are basic joins to other tables, nothing relevant to my little problem.


I have everything working except the piece to build all possible combinaisons which is where I need help. Other than just going with a loop I don't see any easy way out of this one.


Not asking for actual data, but sample data; something that at least represents the problem domain. From what you have posted I'll have to leave it to others who are more imaginative. I need to see the problem and what the ultimate result is going to be to be of any help.




Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #899102
Posted Wednesday, April 7, 2010 5:00 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 3:41 PM
Points: 6,842, Visits: 13,369
Do you know the max number of cross joins ahead of time?
What I'd probably do in this case is to build a static table like the one I posted covering a "safe" level. When processing new data I would check if the nesting level would be sufficient and, if not, call as separate proc that would rebuild that table with expanded levels (most probably max_level + 1).
To query the data I would use an iTVF with the number of shippings as an input parameter to return the reduced matrix.
Basically, the "matrix" table would either be created based on a dynamic SQL statement to cover the unknown levels (including re-creation of the iTVF) or I would throw an exception and redo it script based (depending on the business requirement).
I might even add a pre-warning alert to send me an email if I'd have a level of 20 covered and there will be data 18 levels deep... It depends.

But I agree trying to avoid looping for each shipment.




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #899109
Posted Wednesday, April 7, 2010 6:05 PM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, November 14, 2014 4:49 PM
Points: 1,093, Visits: 1,175
This is a really interesting problem. I cannot think of a way to do it completely dynamic in terms of the maximum size of your combination collections and I'm hoping someone will come along with an amazing method I never would have considered.

However, if you can limit the maximum element grouping to a certain number, I have a possible suggestion for you. Basically, it's a tally table ... the numbers would join back to a ROW_NUMBER or identity or however you want to join back to your actual data. The 0's would be non-joins (so as to account for groupings smaller than the maximum.

My tiny tally table to avoid getting too many results back:
declare @test table (myInt int)
insert into @test(myInt)
select 0
union select 1
union select 2
union select 3
union select 4
union select 5
union select 6
union select 7
union select 8
union select 9
union select 10

And my logic to build my complex-er tally table:
select t1.myInt col1,
t2.myInt col2,
t3.myInt col3,
t4.myInt col4,
t5.myInt col5,
t6.myInt as col6
from @test t1
join @test t2
on t2.myInt > t1.myInt
join @test t3
on t3.myInt > t2.myInt or t3.myInt = 0
join @test t4
on (t4.myInt > t3.myInt and t3.myInt > 0) or t4.myInt = 0
join @test t5
on (t5.myInt > t4.myInt and t4.myInt > 0) or t5.myInt = 0
join @test t6
on (t6.myInt > t5.myInt and t5.myInt > 0) or t6.myInt = 0
where t1.myInt > 0
order by t1.myInt, t2.myInt

With only the 10 initial values, it should return 837 possible combination of between 2 and 6 elements.

I would be interested to hear feedback.

Thanks,

-Ben


└> bt


Forum Etiquette: How to post data/code on a forum to get the best help
Post #899159
Posted Wednesday, April 7, 2010 6:39 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Thursday, November 20, 2014 8:50 PM
Points: 20,584, Visits: 9,623
bteraberry (4/7/2010)
This is a really interesting problem. I cannot think of a way to do it completely dynamic in terms of the maximum size of your combination collections and I'm hoping someone will come along with an amazing method I never would have considered.

However, if you can limit the maximum element grouping to a certain number, I have a possible suggestion for you. Basically, it's a tally table ... the numbers would join back to a ROW_NUMBER or identity or however you want to join back to your actual data. The 0's would be non-joins (so as to account for groupings smaller than the maximum.

My tiny tally table to avoid getting too many results back:
declare @test table (myInt int)
insert into @test(myInt)
select 0
union select 1
union select 2
union select 3
union select 4
union select 5
union select 6
union select 7
union select 8
union select 9
union select 10

And my logic to build my complex-er tally table:
select t1.myInt col1,
t2.myInt col2,
t3.myInt col3,
t4.myInt col4,
t5.myInt col5,
t6.myInt as col6
from @test t1
join @test t2
on t2.myInt > t1.myInt
join @test t3
on t3.myInt > t2.myInt or t3.myInt = 0
join @test t4
on (t4.myInt > t3.myInt and t3.myInt > 0) or t4.myInt = 0
join @test t5
on (t5.myInt > t4.myInt and t4.myInt > 0) or t5.myInt = 0
join @test t6
on (t6.myInt > t5.myInt and t5.myInt > 0) or t6.myInt = 0
where t1.myInt > 0
order by t1.myInt, t2.myInt

With only the 10 initial values, it should return 837 possible combination of between 2 and 6 elements.

I would be interested to hear feedback.

Thanks,

-Ben



The problem with this approach is that it assumes a max # of recursions... I could of course go on a limb and go all the way to 25 levels up front but I can't even estimate the # of rows this dataset would include. I adjusted your code to have 25 values instead of 10 and still do only 6 joins and the result had almost 250 000 rows... I can't imagine going even only to 16 with this.
Post #899174
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse