|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 1:17 PM
Points: 158,
Visits: 252
|
|
Ok, so I think I should know the answer to this but I'm getting nothing but what looks like a cartesian product... yuk.
I have 1 table that has a list of master document numbers... for the sake of argument, there is 1 doc num, 1 record. There are 3 tables that have values that relate to the DocNum in the Doc Tbl; however, they all don't totally relate to each other. Confused yet?? 
The DocNums are related. However, the Seqence numbers in the other tables determine how they should appear in the final results table.
Doc Tbl GR Tbl GE tbl Ref tbl
DocNums Docnum GR Seq DocNum GE Seq DocNum Ref Seq 1 1 A 1 1 A 1 1 A 1 1 B 2 1 B 2 1 B 2 1 C 3 1 C 3 1 D 4 1 E 5
If I ignore the master list of Doc Numbers in the Doc tbl and Left Join GR to GE on DocNum and Seq, I get:
Docnum GR GR_Seq GE GE_Seq 1 A 1 A 1 1 B 2 B 2 1 NULL NULL C 3
It starts getting whonky if I add the third table because the Sequence number relates to the other two tables. There is no gaurantee that the quantity of sequence numbers is going to be the same in each table. GE could have 5 in the sequence, while GR has 1, and REF has 15. What's REALY chunking up the whole thing is if I try to refer it back to the master list of DocNums. When I try to do that the whole thing blows up and I get what looks to be a cartesian product...
My end result should look like this:
Docnum GR GR_Seq GE GE_Seq Ref Ref_Seq 1 A 1 A 1 A 1 1 B 2 B 2 B 2 1 NULL NULL C 3 C 3 1 NULL NULL NULL NULL D 4 1 NULL NULL NULL NULL E 5
My recordsets could have 1000-1000000 records on any given day...
I feel like I should be able to do this; however, I feel like I've overlooked something basic. Any ideas anyone? Bueller?
Thanks Crusty.
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Today @ 6:03 AM
Points: 2,345,
Visits: 3,191
|
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 1:17 PM
Points: 158,
Visits: 252
|
|
| I will provide some DDL for the tables in the morning... Thanks for the positive reply... :)
|
|
|
|
|
Say Hey Kid
      
Group: General Forum Members
Last Login: Monday, May 13, 2013 4:06 AM
Points: 681,
Visits: 298
|
|
CptCrusty1 (8/27/2012)
Ok, so I think I should know the answer to this but I'm getting nothing but what looks like a cartesian product... yuk. I have 1 table that has a list of master document numbers... for the sake of argument, there is 1 doc num, 1 record. There are 3 tables that have values that relate to the DocNum in the Doc Tbl; however, they all don't totally relate to each other. Confused yet??  The DocNums are related. However, the Seqence numbers in the other tables determine how they should appear in the final results table. Doc Tbl GR Tbl GE tbl Ref tbl
DocNums Docnum GR Seq DocNum GE Seq DocNum Ref Seq 1 1 A 1 1 A 1 1 A 1 1 B 2 1 B 2 1 B 2 1 C 3 1 C 3 1 D 4 1 E 5
If I ignore the master list of Doc Numbers in the Doc tbl and Left Join GR to GE on DocNum and Seq, I get: Docnum GR GR_Seq GE GE_Seq 1 A 1 A 1 1 B 2 B 2 1 NULL NULL C 3
It starts getting whonky if I add the third table because the Sequence number relates to the other two tables. There is no gaurantee that the quantity of sequence numbers is going to be the same in each table. GE could have 5 in the sequence, while GR has 1, and REF has 15. What's REALY chunking up the whole thing is if I try to refer it back to the master list of DocNums. When I try to do that the whole thing blows up and I get what looks to be a cartesian product... My end result should look like this: Docnum GR GR_Seq GE GE_Seq Ref Ref_Seq 1 A 1 A 1 A 1 1 B 2 B 2 B 2 1 NULL NULL C 3 C 3 1 NULL NULL NULL NULL D 4 1 NULL NULL NULL NULL E 5
Hi you are joining "DocNums" as in first part (query) there is no repeat of the value of "DocNum" 1
in second query there is repeat of doc no .
if there is repeat of joining data returns multiple record.
As it works row by row...
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 1:17 PM
Points: 158,
Visits: 252
|
|
@ Sandeep,
I'm sorry, but I'm not quite sure what it is you're telling me; however, in regards to the data, it's correct. Let me get into some detail.
This is the end result of an ETL process using some pretty mixed up text files. Lets say I have one record in a file that is Pipe-Delimited with double quotes indicated text.
Using the examples in previous post, when I receive the information I have a text file with the following columns:
DocNum GR GE REF
The raw data for the examples provided in the original post:
1|"A,B"|"A,B,C"|"A,B,C,D,E,F"
The providor of the data has essentially combined multiple rows of data into 1 row. For the GR (Second column), "A,B" is actually 2 rows of data broken by the comman. It should actually look like:
Docnum GR 1 A 1 B
The Sequence number is an attempt at recombining the data and is not apart of the original data. The process in place now uses multiple steps to achieve individual tables for the broken out data exactly as it's shown in the previous examples. The problem is that they combine it all back together with Cartesian joins creating a massive landing table.
I've consolidated the cleaning of the data into a single procedure; however, I'm trying to recombine the data so that it's not a cartesian join. All the information in the single record pertains to DocNum 1; however, the end result doesn't necessarily have to be totally organized ABC, etc. Every value needs to be in a record with DocNum 1.
I know this is a very confusing process. The final destination for the data is totally wierd, and will make a dba have a heart attack; however, there is a method to the madness and it works. I'm trying to streamline the process of getting their data to the final step. Also, the example I provided is significantly simplified... but it's the root of the problem.
I'm trying to make it clearer.... .. sorry.
Crusty. 
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 1:17 PM
Points: 158,
Visits: 252
|
|
dwain.c (8/27/2012)
However you have not provided DDL for your tables, nor INSERTs (setups) for your sample data. Remember we're all volunteers here. 
Ok Volunteers... here's your script as promised... I know, I'm slow...
Create table Doc ( DocNum nvarchar(15) NOT NULL );
Insert into Doc (DocNum) Values ('1');
------------------------------ Create Table GR ( DocNum nvarchar(15) NOT NULL, GR nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into GR (DocNum, GR, Seq) Values (1,'A',1) Insert into GR (DocNum, GR, Seq) Values (1,'B',2)
-------------------------------- Create Table GE ( DocNum nvarchar(15) NOT NULL, GE nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into GE (DocNum, GE, Seq) Values (1,'A',1) Insert into GE (DocNum, GE, Seq) Values (1,'B',2) Insert into GE (DocNum, GE, Seq) Values (1,'C',3)
--------------------------------------------------
Create Table Ref ( DocNum nvarchar(15) NOT NULL, Ref nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into Ref (DocNum, Ref, Seq) Values (1,'A',1) Insert into Ref (DocNum, Ref, Seq) Values (1,'B',2) Insert into Ref (DocNum, Ref, Seq) Values (1,'C',3) Insert into Ref (DocNum, Ref, Seq) Values (1,'D',4) Insert into Ref (DocNum, Ref, Seq) Values (1,'E',5)
select * from Doc select * from GR Select * from GE select * from Ref
OH GREAT WIZARDS OF UBER-CODE! HEAR MY SCHREAKING AND BID ME GOOD TIDINGS!!!
|
|
|
|
|
Say Hey Kid
      
Group: General Forum Members
Last Login: Monday, May 13, 2013 4:06 AM
Points: 681,
Visits: 298
|
|
hiC rusty hop u will get it by reading below code
Create table Doc ( DocNum nvarchar(15) NOT NULL );
Insert into Doc (DocNum) Values ('1');
------------------------------ Create Table GR ( DocNum nvarchar(15) NOT NULL, GR nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into GR (DocNum, GR, Seq) Values (1,'A',1) Insert into GR (DocNum, GR, Seq) Values (1,'B',2)
-------------------------------- Create Table GE ( DocNum nvarchar(15) NOT NULL, GE nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into GE (DocNum, GE, Seq) Values (1,'A',1) Insert into GE (DocNum, GE, Seq) Values (1,'B',2) Insert into GE (DocNum, GE, Seq) Values (1,'C',3)
--------------------------------------------------
Create Table Ref ( DocNum nvarchar(15) NOT NULL, Ref nvarchar(15) NOT NULL, Seq int NOT NULL )
Insert into Ref (DocNum, Ref, Seq) Values (1,'A',1) Insert into Ref (DocNum, Ref, Seq) Values (1,'B',2) Insert into Ref (DocNum, Ref, Seq) Values (1,'C',3) Insert into Ref (DocNum, Ref, Seq) Values (1,'D',4) Insert into Ref (DocNum, Ref, Seq) Values (1,'E',5)
-------
select * from Ref left join GE on ref.DocNum=GE.DocNum
See the output of bold part of query as Ref have 5 values "1" in DocNum
and GE have 3 values "1" in DocNum
so output will have 15 rows as join behave row by row.
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Today @ 10:29 AM
Points: 1,501,
Visits: 18,198
|
|
WITH AllSeq AS ( SELECT Seq FROM GR UNION SELECT Seq FROM GE UNION SELECT Seq FROM Ref)
SELECT d.DocNum, gr.GR, gr.Seq AS GR_Seq, ge.GE, ge.Seq AS GE_Seq, rf.Ref, rf.Seq AS Ref_Seq FROM Doc d CROSS JOIN AllSeq sq LEFT OUTER JOIN GR gr ON gr.DocNum = d.DocNum AND gr.Seq = sq.Seq LEFT OUTER JOIN GE ge ON ge.DocNum = d.DocNum AND ge.Seq = sq.Seq LEFT OUTER JOIN Ref rf ON rf.DocNum = d.DocNum AND rf.Seq = sq.Seq ORDER BY sq.Seq;
____________________________________________________
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537
Never approach a goat from the front, a horse from the rear, or a fool from any direction.
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: 2 days ago @ 8:56 AM
Points: 274,
Visits: 787
|
|
select Doc.DocNum, GR.GR, GR_Seq = GR.Seq, GE.GE, GE_Seq = GE.Seq, Ref.Ref, Ref_Seq = Ref.Seq from Doc cross join ( values (1),(2),(3),(4),(5),(6),(7),(8),(9) ) N (n) left outer join GR on GR.DocNum = Doc.DocNum and N.n = GR.Seq left outer join GE on GE.DocNum = DOC.DocNum and N.n = GE.Seq left outer join Ref on Ref.DocNum = DOC.DocNum and N.n = Ref.Seq where Ref.Ref is not null order by Doc.DocNum, N.n
|
|
|
|
|
SSCoach
         
Group: General Forum Members
Last Login: 2 days ago @ 1:55 PM
Points: 15,442,
Visits: 9,571
|
|
What you need to do is Full Outer Joins instead of Left Outer Joins. Join on Doc ID and Row Number.
SELECT COALESCE(dbo.GR.DocNum, dbo.GE.DocNum, dbo.Ref.DocNum) AS DocNum, GR, dbo.GR.Seq AS GR_Seq, GE, dbo.GE.Seq AS GE_Seq, Ref, dbo.Ref.Seq AS Ref_Seq FROM dbo.GR FULL OUTER JOIN dbo.GE ON dbo.GR.DocNum = dbo.GE.DocNum AND dbo.GR.Seq = dbo.GE.Seq FULL OUTER JOIN dbo.Ref ON dbo.GE.DocNum = dbo.Ref.DocNum AND dbo.GE.Seq = dbo.Ref.Seq OR dbo.GR.DocNum = dbo.Ref.DocNum AND dbo.GR.Seq = dbo.Ref.Seq;
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
|
|
|
|