Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

How do I join tables, but select top 1 from 1-many tbl? Expand / Collapse
Author
Message
Posted Sunday, October 14, 2012 2:53 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 1:49 AM
Points: 1,074, Visits: 6,360
Don. (10/13/2012)
Hi Guys,

Unfortunately neither of those scripts worked as there are multiple identical accountno and lastdate fields. Ideally Id need to filter on the History tables Recid field.

I've dummied up some data for you to look at

 CREATE TABLE #CONTACT 
(
ACCOUNTNO VARCHAR(5),
COMPANY VARCHAR(50),
CONTACT VARCHAR(50),
RECID VARCHAR(5)
)

CREATE TABLE #HISTORY
(
ACCOUNTNO VARCHAR(5),
LASTUSER VARCHAR(8),
LASTDATE DATE,
RECID VARCHAR(5)
)

INSERT INTO #CONTACT (ACCOUNTNO, COMPANY, CONTACT, RECID)
SELECT 'C1', 'ACME', 'WILEY COYOTE', 'CR1' UNION ALL
SELECT 'C2', 'Beta Ltd', 'Joe Soap', 'CR2'

INSERT INTO #HISTORY (ACCOUNTNO, LASTUSER, LASTDATE, RECID)
SELECT 'C1', 'JOE', '20120101', 'HR1' UNION ALL
SELECT 'C1', 'BOB', '20120101', 'HR2' UNION ALL
SELECT 'C1', 'JOE', '20120201', 'HR3' UNION ALL
SELECT 'C2', 'JOE', '20120202', 'HR4' UNION ALL
SELECT 'C2', 'BOB', '20120202', 'HR5'

How would I get a History select script to return HR1 and HR4?

Thanks


Don. (10/12/2012)
I'd like to get the join the tables, but only show the most recent record in History.


Why HR1? HR3 is on the most recent row. Assuming you mean HR3, then this works against the sample data set;

SELECT c.*, h.*  
FROM #Contact c
INNER JOIN (
SELECT h1.Accountno, h1.LastDate, RECID = MIN(h1.RECID)
FROM #History h1
INNER JOIN (
SELECT Accountno, MAX(LastDate) LastDate
FROM #History
GROUP BY Accountno
) l
ON l.Accountno = h1.Accountno
AND l.LastDate = h1.LastDate
GROUP BY h1.Accountno, h1.LastDate
) h
ON h.Accountno = c.Accountno




Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1372490
Posted Sunday, October 14, 2012 5:14 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:30 AM
Points: 46, Visits: 229
L' Eomot Inversé (10/13/2012)Do you expect to get that? There's nothing in what you've told us about what you are doing that tells us that getting HR3 and HR5 wouldn't be OK (or for that matter HR3 and HR4 or HR2 and HR5 or ...)
You seem to have some rule that isn't at all clear from the data and what you've said - or I may be wrong, and you just haven't explained all the rules to us in language that I can understand.


Apologies, I was a bit hasty there.
In my example it should have returned HR3 and HR4 OR HR5.
Post #1372495
Posted Sunday, October 14, 2012 5:22 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:30 AM
Points: 46, Visits: 229
ChrisM@home (10/14/2012)


Don. (10/12/2012)
I'd like to get the join the tables, but only show the most recent record in History.


Why HR1? HR3 is on the most recent row. Assuming you mean HR3, then this works against the sample data set;

SELECT c.*, h.*  
FROM #Contact c
INNER JOIN (
SELECT h1.Accountno, h1.LastDate, RECID = MIN(h1.RECID)
FROM #History h1
INNER JOIN (
SELECT Accountno, MAX(LastDate) LastDate
FROM #History
GROUP BY Accountno
) l
ON l.Accountno = h1.Accountno
AND l.LastDate = h1.LastDate
GROUP BY h1.Accountno, h1.LastDate
) h
ON h.Accountno = c.Accountno





Chris, thats great.
From the testing I've done so far, that looks perfect.

Thank you very much. :D
Post #1372496
Posted Monday, October 15, 2012 10:26 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 9:00 AM
Points: 1,595, Visits: 4,585
Don. (10/13/2012)
Hi Guys,

Unfortunately neither of those scripts worked as there are multiple identical accountno and lastdate fields. Ideally Id need to filter on the History tables Recid field.
...
...
In my example it should have returned HR3 and HR4 OR HR5.

Thanks

If running on SQL Server 2005+, the ROW_NUMBER() or RANK() function would be perfect for this.

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
row_number() over (partition by H.ACCOUNTNO order by H.RECID desc) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5

You can simulate the functional equivalent in SQL Server 2000 less gracefully like so:

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
(select count(*) from #HISTORY HX
where HX.ACCOUNTNO = H.ACCOUNTNO and HX.RECID >= H.RECID) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5

Either way, you would probably want a compound index on the HISTORY table's ACCOUNTNO and RECID columns. This would perhaps be the best suited primary key on the HISTORY table as well.
Post #1372812
Posted Tuesday, October 16, 2012 4:36 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:30 AM
Points: 46, Visits: 229
Eric M Russell (10/15/2012)
Don. (10/13/2012)
Hi Guys,

Unfortunately neither of those scripts worked as there are multiple identical accountno and lastdate fields. Ideally Id need to filter on the History tables Recid field.
...
...
In my example it should have returned HR3 and HR4 OR HR5.

Thanks

If running on SQL Server 2005+, the ROW_NUMBER() or RANK() function would be perfect for this.

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
row_number() over (partition by H.ACCOUNTNO order by H.RECID desc) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5

You can simulate the functional equivalent in SQL Server 2000 less gracefully like so:

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
(select count(*) from #HISTORY HX
where HX.ACCOUNTNO = H.ACCOUNTNO and HX.RECID >= H.RECID) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5

Either way, you would probably want a compound index on the HISTORY table's ACCOUNTNO and RECID columns. This would perhaps be the best suited primary key on the HISTORY table as well.



Thanks Eric, thats really informative.

I'll have a play with those scripts later.
Post #1373120
Posted Wednesday, October 17, 2012 2:37 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 1:49 AM
Points: 1,074, Visits: 6,360
Eric M Russell (10/15/2012)
You can simulate the functional equivalent in SQL Server 2000 less gracefully like so:

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
(select count(*) from #HISTORY HX
where HX.ACCOUNTNO = H.ACCOUNTNO and HX.RECID >= H.RECID) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5


If Don has time, it would be interesting to know how this triangular join method compares to the aggregate methods. With small partitions it could be quite performant; with large partitions it won't. TJ's scale poorly.



Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1373678
Posted Wednesday, October 17, 2012 7:40 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, July 23, 2014 9:00 AM
Points: 1,595, Visits: 4,585
ChrisM@home (10/17/2012)
Eric M Russell (10/15/2012)
You can simulate the functional equivalent in SQL Server 2000 less gracefully like so:

select C_RECID, H_RECID from
(
select C.RECID C_RECID, H.RECID H_RECID,
(select count(*) from #HISTORY HX
where HX.ACCOUNTNO = H.ACCOUNTNO and HX.RECID >= H.RECID) history_rank
from #CONTACT C
join #HISTORY H on H.ACCOUNTNO = C.ACCOUNTNO
) x
where history_rank = 1;

C_RECID H_RECID
------- -------
CR1 HR3
CR2 HR5


If Don has time, it would be interesting to know how this triangular join method compares to the aggregate methods. With small partitions it could be quite performant; with large partitions it won't. TJ's scale poorly.

I agree the 2000 method would be potentially problematic in terms of optimization when compared to the the 2005+ windowing function method, although both of them could take considerable resources when dealing with million+ row tables. It's essential that HISTORY table be indexed in a way that supports it, perhaps even a covered indexed just to support this particular report, if it's called multiple times daily.
Post #1373792
Posted Wednesday, October 17, 2012 8:54 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, April 24, 2014 9:12 AM
Points: 285, Visits: 504
If the records in history have the same date without a timestamp there needs to be a unique way to identify the latest record. If RecID is sequential then you don't even need the date.
From your dummy data and what you're looking for (H1, H4) it looks as if the RecID is sequential but in decending order which I'm assuming it's not.
Here's a modified script of the one above using the RecID.


CREATE TABLE #CONTACT 
(
ACCOUNTNO VARCHAR(5),
COMPANY VARCHAR(50),
CONTACT VARCHAR(50),
RECID VARCHAR(5)
)

CREATE TABLE #HISTORY
(
ACCOUNTNO VARCHAR(5),
LASTUSER VARCHAR(8),
LASTDATE DATE,
RECID VARCHAR(5)
)

INSERT INTO #CONTACT (ACCOUNTNO, COMPANY, CONTACT, RECID)
SELECT 'C1', 'ACME', 'WILEY COYOTE', 'CR1' UNION ALL
SELECT 'C2', 'Beta Ltd', 'Joe Soap', 'CR2'

INSERT INTO #HISTORY (ACCOUNTNO, LASTUSER, LASTDATE, RECID)
SELECT 'C1', 'JOE', '20120101', 'HR1' UNION ALL
SELECT 'C1', 'BOB', '20120101', 'HR2' UNION ALL
SELECT 'C1', 'JOE', '20120201', 'HR3' UNION ALL
SELECT 'C2', 'JOE', '20120202', 'HR4' UNION ALL
SELECT 'C2', 'BOB', '20120202', 'HR5'


--Assuming that the RecID is sequential and ascending
select C.*, H.*
from #CONTACT C
left join (
select Hi.*
from #HISTORY Hi
inner join (
select max(recid) Min_RecID, ACCOUNTNO
from #HISTORY
group by ACCOUNTNO
) S
on Hi.ACCOUNTNO = S.ACCOUNTNO
and Hi.RECID = S.Min_RecID
) H
on C.ACCOUNTNO = H.ACCOUNTNO

--This works to get H1 and H4 but it implies that the recid is in descending order...
select C.*, H.*
from #CONTACT C
left join (
select Hi.*
from #HISTORY Hi
inner join (
select min(recid) Min_RecID, ACCOUNTNO
from #HISTORY
group by ACCOUNTNO
) S
on Hi.ACCOUNTNO = S.ACCOUNTNO
and Hi.RECID = S.Min_RecID
) H
on C.ACCOUNTNO = H.ACCOUNTNO

drop table #Contact
drop table #History



---------------------------------------------------------------
Mike Hahn - Future MCM 2025
Right way to ask for help!!
http://www.sqlservercentral.com/articles/Best+Practices/61537/
I post so I can see my avatar
I want a personal webpage
I want to win the lotto
I want a gf like Tiffa
Post #1373868
Posted Wednesday, October 17, 2012 10:03 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:30 AM
Points: 46, Visits: 229
Hi Guys,

I'll try and make some time during the week to have a play with this and report back. Otherwise I'll spend some time over the weekend.

Thanks again, for all your help.
Post #1373922
Posted Sunday, October 28, 2012 3:46 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, July 24, 2014 6:30 AM
Points: 46, Visits: 229
ChrisM@home (10/17/2012)
Eric M Russell (10/15/2012)



If Don has time, it would be interesting to know how this triangular join method compares to the aggregate methods. With small partitions it could be quite performant; with large partitions it won't. TJ's scale poorly.


Hi guys,

I've converted the 3 scripts so they now run and return the same data.

The script results were:
Script 1: 05 seconds returning 5343 records (ChrisMs script)
Script 2: 02 seconds returning 5343 records (Erics 2005 script)
Script 3: 1.36 returning 5343 records (Erics 2000 script)

The first script initially took 32 seconds to return 5550, but I found that Id messed up the joins.

Thanks for the lesson. ;o)
Post #1378009
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse