Select most recent record in JOIN

Question

Select most recent record in JOIN

Chi Chi Cabron

Ten Centuries

Points: 1087
More actions
May 20, 2011 at 10:47 am

#250507

I'm sure this is a classic newbie question, but I haven't been able to find the answer myself.... I have a Order Status table with the following schema:
StatusIDOrderIDStatusDate OrderStatus
214500532011-03-10 Pending
794500532011-05-02Shipped
894500762011-05-09 Pending
240500882011-03-14 Pending
241500882011-03-14 Cancelled
267500882011-03-15 Reopen
392500882011-03-25 Shipped
My requirement is that I need a report that shows all orders and the most recent OrderStatus and StatusDate
The most efficient way I have been able to accomplish this is via a correlated subquery in the WHERE clause:
SELECT O.OrderID, O.OrderDate, S.StatusDate, S.OrderStatus
FROM Orders AS O JOIN OrderStatus AS S ON O.OrderID=S.OrderID
WHERE S.StatusID = (SELECT TOP 1 StatusID FROM OrderStatus WHERE OrderID = O.OrderID ORDER BY StatusDate DESC)
Which returns
OrderIDOrderDateStatusDate OrderStatus
500532011-05-012011-05-02Shipped
500762011-05-072011-05-09 Pending
500882011-03-132011-03-25 Shipped
But something tells me there is a better, less messy way. If there is, can anyone point this newbie in the right direction?

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Welsh Corgi SSC Guru Points: 116520 More actions · Answer 1

Chi Chi Cabron (5/20/2011)
But something tells me there is a better, less messy way. If there is, can anyone point this newbie in the right direction?

Yes you can ask them to read the following article:

http://www.sqlservercentral.com/articles/Best+Practices/61537

🙂

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Chi Chi Cabron Ten Centuries Points: 1087 More actions · Answer 2

Chi Chi Cabron

Ten Centuries

Points: 1087

May 20, 2011 at 12:13 pm

#1327884

Obviously I am beyond help.

Mike01 SSChampion Points: 11420 More actions · Answer 3

is it possible that an order can have more than one record with the same StatusDate? Based on the limited data and the fact that we don't know what the other table even looks like, I took a stab at it.

Create table #orders (StatusID smallint primary key, OrderID int, StatusDate smalldatetime, OrderStatus varchar(20))

CREATE NONCLUSTERED INDEX [Orders_OX] ON #orders

(

[OrderID] ASC,

[StatusDate] ASC

)

insert into #orders values

(214,'50053','2011-03-10','Pending'),

(794,'50053','2011-05-02','Shipped'),

(894,'50076','2011-05-09','Pending'),

(240,'50088','2011-03-14','Pending'),

(241,'50088','2011-03-14','Cancelled'),

(267,'50088','2011-03-15','Reopen'),

(392,'50088','2011-03-25','Shipped')

select o.*

from #orders o

inner join (

select OrderID,

MAX(StatusDate) StatusDate

from #orders

group by OrderID ) v

on o.orderId = v.OrderID

and o.StatusDate = v.StatusDate

When I ran your code, I got this for statistics

(3 row(s) affected)

Table 'Worktable'. Scan count 11, logical reads 31, physical reads 0,

read-ahead reads 0, lob logical reads 0, lob physical reads 0,

lob read-ahead reads 0.

Table '#orders__________________________________________________

___________________________________________________________

00000000000F'. Scan count 4, logical reads 8, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Here's the stats when I ran mine

3 row(s) affected)

Table '#orders____________________________________________________

_________________________________________________________

00000000000F'. Scan count 1, logical reads 16, physical reads 0, read-ahead reads 0,

lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Lastly, on a million rows

Yours -

(327682 row(s) affected)

Table 'Orders'. Scan count 917513, logical reads 2928793, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 6 Seconds to Retrieve

Mine -

(327682 row(s) affected)

Table 'Orders'. Scan count 18, logical reads 6209, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 4 Seconds

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

GSquared SSC Guru Points: 260824 More actions · Answer 4

Chi Chi Cabron (5/20/2011)
Obviously, I am beyond help.

Nah.

The way you did it there is pretty much what you had to do in SQL 2000. In 2005 and beyond, you can often get this more easily/efficiently with the use of CTEs and ranking functions.

USE ProofOfConcept ;

GO

IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

DROP TABLE #T ;

CREATE TABLE #T

(StatusID INT IDENTITY

PRIMARY KEY,

OrderID INT,

StatusDate DATE,

OrderStatus VARCHAR(25)) ;

INSERT INTO #T

(OrderID, StatusDate, OrderStatus)

VALUES (1, '20110101', 'Start'),

(1, '20110102', 'Done'),

(2, '20110101', 'Start') ;

;

WITH CTE

AS (SELECT *,

ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY StatusDate DESC) AS Row

FROM #T)

SELECT *

FROM CTE

WHERE Row = 1 ;

What I've done here is assign a row number based on the OrderID and the StatusDate. It resets for each OrderID (partition by), and starts at the last date value (desc).

Then, once I've got that, in the Common Table Expression (CTE), I just query where the row = 1, which will be the last date for each order.

Make sense?

For future posts, putting something together like I just did, with a temp table and some sample data, can help you get faster help, because then nobody has to figure that part out for you while trying to help out. That's all WC was asking for.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Mike01 SSChampion Points: 11420 More actions · Answer 5

Nice, I wasn't even thinking about CTEs. I was conerned with showing stats without thinking of trying to keep it simple

For better, quicker answers, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Evil Kraig F SSC Guru Points: 100851 More actions · Answer 6

I have no idea why I have to wrap the following in quotes to make it behave... but this is my message.

The other option is to use CROSS APPLY (SELECT TOP 1 DESC) instead of the MAX method.
MAX is my old friend, but depending on indexing and data heuristics CROSS APPLY can outperform MAX.
I'm still nailing down the details on that, but it's worth checking both when you're dealing with an audit table like this. In general both will outperform the reversed row_number if your indexes are sound, otherwise row_number can outperform.

- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA

Chi Chi Cabron Ten Centuries Points: 1087 More actions · Answer 7

Fancy! I will look into CTEs. Thanks so much, that is going to be incredibly useful!