Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

ROW_NUMBER(): An Efficient Alternative to Subqueries Expand / Collapse
Author
Message
Posted Tuesday, May 12, 2009 12:16 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Tuesday, July 15, 2014 9:51 AM
Points: 229, Visits: 166
There are not any issues to my knowledge that require the NOLOCK hint for ROW_NUMBER(). I just used it for performance reasons.
Post #715367
Posted Tuesday, May 12, 2009 12:19 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Tuesday, July 15, 2014 9:51 AM
Points: 229, Visits: 166
JJ B (5/12/2009)
Well done article.

Question: I hate to open a can of worms on a different topic, but I'm so curious as to why you specified the nolock hint for the row_number version of the query. Do you know of specific problems that require use of nolock for this type of implementation?

Thanks.


There aren't any problems with the ROW_NUMBER() version of the query that require the NOLOCK hint. I just used it for the slight performance improvement.
Post #715371
Posted Tuesday, May 12, 2009 12:20 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, August 27, 2014 4:53 PM
Points: 266, Visits: 2,589
Francis Rodrigues (5/12/2009)
There are not any issues to my knowledge that require the NOLOCK hint for ROW_NUMBER(). I just used it for performance reasons.

Thanks for the clarification. I guess that's what I was trying to get at. Do you mean that ROW_NUMBER() has such bad performance that you need to resort to using NOLOCK? Or is using NOLOCK just a common practice for your queries to try to preempt performance problems?

Thanks.
Post #715373
Posted Tuesday, May 12, 2009 12:31 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, January 24, 2013 12:02 PM
Points: 42, Visits: 152
In what way was simple set logic unable to perform this operation? It looks like this could be solved with left outer joins and is null tests. I don't have the test db mentioned, so my > may be wrong, it could need to be a <, I can never remember without testing it. But something like this should do the trick

SELECT p.ProductID, p.Version, p.MinorVersion, p.ReleaseVersion, p.StandardCost
from Production.ProductVersion p
left join Production.ProductVersion exclude1
on p.ProductID = exclude1.ProductID
and p.Version > exclude1.Version
left join Production.ProductVersion exclude2
on p.ProductID = exclude2.ProductID
and p.Version = exclude2.Version
and p.MinorVersion > exclude2.MinorVersion
left join Production.ProductVersion exclude3
on p.ProductID = exclude3.ProductID
and p.Version = exclude3.Version
and p.MinorVersion = exclude3.MinorVersion
and p.ReleaseVersion > exclude3.ReleaseVersion
left join Production.ProductVersion exclude4
on p.ProductID = exclude4.ProductID
and p.Version = exclude4.Version
and p.MinorVersion = exclude4.MinorVersion
and p.ReleaseVersion = exclude4.ReleaseVersion
and p.StandardCost > exclude4.StandardCost
where exclude1.Version is null
and exclude2.MinorVersion is null
and exclude3.ReleaseVersion is null
and exclude4.StandardCost is null

Post #715377
Posted Tuesday, May 12, 2009 2:35 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, February 17, 2011 10:13 AM
Points: 3, Visits: 83
wbrianwhite (5/12/2009)
In what way was simple set logic unable to perform this operation? It looks like this could be solved with left outer joins and is null tests. I don't have the test db mentioned, so my > may be wrong, it could need to be a <, I can never remember without testing it. But something like this should do the trick


I think it's just a matter of ...elegance? To me using Row_Number() statements (and CTEs in general) are easier to write and easier to follow in specific situations (provided the one reading knows what they do). And I can tell you just from testing this morning that the Row_Number() query has a performance improvement over the left joins and isnull tests I had been using previously.
Post #715467
Posted Tuesday, May 12, 2009 2:38 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, August 25, 2014 10:11 PM
Points: 56, Visits: 1,043
I think this was a good demonstration of where these 2005+ features can be useful. I have a job coming up where this could really help.

IMO, none of the few alternative queries posted so far are as clear in their intent, which I think is almost as important as performance, and an acceptable trade-off where performance isn't dramatically affected.

Slightly off-topic, but I couldn't help being reminded of someone's suggestion I saw recently of a 'FIRST' aggregate (MS Access has one), which of course would make the query really simple, and probably perform really well. I suspect there must be someone who'd done one using CLR:
SELECT ProductID,
FIRST(Version) AS Version,
FIRST(MinorVersion) AS MinorVersion,
FIRST(ReleaseVersion) AS ReleaseVersion,
FIRST(StandardCost) AS StandardCost
FROM Production.ProductVersion
ORDER BY ProductID, Version DESC, MinorVersion DESC, ReleaseVersion DESC
GROUP BY ProductID
Post #715471
Posted Tuesday, May 12, 2009 2:52 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, May 9, 2013 2:19 PM
Points: 1, Visits: 23
Thanks for a good article, I definitely learned what Row_Number() is - definitely had no idea before.
Post #715482
Posted Tuesday, May 12, 2009 3:45 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, August 27, 2014 9:30 PM
Points: 41, Visits: 436
I have seen similar examples of ROW_NUMBER() usage and the following has always puzzled me:

Why do we include columns in the ORDER BY when they are already in the PARTITION BY ?

i.e. I would expect

SELECT ROW_NUMBER() OVER(PARTITION BY ProductID,
Version ORDER BY ProductID, Version, MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion

to be equivalent to

SELECT ROW_NUMBER() OVER(PARTITION BY ProductID,
Version ORDER BY MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion

I know I must be missing obvious with this but I'm hanged if I can figure out what it is...
Post #715512
Posted Tuesday, May 12, 2009 3:56 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Thursday, June 5, 2014 10:54 AM
Points: 9,902, Visits: 9,480
UncleJimBob (5/12/2009)
I have seen similar examples of ROW_NUMBER() usage and the following has always puzzled me:

Why do we include columns in the ORDER BY when they are already in the PARTITION BY ?

i.e. I would expect

SELECT ROW_NUMBER() OVER(PARTITION BY ProductID,
Version ORDER BY ProductID, Version, MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion

to be equivalent to

SELECT ROW_NUMBER() OVER(PARTITION BY ProductID,
Version ORDER BY MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion

I know I must be missing obvious with this but I'm hanged if I can figure out what it is...

AFAIK, you are correct. There's no reason to include a column in the ORDER BY if it's already in the PARTITION BY, it has no effect.


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #715518
Posted Tuesday, May 12, 2009 4:14 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, January 24, 2013 12:02 PM
Points: 42, Visits: 152
William Hutton (5/12/2009)

I think it's just a matter of ...elegance? To me using Row_Number() statements (and CTEs in general) are easier to write and easier to follow in specific situations (provided the one reading knows what they do). And I can tell you just from testing this morning that the Row_Number() query has a performance improvement over the left joins and isnull tests I had been using previously.


Do they? How big is it? Is that before or after you make a clustered index on the four significant columns? In figure 4 it looks like a well-indexed table is faster when queried by the sub-query approach. And since this table is going to be used in lots and lots of joins besides the one in the article, it needs to be well-indexed anyway.

The comparison of query plans in the article is skewed because the 14 nested subqueries can be replaced by a single select with 4 left outer joins, and it is that query plan comparison I'd be interested in seeing.

Also, isn't that subquery getting the max release version and standard cost as well, while the Row_Number() version is not?
Post #715528
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse