|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Wednesday, March 27, 2013 11:11 AM
Points: 228,
Visits: 147
|
|
| There are not any issues to my knowledge that require the NOLOCK hint for ROW_NUMBER(). I just used it for performance reasons.
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Wednesday, March 27, 2013 11:11 AM
Points: 228,
Visits: 147
|
|
JJ B (5/12/2009) Well done article.
Question: I hate to open a can of worms on a different topic, but I'm so curious as to why you specified the nolock hint for the row_number version of the query. Do you know of specific problems that require use of nolock for this type of implementation?
Thanks.
There aren't any problems with the ROW_NUMBER() version of the query that require the NOLOCK hint. I just used it for the slight performance improvement.
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: 2 days ago @ 3:08 PM
Points: 255,
Visits: 2,407
|
|
Francis Rodrigues (5/12/2009) There are not any issues to my knowledge that require the NOLOCK hint for ROW_NUMBER(). I just used it for performance reasons. Thanks for the clarification. I guess that's what I was trying to get at. Do you mean that ROW_NUMBER() has such bad performance that you need to resort to using NOLOCK? Or is using NOLOCK just a common practice for your queries to try to preempt performance problems?
Thanks.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Thursday, January 24, 2013 12:02 PM
Points: 42,
Visits: 152
|
|
In what way was simple set logic unable to perform this operation? It looks like this could be solved with left outer joins and is null tests. I don't have the test db mentioned, so my > may be wrong, it could need to be a <, I can never remember without testing it. But something like this should do the trick
SELECT p.ProductID, p.Version, p.MinorVersion, p.ReleaseVersion, p.StandardCost from Production.ProductVersion p left join Production.ProductVersion exclude1 on p.ProductID = exclude1.ProductID and p.Version > exclude1.Version left join Production.ProductVersion exclude2 on p.ProductID = exclude2.ProductID and p.Version = exclude2.Version and p.MinorVersion > exclude2.MinorVersion left join Production.ProductVersion exclude3 on p.ProductID = exclude3.ProductID and p.Version = exclude3.Version and p.MinorVersion = exclude3.MinorVersion and p.ReleaseVersion > exclude3.ReleaseVersion left join Production.ProductVersion exclude4 on p.ProductID = exclude4.ProductID and p.Version = exclude4.Version and p.MinorVersion = exclude4.MinorVersion and p.ReleaseVersion = exclude4.ReleaseVersion and p.StandardCost > exclude4.StandardCost where exclude1.Version is null and exclude2.MinorVersion is null and exclude3.ReleaseVersion is null and exclude4.StandardCost is null
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Thursday, February 17, 2011 10:13 AM
Points: 3,
Visits: 83
|
|
wbrianwhite (5/12/2009) In what way was simple set logic unable to perform this operation? It looks like this could be solved with left outer joins and is null tests. I don't have the test db mentioned, so my > may be wrong, it could need to be a <, I can never remember without testing it. But something like this should do the trick
I think it's just a matter of ...elegance? To me using Row_Number() statements (and CTEs in general) are easier to write and easier to follow in specific situations (provided the one reading knows what they do). And I can tell you just from testing this morning that the Row_Number() query has a performance improvement over the left joins and isnull tests I had been using previously.
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: 2 days ago @ 5:04 PM
Points: 54,
Visits: 915
|
|
I think this was a good demonstration of where these 2005+ features can be useful. I have a job coming up where this could really help.
IMO, none of the few alternative queries posted so far are as clear in their intent, which I think is almost as important as performance, and an acceptable trade-off where performance isn't dramatically affected.
Slightly off-topic, but I couldn't help being reminded of someone's suggestion I saw recently of a 'FIRST' aggregate (MS Access has one), which of course would make the query really simple, and probably perform really well. I suspect there must be someone who'd done one using CLR: SELECT ProductID, FIRST(Version) AS Version, FIRST(MinorVersion) AS MinorVersion, FIRST(ReleaseVersion) AS ReleaseVersion, FIRST(StandardCost) AS StandardCost FROM Production.ProductVersion ORDER BY ProductID, Version DESC, MinorVersion DESC, ReleaseVersion DESC GROUP BY ProductID
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Thursday, May 09, 2013 2:19 PM
Points: 1,
Visits: 23
|
|
| Thanks for a good article, I definitely learned what Row_Number() is - definitely had no idea before.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 7:10 PM
Points: 41,
Visits: 387
|
|
I have seen similar examples of ROW_NUMBER() usage and the following has always puzzled me:
Why do we include columns in the ORDER BY when they are already in the PARTITION BY ?
i.e. I would expect
SELECT ROW_NUMBER() OVER(PARTITION BY ProductID, Version ORDER BY ProductID, Version, MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion
to be equivalent to
SELECT ROW_NUMBER() OVER(PARTITION BY ProductID, Version ORDER BY MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion
I know I must be missing obvious with this but I'm hanged if I can figure out what it is...
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: Saturday, May 04, 2013 11:13 AM
Points: 9,855,
Visits: 9,374
|
|
UncleJimBob (5/12/2009)
I have seen similar examples of ROW_NUMBER() usage and the following has always puzzled me: Why do we include columns in the ORDER BY when they are already in the PARTITION BY ? i.e. I would expect SELECT ROW_NUMBER() OVER(PARTITION BY ProductID, Version ORDER BY ProductID, Version, MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion to be equivalent to SELECT ROW_NUMBER() OVER(PARTITION BY ProductID, Version ORDER BY MinorVersion DESC, ReleaseVersion DESC) AS MaxVersion I know I must be missing obvious with this but I'm hanged if I can figure out what it is...  AFAIK, you are correct. There's no reason to include a column in the ORDER BY if it's already in the PARTITION BY, it has no effect.
-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung Proactive Performance Solutions, Inc. "Performance is our middle name."
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Thursday, January 24, 2013 12:02 PM
Points: 42,
Visits: 152
|
|
William Hutton (5/12/2009)
I think it's just a matter of ...elegance? To me using Row_Number() statements (and CTEs in general) are easier to write and easier to follow in specific situations (provided the one reading knows what they do). And I can tell you just from testing this morning that the Row_Number() query has a performance improvement over the left joins and isnull tests I had been using previously.
Do they? How big is it? Is that before or after you make a clustered index on the four significant columns? In figure 4 it looks like a well-indexed table is faster when queried by the sub-query approach. And since this table is going to be used in lots and lots of joins besides the one in the article, it needs to be well-indexed anyway.
The comparison of query plans in the article is skewed because the 14 nested subqueries can be replaced by a single select with 4 left outer joins, and it is that query plan comparison I'd be interested in seeing.
Also, isn't that subquery getting the max release version and standard cost as well, while the Row_Number() version is not?
|
|
|
|