Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Jason Carter

Jason Carter has spent most of his career as a .NET developer, with time spent as a development manager, accidental DBA, and most recently a full-time DBA. Having worked with large databases as a developer, he found great interest in tuning, tweaking, and making databases run faster. With the support of his wife, he gave up his managerial duties, jumped the development ship and dove head first into his new career as a Database Administrator.

5 Things Series – APPLY Operator

In this series I’m going to detail the 5 T-SQL Commands that review in my  5 T-SQL Commands I’ve been Missing, but were there the whole time presentation.  The gist of this presentation details my past as an uninformed developer, as it pertains to T-SQL and its vast features.   This presentation, and accompanying blog series, will highlight some of the commands I’ve found useful.

I will be using the same DBA.StackOverflow Database that I used in the first post of the series.

Today we’ll be discussing the APPLY operator.  This operator has been available to us since SQL 2005.  The always helpful MSDN documentation tells us about the APPLY operator.

The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query.

For a basic example we simply use cross apply to return a table-value function

select 
	U.Id
	, U.DisplayName
	, Location
	, Reputation
	, R.Position
from DBA.Users U
cross apply DBA.getUserRanking(U.Id) R
where AboutMe like '%MVP%'

Not very exciting. Most DBA’s would’ve seen this methodology against the myriad of dynamic management functions that are available to us. What I find useful in the CROSS APPLY is the ability to clean up your code and make it more readable. For example, take this query here. We have query that is pulling anyone that has the words ‘MVP’ in their About Me section of their profile. We want to see who they are, where they’re from, how active they have been, and how many post they mark as FAVORITE (VoteTypeId=5)

select 
	Id
	, DisplayName
	, Location
	, Reputation
	, (select count(2) from DBA.Votes where VoteTypeId=5 and UserId=U.Id)
from DBA.Users U
where AboutMe like '%MVP%'

This isn’t too crazy, but what if we want to also return the most recent post they marked as favorite to see what has piqued their interest. Let’s add a second correlated subquery.

select 
	Id
	, DisplayName
	, Location
	, Reputation
	, FavoritePost = (select count(2) from DBA.Votes where VoteTypeId=5 and UserId=U.Id) 
	, LastPost = (select max(PostId) from DBA.Votes where VoteTypeId=5 and UserId=U.Id) 
from DBA.Users U
where AboutMe like '%MVP%'
order by Reputation desc

Now we’re getting a little crazy  in the readability department, so lets see what a CROSS APPLY can do for us. We can take both of those aggregates and put them in a single CROSS APPLY.

select 
	Id
	, DisplayName
	, Location
	, Reputation
	, Favs.FavoritePost 
	, Favs.LastPost
from DBA.Users U
CROSS APPLY (select count(2) as FavoritePost, max(PostId) as LastPost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Favs
where AboutMe like '%MVP%'
order by Reputation desc

Now its fairly clear what we are doing here, but…. there are caveats. When we do this method our performance. According to our IO Stats we are reading a royal boatload of data for this simple query.
CrossApply Single Apply

We can fix that, let’s try to keep our readability, while getting our performance back, let’s split those queries into individual CROSS APPLYs.  As you can see, you can have multiple CROSS APPLYs in one query.

select 
	Id
	, DisplayName
	, Location
	, Reputation
	, Favs.FavoritePost 
	, Late.LastPost
from DBA.Users U
CROSS APPLY (select count(2) as FavoritePost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Favs
CROSS APPLY (select max(Postid) as LastPost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Late
where AboutMe like '%MVP%'
order by Reputation desc

There we go, our performance is back where it should be.
CrossApply Two Applies

One final trick I like to use CROSS APPLY is to reduce errors in complex formulas that might be used in multiple places in a query. In this query we can see we calculate a ‘Agreeability’ by putting DownVotes over UpVotes and getting a percentage, and then we order by that. I have seen at times where a formula might be the basis for a second formula, and a change in the first, might be missed in a change in the second.

select 
	Id
	, DisplayName
	, Location
	, UpVotes
	, DownVotes 
	, [Agreeabilitiy]  = convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100)
from DBA.Users U
where AboutMe like '%MVP%'
order by  convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100) desc

You can see here where we simply pull our formula into a dedicated CROSS APPLY and put our formula there, now you only have to make a change in one place, safety AND readability.

select 
	Id
	, DisplayName
	, Location
	, UpVotes
	, DownVotes 
	, CA.Agreeability
from DBA.Users U
CROSS APPLY( select convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100) as Agreeability) CA
where AboutMe like '%MVP%'
order by  CA.Agreeability desc

I can think back to several places where the CROSS APPLY operator would have made my life so much easier, not only in reading and understanding what was going on in a query, but also in reducing errors when modifying a query. As with any new feature you implement, ensure you test before and after to ensure that you are maintaining or improving the performance of your queries, as new and shiny doesn’t help if it only slows you down.

Comments

Leave a comment on the original post [jason-carter.net, opens in a new window]

Loading comments...