Blog Post

SubQuery Performance

,

Why would you do this?

select distinct(hostname),
(
select count(*) as criticalCnt
from @temp where severity_guid='0168A833-1732-411E-8205-C2F6CD91737D'
and hostname=t.hostname
group by hostname),
(
select count(*) as criticalCnt
from @temp where severity_guid='CB2F2B90-2DA4-4075-BCAA-DD5D2CEFBFD5'
and hostname=t.hostname
group by hostname),
(
select count(*) as criticalCnt
from @temp where severity_guid='C4CF8A23-A106-4617-BAB0-94DA3CA74EF1'
and hostname=t.hostname
group by hostname)
from @temp t

I ran into this on a post where someone had asked about how to basically call a CASE statement. The posted didn’t know how and someone posted this as a way to tally the various counts of alerts.

I had glossed over it when I saw it, but when someone else replied with this statement, saying performance was better, I decided to look at see how much better.

select hostname,
sum(case when severity_guid='0168A833-1732-411E-8205-C2F6CD91737D'
then 1 else 0 end) as [Count_Of_0168A833-1732-411E-8205-C2F6CD91737D]
,sum(case when severity_guid='CB2F2B90-2DA4-4075-BCAA-DD5D2CEFBFD5'
then 1 else 0 end) as [Count_Of_CB2F2B90-2DA4-4075-BCAA-DD5D2CEFBFD5]
,sum(case when severity_guid='C4CF8A23-A106-4617-BAB0-94DA3CA74EF1'
then 1 else 0 end) as [Count_Of_C4CF8A23-A106-4617-BAB0-94DA3CA74EF1]
from @temp
group by hostname

I set the statistics on for these queries and found these results.

Query 1:

Table '#45544755'. Scan count 52, logical reads 52, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Query 2:

Table '#473C8FC7'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

That’s a huge difference. If this were to be run with any significant frequency on a system, you’d be using (4 x the number of rows) as many scans of the data than you needed to. Even though they are logical scans, it’s still CPU and memory movement you are requiring, and if this were a significant amount of data.

You can also see a drastic difference in the execution plans:

ScanPlan1

ScanPlan2

I’ll let you figure out which plan goes to which query.

This is a simple example, but it shows where someone really can get poor performance over time with badly written SQL. It’s nice to have various ways to solve problems, but you also want to choose the appropriate tool. Subqueries make sense at times, but this isn’t one of them.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating