Adding columns to SELECT slows processing to a crawl

  • I have the following T-SQL code (it will eventually become a stored procedure) that I'm running in Query Analyzer.

    SET NOCOUNT ON
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    DECLARE @ArraignData TABLE (
        ArraignSessionDate datetime,
        ArraignEventID int,
        ArraignEventTypeID int,
        ArraignEventDeleted bit,
        ArraignChargeID int PRIMARY KEY CLUSTERED,
        ArraignHearingID int,
        ArraignCaseID int,
        ArraignNodeID int,
        ArraignCurrentSettingID int,
        ArraignSettingHearingID int,
        ArraignCourtSessionBlockID int,
        ArraignCalendarID int,
        ArraignDescription varchar(100))

    PRINT 'Insert into temp table'
    PRINT convert(varchar(100), getdate(), 109)

    INSERT INTO @ArraignData
    SELECT cs.SessionDate,
        e.EventID,
        e.EventTypeID,
        e.Deleted,
        xcbc.ChargeID,
        he.HearingID,
        he.CaseID,
        he.NodeID,
        he.CurrentSettingID,
        s.HearingID,
        s.CourtSessionBlockID,
        cs.CalendarID,
        cs.Description
    FROM CaseAssignHist cah
    INNER JOIN xCaseBaseChrg xcbc
        ON cah.CaseID = xcbc.CaseID
    INNER JOIN HearingEvent he
        ON xcbc.CaseID = he.CaseID
    INNER JOIN Event e
        ON he.HearingID = e.EventID
    INNER JOIN Setting s
        ON he.CurrentSettingID = s.SettingID
    INNER JOIN CtSessionBlock csb
        ON s.CourtSessionBlockID = csb.CourtSessionBlockID
    INNER JOIN CtSession cs
        ON csb.CourtSessionID = cs.CourtSessionID
    INNER JOIN (
        SELECT xcbc1.CaseID,
            MIN(e1.EventID) AS MinEventID
        FROM xCaseBaseChrg xcbc1
        INNER JOIN HearingEvent he1
            ON xcbc1.CaseID = he1.CaseID
        INNER JOIN Event e1
            ON he1.HearingID = e1.EventID
        INNER JOIN Setting s1
            ON he1.CurrentSettingID = s1.SettingID
        INNER JOIN CtSessionBlock csb1
            ON s1.CourtSessionBlockID = csb1.CourtSessionBlockID
        INNER JOIN CtSession cs1
            ON csb1.CourtSessionID = cs1.CourtSessionID
        INNER JOIN (
            SELECT xcbc2.CaseID,
                MIN(cs2.SessionDate) AS FirstArraignDate
            FROM xCaseBaseChrg xcbc2
            INNER JOIN HearingEvent he2
                ON xcbc2.CaseID = he2.CaseID
            INNER JOIN Event e2
                ON he2.HearingID = e2.EventID
            INNER JOIN Setting s2
                ON he2.CurrentSettingID = s2.SettingID
            INNER JOIN CtSessionBlock csb2
                ON s2.CourtSessionBlockID = csb2.CourtSessionBlockID
            INNER JOIN CtSession cs2
                ON csb2.CourtSessionID = cs2.CourtSessionID
            WHERE e2.EventTypeID IN (312, 324, 321, 322)
                AND e2.Deleted = 0
            GROUP BY xcbc2.CaseID) DateTable
            ON xcbc1.CaseID = DateTable.CaseID
            AND cs1.SessionDate = DateTable.FirstArraignDate
        WHERE e1.EventTypeID IN (312, 324, 321, 322)
            AND e1.Deleted = 0
        GROUP BY xcbc1.CaseID) T1
        ON cah.CaseID = T1.CaseID
        AND e.EventID = T1.MinEventID
    WHERE e.EventTypeID IN (312, 324, 321, 322)
        AND e.Deleted = 0

    PRINT 'start select temp'
    PRINT convert(varchar(100), getdate(), 109)

    SELECT * FROM @ArraignData

    PRINT 'end select temp'
    PRINT convert(varchar(100), getdate(), 109)

    PRINT 'start select perm'
    PRINT convert(varchar(100), getdate(), 109)

    SELECT cah.CaseNbr,
        cah.CaseID,
        c.ChargeID,
        c.OffenseDate,
        oh.OffenseID,
        oh.DegreeID,
        IsNull(uc1.Code, '') AS DegreeCode,
        IsNull(uc1.Description, '') AS DegreeDesc,
        oh.JurisdictionID,
        IsNull(uc2.Code, '') AS JurisdictionCode,
        IsNull(uc2.Description, '') AS JurisdictionDesc,
        oh.Statute,
        oh.GOC
        /*,
        ad.ArraignSessionDate,
        ad.ArraignDescription*/
    FROM CaseAssignHist cah
    INNER JOIN xCaseBaseChrg xcbc
        ON cah.CaseID = xcbc.CaseID
    INNER JOIN Chrg c
        ON xcbc.ChargeID = c.ChargeID
    INNER JOIN OffHist oh
        ON c.OffenseHistoryIDCur = oh.OffenseHistoryID
    LEFT OUTER JOIN uCode uc1
        ON oh.DegreeID = uc1.CodeID
    LEFT OUTER JOIN uCode uc2
        ON oh.JurisdictionID = uc2.CodeID
    LEFT OUTER JOIN @ArraignData ad
        ON xcbc.ChargeID = ad.ArraignChargeID
    ORDER BY cah.CaseNbr

    PRINT 'end select'
    PRINT convert(varchar(100), getdate(), 109)

    The times associated with the print statements are:

    Insert into temp table
    Jul 28 2006 8:41:31:383AM
    start select temp
    Jul 28 2006 8:41:46:693AM
    end select temp
    Jul 28 2006 8:41:54:093AM
    (this table has around 70,000 rows)
    start select perm
    Jul 28 2006 8:41:54:093AM
    end select
    Jul 28 2006 8:42:13:453AM
    (this table has around 311,000 rows)

    So from start to finish, this code takes less than a minute to run.

    When I uncomment the following columns in the perm table select,

    ,ad.ArraignSessionDate,
    ad.ArraignDescription

    the code takes forever to run - as in, it's been running for more than an hour now. I'd like to compare the two execution plans, but the query has to finish first.

    Since I'd like this to finish before the weekend, I decided to change the table variable to a temp table. And to my surprise, it finished in under a minute. So the solution to the long-running query problem is obvious: use a temp table rather than a table variable.
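    For reference, the change itself is mechanical - here's a sketch of the equivalent temp-table declaration (the column list simply mirrors the table variable above; the INSERT and SELECTs then reference #ArraignData instead of @ArraignData):

    CREATE TABLE #ArraignData (
        ArraignSessionDate datetime,
        ArraignEventID int,
        ArraignEventTypeID int,
        ArraignEventDeleted bit,
        ArraignChargeID int PRIMARY KEY CLUSTERED,
        ArraignHearingID int,
        ArraignCaseID int,
        ArraignNodeID int,
        ArraignCurrentSettingID int,
        ArraignSettingHearingID int,
        ArraignCourtSessionBlockID int,
        ArraignCalendarID int,
        ArraignDescription varchar(100))

    -- ...same INSERT and SELECTs as above, with @ArraignData -> #ArraignData...

    DROP TABLE #ArraignData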

    But I still have several questions. Why the huge performance difference between a temp table and a table variable? Is it ever worth using a table variable? And why does displaying those fields make a difference? Isn't all the overhead in the join, rather than the select?

    This is running SQL Server 2000, SP4.

    Thanks,

    Mattie

  • Can't answer as to the difference between the @Table var and #Temp table - the exec plan will tell you. Possibly due to different statistics causing the optimizer to select a different plan.

    >>And why does displaying those fields make a difference? Isn't all the overhead in the join, rather than the select?

    That join is based on the primary key, ArraignChargeID.

    If you exclude those 2 columns, you essentially remove the need to hit that table at all. It's being LEFT JOINed on its primary key, so the join can never add or remove rows from the result - and if you don't select any columns from the "ad" aliased table, nothing is needed from it, period.

    When you add the 2 columns back, the clustered index has to be used to find the relevant data pages in order to retrieve them - a completely different query plan.
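    (To make that concrete, here's a minimal hypothetical illustration - t1 and t2 are made-up tables, with t2.id as t2's primary key:)

    -- LEFT JOIN on t2's primary key: at most one match per t1 row,
    -- so the join can never change the row count of the result.
    SELECT t1.a                -- no t2 columns referenced:
    FROM t1                    -- the optimizer can skip reading t2 entirely
    LEFT OUTER JOIN t2 ON t1.t2id = t2.id

    SELECT t1.a, t2.b          -- t2.b is now needed:
    FROM t1                    -- t2 must actually be read
    LEFT OUTER JOIN t2 ON t1.t2id = t2.id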

    so many questions - so many answers!!!

    Table variables are good for small data sets but rubbish for large ones. If the populated set is of any size, always use a temp table - they are specifically optimised for the job.

    You can view the estimated query plan to get an idea of what may happen - check out the options in QA.
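    (For example, besides the graphical estimated plan, QA can print the plan as text without executing the query - SHOWPLAN_TEXT must be the only statement in its batch, hence the GOs; the SELECT is just a placeholder for the query you want analysed:)

    SET SHOWPLAN_TEXT ON
    GO
    -- paste the query to analyse here, e.g.:
    SELECT CaseID FROM CaseAssignHist
    GO
    SET SHOWPLAN_TEXT OFF
    GO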

    Adding a couple of extra columns may cause the selection of a different index - sorry, too much hassle to go through your whole query.

    I'd say you may have too many joins in there for safety, but I guess if it runs quickly....

    Check the query plan for scans first, and also for repetitive scans: one scan on a table might not be too bad, but if the same table is scanned a few thousand times, that would not be good.

    You really need to dump the table variable and analyse the query plans to see what changes - all we can do is make guesses, which might well be wrong.

     

     

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Wow, that's interesting about the JOIN. 

    I can't tell you about the execution plan yet.  Two and a half hours later, it's still running.

    Mattie

  • Colin,

    Thanks for the response. I don't blame you for not looking at the query; I only posted it because when I don't, I get a response telling me to post my code.

    Finally, the query has stopped: two hours and forty minutes later. I wish I knew how to post the execution plan here, because I'm sure I'm going to miss something important in describing the differences.

    When the query uses a temp table, here's what the execution plan shows:

    Table               | Action               | Executes | Cost | Feeds                       | Cost
    --------------------+----------------------+----------+------+-----------------------------+-----
    xCaseBaseChrg (IJ)  | Index Scan           |        1 |   0% | Hash Match/Inner Join       |
    CaseAssignHist (IJ) | Clustered Index Scan |        1 |   4% | Hash Match/Inner Join       |   2%
    Chrg (IJ)           | Clustered Index Scan |        1 |   2% | Hash Match/Inner Join       |   3%
    OffHist (IJ)        | Clustered Index Scan |        1 |  18% | Hash Match/Inner Join       |   6%
    uCode1 (LOJ)        | Clustered Index Scan |        1 |   0% | Hash Match/Right Outer Join |   1%
    uCode2 (LOJ)        | Clustered Index Scan |        1 |   0% | Hash Match/Right Outer Join |   1%
    #ArraignData (LOJ)  | Clustered Index Scan |        1 |   0% | Hash Match/Right Outer Join |   2%
                        | Sort                 |        1 |  60% |                             |
                        | Compute Scalar       |        1 |   0% |                             |
                        | Select               |        1 |   0% |                             |

    When the query uses a table variable, here's what the execution plan shows:

    Table               | Action               | Executes | Cost | Feeds                                         | Cost
    --------------------+----------------------+----------+------+-----------------------------------------------+-----
    xCaseBaseChrg (IJ)  | Index Scan           |        1 |   0% | Hash Match/Inner Join                         |
    CaseAssignHist (IJ) | Clustered Index Scan |        1 |   6% | Hash Match/Inner Join                         |   3%
    Chrg (IJ)           | Clustered Index Scan |        1 |   2% | Hash Match/Inner Join                         |   5%
    OffHist (IJ)        | Clustered Index Scan |        1 |  25% | Hash Match/Inner Join                         |   8%
                        | Sort                 |        1 |  16% |                                               |
    uCode1 (LOJ)        | Index Seek           |  311,179 |  15% | Nested Loop/Left Outer Join + Bookmark Lookup |   1%
    uCode2 (LOJ)        | Index Seek           |  311,179 |  15% | Nested Loop/Left Outer Join + Bookmark Lookup |   1%
    @ArraignData (LOJ)  | Clustered Index Scan |        1 |   0% | Table Spool/Lazy Spool + Nested Loop/Left Outer Join | 1%
                        | Compute Scalar       |        1 |   0% |                                               |
                        | Select               |        1 |   0% |                                               |

    I'm assuming the slowdown is the result of the way the left outer joins are handled. And I find the description of the Table Spool/Lazy Spool especially scary: "Stores the data from the input into a temporary table in order to optimize rewinds." Its row count is 405,714,905.
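    One experiment that might confirm this (untested here, and a blunt instrument - OPTION (HASH JOIN) constrains every join in the statement, so it can easily hurt other parts of the plan): append the hint to the perm select and see whether the lazy spool disappears. A sketch, reusing the query from above:

    SELECT cah.CaseNbr, cah.CaseID, c.ChargeID, c.OffenseDate,
        oh.OffenseID, oh.DegreeID,
        IsNull(uc1.Code, '') AS DegreeCode,
        IsNull(uc1.Description, '') AS DegreeDesc,
        oh.JurisdictionID,
        IsNull(uc2.Code, '') AS JurisdictionCode,
        IsNull(uc2.Description, '') AS JurisdictionDesc,
        oh.Statute, oh.GOC,
        ad.ArraignSessionDate, ad.ArraignDescription
    FROM CaseAssignHist cah
    INNER JOIN xCaseBaseChrg xcbc ON cah.CaseID = xcbc.CaseID
    INNER JOIN Chrg c ON xcbc.ChargeID = c.ChargeID
    INNER JOIN OffHist oh ON c.OffenseHistoryIDCur = oh.OffenseHistoryID
    LEFT OUTER JOIN uCode uc1 ON oh.DegreeID = uc1.CodeID
    LEFT OUTER JOIN uCode uc2 ON oh.JurisdictionID = uc2.CodeID
    LEFT OUTER JOIN @ArraignData ad ON xcbc.ChargeID = ad.ArraignChargeID
    ORDER BY cah.CaseNbr
    OPTION (HASH JOIN)   -- force hash joins so @ArraignData is read once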

    I wouldn't have been surprised if the INSERT into @ArraignData had taken forever, given the complex (to me, anyway) joining that goes on in there. But in a straightforward SELECT, who knew?

    Mattie

    as I say, only small data sets should go into a table variable. Your clustered index scans are in fact table scans, which tells me your indexing isn't optimal - however, you can't always escape table scans, and for small tables a scan is often more efficient than a seek due to the locks etc. etc.
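    (A quick way to see what the scanned tables actually have to work with - sp_helpindex is a standard system proc; OffHist is just picked as an example from the plans above:)

    EXEC sp_helpindex 'OffHist'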

    Yup, lazy spools can be very bad news - you've probably answered your own question: use a temp table and not a table variable!!

    Don't know what else I can suggest - query tuning is always difficult when you don't have the whole thing in front of you. What are the total I/O differences between the two methods? You should use Profiler to report your I/O, not QA.
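    (Profiler gives the cleaner totals; for a rough per-table comparison from within the query window, STATISTICS IO also reports logical and physical reads per statement - a sketch:)

    SET STATISTICS IO ON
    -- run the table-variable version, note the reads per table,
    -- then run the temp-table version and compare
    SET STATISTICS IO OFF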

    The GrumpyOldDBA
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/
