July 8, 2008 at 10:52 am
I am finding that running a Select * is faster than selecting only a few individual columns. This seems counter intuitive to everything that I know about SQL Server. Fewer columns returned should be faster or at least the same speed. My query is pretty large and running on some tables that have over a million rows. The speed difference is very large! Select * returns over 165 columns and runs in about 6 seconds. Select only about 15 columns runs in over 20 minutes. I am comparing the execution plans now, but this does not make sense. Any help would be appreciated. Thanks, Brian
July 8, 2008 at 10:59 am
Is it consistent? Did the data have to be fetched from disk on one? Possible blocking?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
July 8, 2008 at 11:36 am
That doesn't make sense. The data returned should be less if you select individual columns.
How are you selecting? Are you sure that your SELECT * from MyTable doesn't have some limiting factor like a rowcount or TOP set?
July 8, 2008 at 11:37 am
GilaMonster (7/8/2008)
Is it consistent? Did the data have to be fetched from disk on one? Possible blocking?
It seems to be fairly consistent, usually select * is faster, but I usually only compare if a query is running slow. The data is all on the same raid array. There do not appear to be any blocking issues.
Also, the query uses about 6 recursive joins to the same table.
July 8, 2008 at 11:44 am
I know it does not make sense. I am not using any limiting factor. Here is the query:
-- Select individual columns
SELECT LName, FName, Address1, Address2, City, [State], Zip, Phone, SSN, 0 [Basic], EmpBen_EL.Amount3 Optional, EmpBen_VAD.Amount3 EmployeeVADD,
EmpBen_VAD.Amount3 FamilyVADD,
EmpBen_SPL.Amount3 Spouse, EmpBen_CHL.Amount3 Child, DOB
-- Select all columns
--SELECT *
FROM PersonalData INNER JOIN Census
ON PersonalData.TrackingID = Census.TrackingID
INNER JOIN ClientPlan
ON Census.PlanID = ClientPlan.PlanID
-- select a record if it has one of the following benefits
INNER JOIN
(SELECT EmpBenefits.TrackingID
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE CoverageTier <> 'N'
AND PYBegin = '7/1/2008'
AND BenefitType IN ('EL', 'BAD', 'VAD', 'SPL', 'CHL')
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL
GROUP BY EmpBenefits.TrackingID) EmpBen
ON Census.TrackingID = EmpBen.TrackingID
-- get individual benefit data if available
LEFT OUTER JOIN
(SELECT EmpBenefits.TrackingID, EmpBenefits.Amount3, EmpBenefits.EndDate
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE BenefitType = 'EL'
AND PYBegin = '7/1/2008'
AND CoverageTier <> 'N'
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL) EmpBen_EL
ON Census.TrackingID = EmpBen_EL.TrackingID
-- get individual benefit data if available
LEFT OUTER JOIN
(SELECT EmpBenefits.TrackingID, EmpBenefits.Amount3, EmpBenefits.EndDate
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE BenefitType = 'BAD'
AND PYBegin = '7/1/2008'
AND CoverageTier <> 'N'
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL) EmpBen_BAD
ON Census.TrackingID = EmpBen_BAD.TrackingID
-- get individual benefit data if available
LEFT OUTER JOIN
(SELECT EmpBenefits.TrackingID, EmpBenefits.Amount3, EmpBenefits.EndDate, EmpBenefits.CoverageTier
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE BenefitType = 'VAD'
AND PYBegin = '7/1/2008'
AND CoverageTier <> 'N'
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL) EmpBen_VAD
ON Census.TrackingID = EmpBen_VAD.TrackingID
-- get individual benefit data if available
LEFT OUTER JOIN
(SELECT EmpBenefits.TrackingID, EmpBenefits.Amount3, EmpBenefits.EndDate
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE BenefitType = 'SPL'
AND PYBegin = '7/1/2008'
AND CoverageTier <> 'N'
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL) EmpBen_SPL
ON Census.TrackingID = EmpBen_SPL.TrackingID
-- get individual benefit data if available
LEFT OUTER JOIN
(SELECT EmpBenefits.TrackingID, EmpBenefits.Amount3, EmpBenefits.EndDate
FROM EmpBenefits INNER JOIN
(SELECT MAX(EmpBenefitID) EmpBenefitID
FROM EmpBenefits
WHERE BenefitType = 'CHL'
AND PYBegin = '7/1/2008'
AND CoverageTier <> 'N'
GROUP BY TrackingID, PYBegin, BenefitType) A
ON EmpBenefits.EmpBenefitID = A.EmpBenefitID
WHERE EndDate IS NOT NULL) EmpBen_CHL
ON Census.TrackingID = EmpBen_CHL.TrackingID
WHERE ClientPlan.ClientID = 'CSMC'
July 8, 2008 at 12:04 pm
I'd need to see the execution plans to be sure, but I'll be you're getting index lookups for the column list, but not for the SELECT *. Can you post the execution plans?
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 8, 2008 at 12:36 pm
Here are the plans. remove the .txt
Thnanks, Brian
July 8, 2008 at 12:53 pm
The specific columns plan has an index scan of PersonalData, while the * plan has a RID for that table based on an index seek.
I'd see about doing a covering index on the relevant columns for that table, and see what that does to it.
create index IDX_PeronalData_Cover1 on dbo.PersonalData (TrackingID) include (LName, FName, Address1, Address2, City, State, Zip, Phone, SSN)
Try that (name as appropriate for your convention), see what it does to the second one.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 8, 2008 at 1:13 pm
How come you don't need to add the scalar operator, '0 [Basic]', to the * version of the query? If you did add that, would it change the execution time?
Jeff will beat me for saying this, but according to the plan that is an added cost that is not present in the SELECT * plan. I'd just be curious to see if it makes a difference.
GSquared also caught the scan. That leads into a Merge join on the slow plan where is a neat little nested loop on the faster plan. So one row joined to 185 in the fast plan or 12000 filtered down to 185 in the slow plan. Follow his advice there.
Oh, and my guess was utterly wrong.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
July 8, 2008 at 1:13 pm
That did not work, same plan. The EmpBenefits table is the big one, almost 1.3 million rows. PersonalData & Census are smaller, ~125000 rows and ~50000 rows.
July 8, 2008 at 1:18 pm
Adding the '0 Basic' did not slow down the fast one.
July 8, 2008 at 1:23 pm
The plan changed slightly, for the PersonalData Index Scan it is now using the new index instead of a dta_ index. It is still not doing a RID lookup.
July 8, 2008 at 2:01 pm
Brian Tax (7/8/2008)
The plan changed slightly, for the PersonalData Index Scan it is now using the new index instead of a dta_ index. It is still not doing a RID lookup.
It won't need a RID lookup for a covering index.
Please post the new execution plans, with that covering index added.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
July 8, 2008 at 2:11 pm
Here is the new plan.
July 8, 2008 at 4:17 pm
I have solved this, but I am an not sure why. If I add the column EmpBen.TrackingID it runs fast and the plan is differnent, it uses a bunch more parallelism. I have attached the two plans (remove .txt). Why are the plans so different and the run time so much faster?
Thanks, Brian
Viewing 15 posts - 1 through 15 (of 22 total)
You must be logged in to reply to this topic. Login to reply