Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

where fldValue=123 vs join to table with one row Expand / Collapse
Author
Message
Posted Wednesday, January 23, 2013 6:50 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
I have a huge DB and a view that pulls records from multiple tables, some as big as 2M records. I have great indexes. I can query: select from myView where fldValue = 123.

It returns 3K records in .3 seconds. Happy. Now I need to return data for multiple fldValue values.

Select from myView where fldValue IN (123,456,789)

Works great. However this stinks:

Select from myView V inner join myTemp T ON V.fldValue = T.fldValue

So for a test, I reduced the temp table to one record. Still falls on its face. I have a few whacky solutions. My question is, why would this happen and is there a non whacky way to make the join to the temp table perform correctly?

Thanks
ST

Post #1410539
Posted Wednesday, January 23, 2013 6:56 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, July 25, 2014 12:09 PM
Points: 429, Visits: 334
souLTower (1/23/2013)
I have a huge DB and a view that pulls records from multiple tables, some as big as 2M records. I have great indexes. I can query: select from myView where fldValue = 123.

It returns 3K records in .3 seconds. Happy. Now I need to return data for multiple fldValue values.

Select from myView where fldValue IN (123,456,789)

Works great. However this stinks:

Select from myView V inner join myTemp T ON V.fldValue = T.fldValue

So for a test, I reduced the temp table to one record. Still falls on its face. I have a few whacky solutions. My question is, why would this happen and is there a non whacky way to make the join to the temp table perform correctly?

Thanks
ST



What's the DDL for that temp table? Do you have the same PK and indexing on the temp table as you do on the tables underlying the view? Or are you leaving the temp table in a heap?


Roland Alexander
The Developing World


There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer
Post #1410545
Posted Wednesday, January 23, 2013 7:34 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 5:56 AM
Points: 1,839, Visits: 3,421
Can you post the actual execution plan for both queries here?

A wild guess is that parallell execution with hash match or sort is involved, and that the distribution of processed rows between each thread is so skewed that some threads run out of memory to store the hash buckets and therefore spilling them to tempdb.

Look for parallellism operators in the actual exec plan, and see if number of rows processed by each thread is very skewed. Also run a profiler trace on your spid with Sort Warnings and Hash Warnings included when executing the query.
Post #1410577
Posted Wednesday, January 23, 2013 8:59 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
The execution plans are pretty big. I'm not sure how to post them.
I get completely different execution plans between fldValue=123 and fldValue IN (select value from tempTable). I know that the optimizer is a mystery but to me, I'm asking for the same thing. It boils down to:

A) where fldValue=CONSTANT
B) where fldValue IN (CONSTANT, CONSTANT2, etc)
C) where fldValue IN (Select from table with 1 row)

Execution plan and performace for A and B is the same. Very fast.
C is terrible

Is there a hint or something I can try? I've tried adding an index to the temp table, using a table variable, no joy. I'm left with:

Create the IN () list as a SQL variable and run the query dynamically.
Query the temp table, get the value I want row by row, query the main query for each value and put the results in a temp table. Then query the temp table.

All of those ideas seem stupid to me.

Thanks for any additional thoughts.
ST

Post #1410650
Posted Wednesday, January 23, 2013 9:01 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, July 25, 2014 12:09 PM
Points: 429, Visits: 334
souLTower (1/23/2013)
The execution plans are pretty big. I'm not sure how to post them.
I get completely different execution plans between fldValue=123 and fldValue IN (select value from tempTable). I know that the optimizer is a mystery but to me, I'm asking for the same thing. It boils down to:

A) where fldValue=CONSTANT
B) where fldValue IN (CONSTANT, CONSTANT2, etc)
C) where fldValue IN (Select from table with 1 row)

Execution plan and performace for A and B is the same. Very fast.
C is terrible

Is there a hint or something I can try? I've tried adding an index to the temp table, using a table variable, no joy. I'm left with:

Create the IN () list as a SQL variable and run the query dynamically.
Query the temp table, get the value I want row by row, query the main query for each value and put the results in a temp table. Then query the temp table.

All of those ideas seem stupid to me.

Thanks for any additional thoughts.
ST



Still like to know how that temp table is organized...


Roland Alexander
The Developing World


There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer
Post #1410653
Posted Wednesday, January 23, 2013 9:05 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
Temp table is a heap with 1 row, 2 columns of int, int.

Looking for how to post an execution plan.
Post #1410659
Posted Wednesday, January 23, 2013 9:09 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
Uploaded the two execution plans.

Thanks!
ST


  Post Attachments 
FastPlan.sqlplan (13 views, 547.09 KB)
slowPlan.sqlplan (8 views, 547.09 KB)
Post #1410662
Posted Wednesday, January 23, 2013 9:22 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
Well, I can see the difference. I'm interested in knowing if there's a way to improve it.

When I use a constant in my query, the data is filtered at the first query in the execution plan. This greatly reduces the records upstream. When I use IN (temp), the records are filtered about halfway through the execution.
Post #1410672
Posted Thursday, January 24, 2013 9:26 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 5:16 PM
Points: 1,970, Visits: 2,911
You need to get rid of the massive GROUP BY; you should be able to do the SUM() in a subquery rather than the main query, but I can't be sure which columns come from which views/tables so I can't rewrite any part of the query now.

Until then, at least GROUP BY the ORDER BY columns first, to reduce the sorting required to satisfy the query, viz:

	GROUP BY fldConsulNo, fldCCGroupCd, fldCCSortOrder, 
RC.fldCCCd, fldSubCCNo,
fldCCGroupDesc,
fldConsulName,
tblConsul.fldConsulKey,
fldCCDesc, fldSubCCName, v.fldCostCenterKey
ORDER BY fldConsulNo, fldCCGroupCd, fldCCSortOrder, fldCCCd , fldSubCCNo



SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1411209
Posted Thursday, January 24, 2013 11:24 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 10:38 AM
Points: 103, Visits: 448
That helps but still doesn't explain why the
where IN (sub select)

has such a different execution plan than

WHERE IN (constants)

What I've done to improve the performance (and I'm not really proud of) is to loop over the list of values, query the main view into a temp table using WHERE fldValue = syntax, then query the temp table for the final output. I tried changing indexes, join hints, lots of things but this worked best and was extensible whether querying against 1 value or 100.
Post #1411275
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse