Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 1234»»»

Performance Issue - understanding query plans Expand / Collapse
Author
Message
Posted Tuesday, February 05, 2013 3:51 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162, Visits: 437
Hi,

We have a query with a bad performance issue. (28 hours to run).

I have been looking at the estimated query plan and noticed something strange.

Conceptually we are joining a .5 million row table (SE) which contains a non-overlapping from/to field to a 9 million row table (SFI) like this:

UPDATE leica.States_for_Interval
SET
Smooth_Time_Key = COALESCE(SE.SmoothTimeKey,SFI.Time_Key),
Smooth_FailureEvent = CASE WHEN SE.FirstFailure = 'Y' AND SFI.RowNum = SE.RowNum_Start THEN 'Y' ELSE 'N' END
FROM
leica.States_for_Interval SFI
INNER JOIN
tfm.stg_Time_SmoothEvents SE
ON SFI.RowNum BETWEEN SE.RowNum_Start AND SE.RowNum_End
AND SFI.SRC_System = SE.SRC_System
AND SE.SRC_Application = 'LEICA'


I've attached the query plan, and table creation DDL

I know that there is a 1:n relationship as the ranges do not overlap. So every row in the large table should match only one row in the range table.

Therefore the output of this query should be the number of rows in T (9 million)

HOWEVER when I look at the query plan I see that

The 'Estimated Number of rows' on the smaller table is the FULL row count
The estimated rows and the larger table is the the full row count
The estimated output from the Nested Loop join looks like the cross product of this.


From looking at other query plans with tables with 1:n relationships, I see this:
The 'Estimated Number of rows' on the smaller table is ONE (not all the rows)
The estimated rows on the larger table is the full row count
The estimated output from the Nested Loop join is the the full row count



In other words it looks to me like SQL Server thinks there is a cross join going on. But I know from the data that there isn't (I am yet to actually test this but I'll get back to you)


So I have two questions:

1. Am I correct in the way the nested loops operator should work under 1:n circumstances - the smaller input should have a estimated row count of ONE
2. HOW can I give SQL Server the clue that the ranges are mutually exclusive? There are no constraints that I can put on the table to indicate this.


  Post Attachments 
ExecutionPlan.bmp (9 views, 362.28 KB)
ExecutionPlan.DDL.TXT (10 views, 2.26 KB)
IndexUsageStats.txt (4 views, 2.99 KB)
1.Original.sqlplan (3 views, 33.05 KB)
2.Force1n.sqlplan (1 view, 28.71 KB)
4.WithUniqueIndex.sqlplan (3 views, 31.31 KB)
Post #1415696
Posted Tuesday, February 05, 2013 4:02 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 4:18 PM
Points: 38,062, Visits: 30,359
Can you post the execution plan please?


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1415705
Posted Tuesday, February 05, 2013 4:31 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162, Visits: 437
OK I've attached a text query plan.

I'm not at work right now so I have limited access but I very much appreciate you having a look for me.

I've also edited the post with the sanitised query rather than an example
Post #1415728
Posted Tuesday, February 05, 2013 4:52 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 4:18 PM
Points: 38,062, Visits: 30,359
Don't suppose you could post the graphical plan? There's not enough info in that text plan.


Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1415739
Posted Tuesday, February 05, 2013 5:20 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162, Visits: 437
Sure, I'll do it first thing tomorrow at work, thats in about 11 hours, thanks for your interest!
Post #1415760
Posted Tuesday, February 05, 2013 10:56 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 10:11 AM
Points: 1,333, Visits: 1,803
nick.mcdermaid (2/5/2013)
2. HOW can I give SQL Server the clue that the ranges are mutually exclusive? There are no constraints that I can put on the table to indicate this.



You should try uniquely clustering the SE table by (RowNum_Start, RowNum_End).

Just to review, what are the current indexes and their usages on the SE table?

Also, review the indexes and usages on SFI, particularly the clustering index (as always!): but it will almost certainly be much more involved to determine the correct clustered index on SFI.


Edit: Changed the understated "Have you tried uniquely clustering" to "You should try ...".


SQL DBA,SQL Server MVP('07, '08, '09)
One man with courage makes a majority. Andrew Jackson
Post #1416002
Posted Tuesday, February 05, 2013 4:39 PM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162, Visits: 437
OK, I have reposted an unsanitised query, the sqlplan file, the DDL, and a picture of part of the query plan.

I can try clustering SE on RowNum_Start and RowNum_End but this still won't tell it those ranges are mutually exclusive.

Post #1416161
Posted Tuesday, February 05, 2013 4:47 PM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 10:11 AM
Points: 1,333, Visits: 1,803
nick.mcdermaid (2/5/2013)
OK, I have reposted an unsanitised query, the sqlplan file, the DDL, and a picture of part of the query plan.

I can try clustering SE on RowNum_Start and RowNum_End but this still won't tell it those ranges are mutually exclusive.




If they are uniquely clustered it will ... right!?

Just noticed the "SE.SRC_System" and "SE.SRC_Application = 'LEICA'", so those will need to be in the clus index first (perhaps, if the system and application values are not all the same).

But pls first post the indexing stats on both tables (from sys.dm_db_index_usage_stats).


SQL DBA,SQL Server MVP('07, '08, '09)
One man with courage makes a majority. Andrew Jackson
Post #1416163
Posted Tuesday, February 05, 2013 5:07 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 5:35 PM
Points: 5,722, Visits: 6,194
nick.mcdermaid (2/5/2013)
OK, I have reposted an unsanitised query, the sqlplan file, the DDL, and a picture of part of the query plan.


The .sqlplan you posted is the estimated plan, not the actual. Can the query not complete?

The key on stg_Time_SmoothEvents is fine, but that should probably be a non-clustered PK. At the least you want an index built to (and in this order):
SRC_Application, SRC_System, RowNum. If it's non-clustered use an include and bring over SmoothTimeKey

The reason Rownum is last is because it's a range join (non-equijoin) and the index will stop using columns deeper in the index from there. That should clean up the major scan.

From there, if we can see the actual, that'll help tremendously.

EDIT: Additionally, remove the implicit conversion. SRC_Application is NVARCHAR. Use AND SRC_Application = N'LEICA' instead of SRC_Application = 'LEICA'. The N'' will indicate varchar, removing one step of the process.



- Craig Farrell

Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. |Tally Tables

Twitter: @AnyWayDBA
Post #1416169
Posted Tuesday, February 05, 2013 6:37 PM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162, Visits: 437
Thanks for your interest and suggestions everyone.

I have uploaded usage stats.

The query has not completed for a couple of days, and we have rebooted the server since then so I think that Actual plan is gone until we can let it complete. Do you expect any variance between the actual and estimated plan if I generate the estimated plan while the actual query is running (i.e. stats and record counts are identical). I do appreciate that need to remove the uncertainty and analyse the actual plan.

I can see that there is a lot of scope for altering indexes. Unfortunately I think I need to let it complete and get at least one actual plan before we can try anything.

With regards to the data, in this particular case, SRC_Application, SRC_System currently have only one unique value anyway.


My real concern at this stage is why does the loop join operator say that the estimated number of rows (which I assume is the number of rows it processes or outputs) going to produce a cross join of records from the inputs when I know the data is n:1


On a side note, about the exclusive ranges thing: What I'm getting at is that if FROM/TO ranges are overlapping and you join a transactional table into it.... you'll get repeated transactional records. i.e. if you have these two records:

Record FROM TO
A 1 10
A 1 20


Yes the combination of FROM/TO is unique, but if you join into this table using between you get an overlapping range from 1-10

So a unique constraint on both columns does not guarantee that there is not an overlapping range. Anyway that's besides the point. Perhaps I will start a new thread on that topic.
Post #1416196
« Prev Topic | Next Topic »

Add to briefcase 1234»»»

Permissions Expand / Collapse