Optimiser estimating wrong number of actual rows.

  • Hello,

I have a table-valued function which works out the entities that a particular user does not have access to. It is joined to a view that produces search results for the interface.

If the search results (before filtering restricted entities) return anywhere between 7K and 20K rows, the left outer join to the TVF causes a significant performance drop. When I checked the query plans, the function's branch showed an actual row count of 1,115,167,077 when in reality the function returns 97K rows.

If I run the same query to return search results outside the 7K-20K range, the left outer join contributes the correct number of actual rows.

    This is my SQL query.

    The view returns 11K rows and then the result is filtered based on the rows in the TVF.

    (Execution time 1 min 50 sec)

select distinct
        MATTER_DIWOR       as 'Matter Diwor',
        MATTERCODE         as 'Matter Code',
        Matter_Description as 'Matter Description',
        Matter_Status      as 'Status',
        client_name        as 'Client Name',
        Other_Reference    as 'Other Ref'
    from VW_RET_MAT vm
    left outer join [dbo].lsfun_getaccessibleentities('M', 195) ls
        on vm.MATTER_DIWOR = ls.ENTITY_DIWOR
    where MATTERCODE like '%m%'
      and ls.ENTITY_DIWOR is null
    order by 2

When I change the search criteria so the result set falls outside the range above, the view returns 33K rows and the results are again filtered by the left join.

    (Execution time 8 sec)

select distinct
        MATTER_DIWOR       as 'Matter Diwor',
        MATTERCODE         as 'Matter Code',
        Matter_Description as 'Matter Description',
        Matter_Status      as 'Status',
        client_name        as 'Client Name',
        Other_Reference    as 'Other Ref'
    from VW_RET_MAT vm
    left outer join [dbo].lsfun_getaccessibleentities('M', 195) ls
        on vm.MATTER_DIWOR = ls.ENTITY_DIWOR
    where MATTERCODE like '%a%'
      and ls.ENTITY_DIWOR is null
    order by 2

Can somebody shed some light on why this is happening?

    Any help will be appreciated:-)

  • Are your statistics up to date? You can check using sys.stats and STATS_DATE:

SELECT
        T.[name] AS table_name,
        I.[object_id],
        I.[name] AS index_name,
        I.index_id,
        I.type_desc,
        I.fill_factor,
        S.[name] AS stats_name,
        S.stats_id,
        S.auto_created,
        S.user_created,
        S.no_recompute,
        STATS_DATE(S.[object_id], S.stats_id) AS stats_date
    FROM sys.stats S
    JOIN sys.tables T
        ON S.[object_id] = T.[object_id]
    LEFT JOIN sys.indexes I
        ON S.[object_id] = I.[object_id]
        AND S.stats_id = I.index_id
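
    If the dates come back stale, here is a sketch of refreshing them. The table name `dbo.MATTERS` is a placeholder for whatever base table sits under VW_RET_MAT — it is not named anywhere in the thread, so substitute your own:

    ```sql
    -- Refresh statistics on one base table with a full scan
    -- (dbo.MATTERS is a hypothetical name; use your real table).
    UPDATE STATISTICS dbo.MATTERS WITH FULLSCAN;

    -- Or refresh every table in the database; simpler, but can be
    -- heavy on large databases.
    EXEC sp_updatestats;
    ```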

Is it an inline or multi-statement UDF?

    Multi-statement UDFs are treated like table variables by the database engine, which means it estimates one row per call, regardless of how many rows the function will actually return. That sounds like what you're running into.
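
    For anyone following along, the structural difference looks roughly like this. This is a simplified, hypothetical version of lsfun_getaccessibleentities — the real function's body isn't shown in the thread, and `dbo.ENTITY_ACCESS` is an invented source table:

    ```sql
    -- Multi-statement TVF: fills and returns a table variable, so the
    -- optimiser assumes a fixed low row count (1 row) at compile time.
    CREATE FUNCTION dbo.lsfun_getaccessibleentities_ms (@type char(1), @user int)
    RETURNS @result TABLE (ENTITY_DIWOR int PRIMARY KEY)
    AS
    BEGIN
        INSERT @result (ENTITY_DIWOR)
        SELECT ENTITY_DIWOR
        FROM dbo.ENTITY_ACCESS          -- hypothetical source table
        WHERE ENTITY_TYPE = @type
          AND USER_ID = @user;
        RETURN;
    END;
    GO

    -- Inline TVF: effectively a parameterised view, so its query is
    -- expanded into the caller and the optimiser can use real column
    -- statistics from the underlying tables.
    CREATE FUNCTION dbo.lsfun_getaccessibleentities_inline (@type char(1), @user int)
    RETURNS TABLE
    AS
    RETURN
        SELECT ENTITY_DIWOR
        FROM dbo.ENTITY_ACCESS          -- hypothetical source table
        WHERE ENTITY_TYPE = @type
          AND USER_ID = @user;
    GO
    ```

    If the function's logic fits in a single SELECT, converting it to the inline form usually removes the 1-row guess entirely.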

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Estimated rows are just that... estimated. From what I've seen, they're rarely correct and there's not much you can do about it if your stats are actually up to date.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Those are not estimated rows. When you open a query plan and hover over the arrows, it shows you the Actual and Estimated row contribution of every branch.

    The estimated rows are correctly displayed as 1, since it is a multi-statement function returning a table variable, but the actual number of rows jumps to over a billion when the TVF is joined to a result set containing 7K-20K rows. Outside that range, the actual row count correctly matches the number of rows in the TVF.

    It is really baffling, as this is a search hotspot for my system. I have tested the function when it returns 120K rows: if my search result set falls outside the above range, the query executes in 16 seconds. But if the search results from the view fall inside the range, then even if the function returns only 10 rows the query takes close to a minute.

Actually, that kind of makes sense. It might be the difference in join mechanics. Hash Match can be quite fast, and might end up being used on large numbers of rows.

    Is the UDF something that can be replaced by an inline query? Perhaps with Cross Apply? Or maybe have its results dumped into a temp table or table variable at the beginning of the proc and just run it once and then join to that?
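
    A sketch of the temp-table workaround, reusing the columns from the original queries. The name #restricted is my own (the function is named "getaccessibleentities" but the first post says it returns entities the user does *not* have access to):

    ```sql
    -- Materialise the TVF output once; the optimiser then has real
    -- statistics on #restricted instead of a 1-row guess.
    SELECT ENTITY_DIWOR
    INTO #restricted
    FROM dbo.lsfun_getaccessibleentities('M', 195);

    -- An index helps the anti-join that follows.
    CREATE UNIQUE CLUSTERED INDEX IX_restricted ON #restricted (ENTITY_DIWOR);

    SELECT DISTINCT
        MATTER_DIWOR       AS 'Matter Diwor',
        MATTERCODE         AS 'Matter Code',
        Matter_Description AS 'Matter Description',
        Matter_Status      AS 'Status',
        client_name        AS 'Client Name',
        Other_Reference    AS 'Other Ref'
    FROM VW_RET_MAT vm
    LEFT OUTER JOIN #restricted r
        ON vm.MATTER_DIWOR = r.ENTITY_DIWOR
    WHERE MATTERCODE LIKE '%m%'
      AND r.ENTITY_DIWOR IS NULL    -- keep matters not in the restricted set
    ORDER BY 2;

    DROP TABLE #restricted;
    ```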

- Gus "GSquared"

  • Rahul N (8/6/2009)


Those are not estimated rows. When you open a query plan and hover over the arrows, it shows you the Actual and Estimated row contribution of every branch.

    Heh... yep, I know that. But, it's real hard to see what you've done from here and many folks confuse the two because they don't think that estimated rows will be shown on an actual execution plan for some reason. Bad assumption on my part about what you meant. Sorry.

    --Jeff Moden



  • Rahul,

    The explanation will be an easy one - we just need you to save an actual execution plan (right-click, Save Execution Plan As...) then zip it up and attach it to a post. It needs to be zipped because this site will not accept *.sqlplan attachments directly.

    (My crystal ball predicts a loop join with a table spool.)

    Paul
