Performance Difference Between TOP 1 and TOP 2, FTS+Relational Mixed Query

Question

Post reply

Performance Difference Between TOP 1 and TOP 2, FTS+Relational Mixed Query

Viewing 8 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply

b.lam SSC Veteran Points: 284 More actions · Answer 1

@Lynn Pettis

Thanks for taking a look.

I actually haven't tried the query you suggested so I ran it against some live data just now.

The query adds an additional nested loop to the left of the original inner join for the TVF.

Everything else to the right of the execution plan seems to be the same.

Performance is a little worse than the original query.

As for your other question, yes, I have tried running just the part of the query with the CONTAINs predicate.

Performance is pretty good (less than 1 sec to return all rows). Of course, the execution plan for this is also different.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Have you tried putting the results of the query using just the contains predicate into a temporary table (not a table variable), and then query the results? You may need to index the temporary table as well.

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 3

b.lam (6/20/2011)
@opc.three:
Thanks for the suggestion.
The CTE will actually have to be flipped to include the CONTAINs predicate.
The reason being that the CONTAINs predicate cannot be applied to a table that is not full-text indexed (e.g. the CTE).
...
The above query is similar to one of the things I tried before.
The resulting execution plan for this will be the same as the original query, where the TVF estimated vs actual still having too large a gap on TOP 1 but not on TOP n.

I violated my rule about developing against thin air and delivering untested code. I figured they might show the same plan when written as a CTE but haven't worked much with FTS so wasn't sure if that may generate a different plan

Lynn Pettis (6/20/2011)
Have you tried putting the results of the query using just the contains predicate into a temporary table (not a table variable), and then query the results? You may need to index the temporary table as well.

You stole the words right out of my brain 😀 The comment about the main FTS only taking a few seconds when run outside the rest of the query took me right to using temp tables as well.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

joanmask31 Newbie Points: 7 More actions · Answer 4

joanmask31

Newbie

Points: 7

June 20, 2011 at 11:07 pm

#1341603

haha, thask.

b.lam SSC Veteran Points: 284 More actions · Answer 5

Lynn Pettis (6/20/2011)
Have you tried putting the results of the query using just the contains predicate into a temporary table (not a table variable), and then query the results? You may need to index the temporary table as well.

Thanks again for the suggestion, Lynn (and opc.three). Very much appreciated.

I have tried testing with a table variable before but not a temp table.

A table variable would actually do a similar execution plan with the TVF call ending up as the outer table of the nested loop (and thus having an execution count of 1). Performance on a single run seems to be the same for temp tables, although I would prefer to use table variables in this case due to their limited scope.

Going back to my original question though, what do you think is causing the correlated parameter to not be checked for the original query when TOP 1 is applied? If the correlated parameter check is not relevant, what else may be causing the slow down?

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 6

b.lam (6/21/2011)
Lynn Pettis (6/20/2011)
Have you tried putting the results of the query using just the contains predicate into a temporary table (not a table variable), and then query the results? You may need to index the temporary table as well.
Thanks again for the suggestion, Lynn (and opc.three). Very much appreciated.
I have tried testing with a table variable before but not a temp table.
A table variable would actually do a similar execution plan with the TVF call ending up as the outer table of the nested loop (and thus having an execution count of 1). Performance on a single run seems to be the same for temp tables, although I would prefer to use table variables in this case due to their limited scope.

I would recommend temp tables here given the expected number of rows (~215K). Te optimizer always assumes a rowcount of 1 for table variables so you have a better chance of getting the right plan using temp tables.

Going back to my original question though, what do you think is causing the correlated parameter to not be checked for the original query when TOP 1 is applied? If the correlated parameter check is not relevant, what else may be causing the slow down?

We definitely see from the plan that the FTS function is the source of the differences...but beyond that I got nothing, sorry. If there is a Microsoft support incident to burn and the time my next move would be to open a ticket.

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato

b.lam SSC Veteran Points: 284 More actions · Answer 7

opc.three (6/21/2011)
I would recommend temp tables here given the expected number of rows (~215K). Te optimizer always assumes a rowcount of 1 for table variables so you have a better chance of getting the right plan using temp tables.

I'll definitely evaluate temp tables as you suggest.

I wanted to point out that with the TVF on the outer table of the nested loop, and an execution count of 1, I'm not expecting it to reach 200k (I mentioned in the first post that the table only has 120k rows). Testing using live data has an actual count of ~290 rows.

While the TVF was the inner table of the loop (as seen on the *.sqlplan attached before), it seems like it may have been doing 700++ executions at 300 rows, resulting in 200K++ rows. I am thinking this was due to the lack of correlated parameters and was hoping there was an explanation as to why the optimizer was not using this for TOP 1 when it was doing so for TOP 2.

opc.three (6/21/2011)
We definitely see from the plan that the FTS function is the source of the differences...but beyond that I got nothing, sorry. If there is a Microsoft support incident to burn and the time my next move would be to open a ticket.

It does sound like it may be an issue with SQLServer. Thank you for all your help 🙂

Orlando Colamatteo SSC Guru Points: 182293 More actions · Answer 8

b.lam (6/22/2011)
opc.three (6/21/2011)
I would recommend temp tables here given the expected number of rows (~215K). Te optimizer always assumes a rowcount of 1 for table variables so you have a better chance of getting the right plan using temp tables.
I'll definitely evaluate temp tables as you suggest.
I wanted to point out that with the TVF on the outer table of the nested loop, and an execution count of 1, I'm not expecting it to reach 200k (I mentioned in the first post that the table only has 120k rows). Testing using live data has an actual count of ~290 rows.
While the TVF was the inner table of the loop (as seen on the *.sqlplan attached before), it seems like it may have been doing 700++ executions at 300 rows, resulting in 200K++ rows. I am thinking this was due to the lack of correlated parameters and was hoping there was an explanation as to why the optimizer was not using this for TOP 1 when it was doing so for TOP 2.
opc.three (6/21/2011)
We definitely see from the plan that the FTS function is the source of the differences...but beyond that I got nothing, sorry. If there is a Microsoft support incident to burn and the time my next move would be to open a ticket.
It does sound like it may be an issue with SQLServer. Thank you for all your help 🙂

My pleasure, thanks for posting back...and my bad on the misquoted row count...the 215K number from the plan operator stuck with me 🙂

There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato