
Table Valued Functions vs Scalar Valued Functions
Posted Wednesday, January 23, 2008 2:50 AM
SSC-Enthusiastic

Hi All,

I read in this particular post (http://sqlblog.com/blogs/adam_machanic/archive/2006/08/04/scalar-functions-inlining-and-performance-an-entertaining-title-for-a-boring-post.aspx) that if a scalar valued function is rewritten as a table valued function it can perform better... but I do not really understand why.

Can someone point me in the right direction?

Thanks
Post #446308
Posted Wednesday, January 23, 2008 6:01 AM


SSChampion

I think this is a case of buyer beware, your mileage may vary, some assembly required... You get the drift.

Scalar functions, as Adam very clearly outlines, generally lead to RBAR (row-by-agonizing-row) processing (all due deference to Jeff Moden) unless they're very simple and direct, because the optimizer can't simply unpack them and add them to the execution plan in a meaningful way. Instead, Adam showed how using a table valued function, in that case, resulted in a query that the optimizer could do something with, hence a radical increase in speed (through a reduction in I/O).

It's not the case that ALL table valued functions will perform better than ALL scalar functions, but rather that there are situations where you achieve significant wins, depending on what the different functions are doing. Further, using Adam's example, you could probably write a query, no function needed, that would perform even faster than the final UDF.
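To make the contrast concrete, here is a minimal sketch of the two shapes being discussed. This is not Adam's exact code; the table, column, and function names (dbo.OrderDetails, dbo.MaxQtyForProduct, etc.) are illustrative assumptions.

```sql
-- Scalar UDF: a black box to the optimizer, executed once per outer row.
CREATE FUNCTION dbo.MaxQtyForProduct (@ProductID int)
RETURNS int
AS
BEGIN
    RETURN (SELECT MAX(Quantity)
            FROM dbo.OrderDetails
            WHERE ProductID = @ProductID);
END;
GO

-- Inline table valued function: the optimizer can expand this definition
-- into the calling query's plan, much like a view.
CREATE FUNCTION dbo.MaxQtyForProductTVF (@ProductID int)
RETURNS TABLE
AS
RETURN (SELECT MAX(Quantity) AS MaxQty
        FROM dbo.OrderDetails
        WHERE ProductID = @ProductID);
GO

-- Scalar version: the function body runs row by row.
SELECT OD.ProductID, dbo.MaxQtyForProduct(OD.ProductID) AS MaxQty
FROM dbo.OrderDetails OD;

-- Inline TVF version: the whole thing optimizes as one query.
SELECT OD.ProductID, X.MaxQty
FROM dbo.OrderDetails OD
CROSS APPLY dbo.MaxQtyForProductTVF(OD.ProductID) AS X;
```

The second query gives the optimizer a chance to aggregate once and join, rather than re-running the subquery per row.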

Personally, at my company, we've found functions just too problematic to use much. Note the keyword: much. We do use them, but we try very carefully to identify places where we can achieve either maintenance and coding improvements at no performance cost, or outright performance improvements. Other than that, we try to stay away from them because they are very problematic, as Adam outlined.

Does that help at all?


----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of:
SQL Server Query Performance Tuning
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #446368
Posted Wednesday, January 23, 2008 6:37 AM
SSC-Enthusiastic

What I derived is:

Functions can cause row-by-row processing, and that can also result in a table scan, since the function has to be executed for each row. An analogy here is using an aggregate function in the WHERE clause.

Questions:
In the case of table valued functions, how does the processing take place? I read that joins can be written against FunctionReturningTable(parameter). In that case the execution plan generated for the stored procedure will include that of the function. But when the function is in the WHERE clause, how does the database engine create an execution plan?

How and why, in the above example, does returning a table reduce the logical reads?
Post #446380
Posted Wednesday, January 23, 2008 7:53 AM


SSChampion

If you look at the execution plans for the two queries, the scalar function looks like the simpler plan. It simply has a clustered index scan and a scalar operation. The scan is apparently the full cost of the execution. The other query shows two scans of the table, a hash match aggregate, and a merge join. Basically, the optimizer is able to see into the table valued function and incorporate its query into the plan for the outer query. Because it's able to generate a plan, it can make good choices in terms of performance. I set up the tests on my copy of AdventureWorks and saw the execution times go from 937ms to 78ms.
The reads, according to SQL Trace, go from 374,944 to 1,253. Again, this has to come from getting an actual plan that seeks out the appropriate data and joins it together, as opposed to opening the table and checking each and every row, one by one.
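For anyone who wants to repeat this kind of measurement without setting up a trace, the standard T-SQL session switches are enough; a sketch:

```sql
-- Emit I/O and timing figures to the Messages tab for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the scalar-UDF version of the query here, then the TVF version,
-- and compare the "logical reads" and "elapsed time" values reported
-- for each statement.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing logical reads this way is usually more stable run-to-run than comparing durations.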

The only funny thing was that, based on estimated plan cost when the two scripts were run side by side, the first plan was 1% and the second plan was 99%, even though the actual execution statistics went in exactly the opposite direction.


Post #446446
Posted Wednesday, January 23, 2008 8:02 AM


SSC-Forever

One of the things to watch for with table-valued functions (especially the multi-statement table valued functions) is that the resultant table (just like a table variable) has no column statistics and no indexes. The optimiser doesn't know how many rows are in the function's result table. That's not an issue for small numbers of rows, but it becomes a very big issue for larger numbers of rows.

If the optimiser misjudges the number of rows, it can result in very poor query plans.

Note, this is a maybe, possibly, could be, not definitely and always. Test carefully.
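As an illustration of the multi-statement form being described (all object names here are hypothetical): the rows are materialised into a table variable before the outer query ever sees them, which is why no statistics exist.

```sql
-- Multi-statement TVF: results land in the @Result table variable,
-- which has no column statistics, so the optimizer guesses the row count.
CREATE FUNCTION dbo.ProductTotalsSince (@OrderDate datetime)
RETURNS @Result TABLE (ProductID int, TotalQty int)
AS
BEGIN
    INSERT INTO @Result (ProductID, TotalQty)
    SELECT OD.ProductID, SUM(OD.Quantity)
    FROM dbo.Orders O
    INNER JOIN dbo.OrderDetails OD
        ON O.OrderID = OD.OrderID
    WHERE O.OrderDate >= @OrderDate
    GROUP BY OD.ProductID;

    RETURN;
END;
```

Joining a large outer table to this function's result is where the misestimated row counts tend to bite.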



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #446455
Posted Wednesday, January 23, 2008 8:11 AM


SSCertifiable

The other place not yet mentioned where this gets even more important to consider is when you use a table-valued function within a CROSS APPLY clause. In those cases it sets up something like a "correlated sub-function".

Again, very powerful potential, but it will tend to quickly turn into something not at all efficient on the execution plan side.
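A sketch of the pattern (function and table names hypothetical): the function is invoked once per qualifying row of the outer table, much like a correlated subquery.

```sql
-- The TVF runs for each row of Orders, with that row's OrderID as its input.
-- With a multi-statement TVF this is per-row materialisation; with an
-- inline TVF the optimizer may still be able to fold it into one plan.
SELECT O.OrderID, X.ProductID, X.Quantity
FROM dbo.Orders O
CROSS APPLY dbo.DetailsForOrder(O.OrderID) AS X;
```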


----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Post #446461
Posted Thursday, January 24, 2008 12:31 PM
SSC Eights!

I've benchmarked the type of scalar function in Adam's article against correlated subqueries, since they're doing the same type of thing, and they seem to work at about the same efficiency. When the result set is small you don't see much of a performance problem. When the result set gets into the hundreds or thousands of rows, performance takes a big dive.

I've pretty much always used derived tables when I need to compare a single row to an aggregate value and I usually get good performance.

Something like this:

SELECT
    O.CustomerID, OD.ProductID, OD.Quantity,
    X.MaxQty
FROM Orders O
INNER JOIN OrderDetails OD
    ON O.OrderID = OD.OrderID
INNER JOIN
    (SELECT OD.ProductID, MAX(OD.Quantity) AS MaxQty
     FROM Orders O
     INNER JOIN OrderDetails OD
         ON O.OrderID = OD.OrderID
     WHERE O.OrderDate BETWEEN '1/1/07' AND '1/31/07'
     GROUP BY OD.ProductID
    ) AS X
    ON OD.ProductID = X.ProductID
WHERE
    O.OrderDate BETWEEN '1/1/07' AND '1/31/07'

The hard dates are usually just variables in the SProc.

I'd have to benchmark a UDF that returns a table against this type of logic to see what happens.
Todd Fifield
Post #447154
Posted Wednesday, June 25, 2008 1:53 PM
Valued Member

Matt Miller (1/23/2008)
The other place not yet mentioned where this gets even more important to consider is when you use a table-valued function within a CROSS APPLY clause. It's in those cases where it sets up something like a "correlated sub-function".

Again - very powerful potential, but will tend to quickly turn into something not at all efficient on the execution plan side.



GilaMonster (1/23/2008)
One of the things to watch for with table-valued functions (especially the multi-statement table valued functions) is that the resultant table (just like a table variable) has no column statistics and no indexes. The optimiser doesn't know how many rows are in the function's result table. That's not an issue for small numbers of rows, but it becomes a very big issue for larger numbers of rows.

If the optimiser misjudges the number of rows, it can result in very poor query plans.

Note, this is a maybe, possibly, could be, not definitely and always. Test carefully.




I'm assuming that those and the other statements talk about table functions in general, and I believe they are true for non-inline (multi-statement) table functions.

But I'm very interested to hear what you guys think about inline table functions. If we use inline table functions, would we see a performance decrease?

As far as I understand, when a SQL statement references an inline table function, the parser and query optimizer analyze the source of both the SQL statement and the inline table function and then resolve them into a single execution plan. There is not one plan for the SQL statement and a separate plan for the inline table function (please correct me if I'm wrong). If this is true, would we still have any reason to believe that an inline table function may hurt performance?
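For reference, the inline form in question is a single RETURN of a SELECT; there is no BEGIN...END body and nothing is materialised, so the optimizer can fold it into the caller's plan like a parameterised view (all names here are hypothetical):

```sql
-- Inline TVF: just one RETURN (SELECT ...), no intermediate table variable.
CREATE FUNCTION dbo.OrdersForCustomer (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT O.OrderID, O.OrderDate
    FROM dbo.Orders O
    WHERE O.CustomerID = @CustomerID
);
GO

-- The optimizer expands the function inline; the plan is built as if the
-- SELECT above had been written directly inside this query.
SELECT C.CustomerID, F.OrderID, F.OrderDate
FROM dbo.Customers C
CROSS APPLY dbo.OrdersForCustomer(C.CustomerID) AS F;
```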

Thanks,
Luiz.


Post #523694
Posted Wednesday, June 25, 2008 3:13 PM


SSCertifiable

It's ultimately going to depend on what the function is and how it does it. Yes, inlining a function can yield some interesting perf gains, but again, they tend to fall into specific circumstances. So - there's not really a decent way to generalize. In both of the cases you mentioned, even if the function execution itself is faster due to inlining, there are lots of opportunities for derived tables to flat out beat this kind of setup (unless the inline function itself is a derived table setup).

This is just one of those where you need to test, test and test. And did I mention to test?


Post #523750
Posted Wednesday, June 25, 2008 3:29 PM


SSC-Dedicated

The best thing to do, in most cases, is to avoid any kind of function and any kind of correlated subquery, including CROSS APPLY. Yeah, I know... I'm going to get the same argument that people make for cursors... sometimes, it's not possible to do otherwise. The problem is that a lot of folks just don't know what is and what is not possible... ;)

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #523761