Creating and using inline table-valued functions

,

In case we've forgotten or have never come across inline table-valued functions before, we'll start with a quick explanation of what they are.

Taking the definition of what a user defined function (UDF) is from Books Online, an inline table-valued function (iTVF) is a table expression that can accept parameters, perform an action and provides as its return value, a table. The definition of an iTVF is permanently stored as a database object, similar to a view would.

The syntax to create an iTVF from Books Online (BOL) can be found here and is as follows:

CREATE [ OR ALTER ] FUNCTION [ schema_name. ] function_name  

( [ { @parameter_name [ AS ] [ type_schema_name. ] parameter_data_type  

    [ = default ] [ READONLY ] }  

    [ ,...n ] 

  ] 

RETURNS TABLE 

    [ WITH <function_option> [ ,...n ] ] 

    [ AS ] 

    RETURN [ ( ] select_stmt [ ) ] 

[ ; ]

The following code snippet is an example of an iTVF.

CREATE OR ALTER FUNCTION [dbo].[itvfnGetRecentComment] (@UserId int, @CommentDate date = '2006-01-01T00:00:00.000')
RETURNS TABLE 
WITH SCHEMABINDING AS 
RETURN (
SELECT c.UserId, 
       MAX(c.CreationDate) AS LatestCommentDate, 
       COUNT(*) AS TotalComments
FROM dbo.Comments c
WHERE c.UserId = @UserId
      AND c.CreationDate >= @CommentDate
GROUP BY c.UserId
);

A few things worth mentioning in the above script.

DEFAULT VALUE ([ = default ]) - You can set default values for parameters and in our example, @CommentDate has a default value of 01 Jan 2006.

RETURNS TABLE - Returns a virtual table based on the definition of the function

SCHEMABINDING - Specifies that the function is bound to the database objects that it references. This condition will prevent changes to the function if other schema bound objects are referencing it (from BOL).

Looking at the SELECT statement in the example iTVF, it is similar to a query that you'd place in a view, except for the parameter passed into the WHERE clause. This is a critical difference.

Although an iTVF is similar to a view in that the definition is stored permanently in the database, by having the ability to pass in parameters we have not only a way of encapsulating and reusing logic but also flexibility in being able to query for specific values we'd want to pass in. In this case, we can imagine an inline table-valued function to be a sort of "parameterized view".

Why use an iTVF?

Before we get stuck into seeing how we use iTVFs, it's important to consider why we'd use them. ITVFs allow us to provide solutions in support of aspects such as (but not limited to):

  1. Modular development
  2. Flexibility
  3. Avoiding performance penalties

I'll discuss each of these a bit below.

Modular development

iTFVs can encourage good development practices, such as modular development. Essentially, we want to ensure our code is "DRY" and not repeat code we've previously produced each time it's needed in a different place.

Ignoring the cases where modular development is taken to the nth degree at the cost of maintainable code, the benefits of code reuse and encapsulation with iTVFs are one of the first things we notice.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       u.Reputation, 
       MAX(c.CreationDate) AS LatestCommentDate, 
       COUNT(*) AS TotalComments
FROM dbo.Users u
     INNER JOIN dbo.Comments c ON u.Id = c.UserId
WHERE c.CreationDate >= @CommentDate
GROUP BY u.Id, 
         u.Reputation;
GO

VS

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       u.Reputation, 
       c.LatestCommentDate, 
       c.TotalComments
FROM dbo.Users u
     CROSS APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c;
GO

The query using the iTVF is straightforward and when used in more complicated queries, can really aid development efforts by hiding the underlying complexity. In scenarios where we need to tweak the logic in the function for enhancement or debugging purposes, we can make the change to one object, test it and that one change will be reflected in the other places the function is called.

If we had repeated the logic in multiple places, you'd have to make the same change multiple times with the risk of errors, deviations or other mishaps each time. This does mean we need to understand where we reuse logic in iTVFs, so we can test that we haven't introduced those troublesome features known as bugs.

Flexibility

When it comes to flexibility, the idea to use the ability of functions to pass in parameter values and be easy to interact with. As we will see with examples later, we can interact with the iTFV as though we were querying a table.

If the logic were in a stored procedure for example, we'd likely have to take the results into a temporary table and then query the temporary table to interact with the data.

If the logic were in a view, we wouldn't be able to pass in a parameter. Our options would include querying the view and then adding a WHERE clause outside of the definition of the view. While not a big deal for this demo, in complicated queries, simplifying with an iTVF can be a more elegant solution.

Avoiding performance penalties

For the scope of this article, I won't be looking at Common Language Runtime (CLR) UDFs.

Where we start to see a divergence between iTVFs and other UDFs is performance. A lot can be said about performance, and for the most part, other UDFs suffer a lot, performance wise. An alternative could be to rewrite them as iTVFs instead, as performance is a bit different for iTFVs. They don't suffer from the same performance penalties that affect scalar or Multi Statement Table-Valued Functions (MSTVFs).

iTVFs, as their name would suggest, get inlined in the execution plan. With the optimiser un-nesting the definition's query elements and interacting with the underlying tables (inlining), you have a better chance of getting an optimal plan, as the optimiser can consider the statistics of the underlying tables and has available to it, other optimisations like parallelism, if this is required.

For many people, performance alone has been a big reason to use iTVFs in place of scalar or MSTVFs. Even if you don't care about performance just yet, things may change very quickly with greater volumes or complicated applications of your logic so understanding the pitfalls of the other types of UDFs is important. Later in this article, we'll show a basic performance comparison involving iTVFs and other UDF types.

Now that we've listed some reasons for using iTVFs, let's go ahead and see how we can use them.

Setup iTVF demo

In this demo, we will use the StackOverflow2010 database, which is freely available from the lovely folks at StackOverflow via https://archive.org/details/stackexchange

Alternatively, you can get the database via the other lovely folks at Brent Ozar Unlimited here : https://www.brentozar.com/archive/2015/10/how-to-download-the-stack-overflow-database-via-bittorrent/

I have restored the StackOverflow database on my local SQL Server 2016 Developer Edition, in an isolated environment. I will run a quick demo on my installed version of SQL Server Management Studio (SSMS) v18.0 Preview 5. I also set the compatibility mode to 2016 (130) and I'm running the demo on a machine with an Intel i7 2.11 GHz CPU and 16GB of RAM.

Once up and running with the StackOverflow database, run the T-SQL in the 00_iTFV_Setup.sql resource file contained in this article to create the iTVFs we'll use. The logic in these iTVFs is straightforward and can be repeated in multiple places but here we'll write it once and offer it up for reuse.

Using an iTVF as a Virtual Table

Remember that iTVFs are not persisted anywhere in the database. What gets persisted is their definition. We can still interact with iTVFs as if they were tables, so in this respect, we treat them as virtual tables.

The following 6 examples show ways we can treat iTVFs as virtual tables. With the StackOverflow database up and running on your machine, feel free to run the following 6 queries from Section 1 to see this for yourself. You can also find these in the file, 01_Demo_SSC_iTVF_Creating_Using.sql.

/*
*************************************************************************
1. Using iTVFs in queries as if it were a table.
*************************************************************************
*/
DECLARE @UserId INT= 3, @CommentDate DATE= '2006-01-01T00:00:00.000';
--1.1 Get specific columns
SELECT UserId, 
       LatestCommentDate
FROM dbo.itvfnGetRecentComment(3, '2006-01-01T00:00:00.000');
--1.2 Get all columns 
SELECT *
FROM dbo.itvfnGetRecentComment(@UserId, @CommentDate);
--1.3 Use a default value
SELECT UserId, 
       LatestCommentDate, 
       TotalComments
FROM dbo.itvfnGetRecentComment(@UserId, DEFAULT);
--1.4 In the WHERE clause
SELECT u.DisplayName, 
       u.Location, 
       u.Reputation
FROM dbo.Users u
WHERE u.Reputation <=
(
    SELECT TotalComments
    FROM dbo.itvfnGetRecentComment(@UserId, DEFAULT)
);
--1.5 In the HAVING clause
SELECT u.Location, 
       SUM(u.Reputation) AS TotalRep
FROM dbo.Users u
GROUP BY u.Location
HAVING SUM(u.Reputation) >=
(
    SELECT rc.TotalComments
    FROM dbo.itvfnGetRecentComment(3, DEFAULT) rc
);
--1.6 In the SELECT clause
SELECT u.Location,
       CASE
           WHEN SUM(u.Reputation) >=
(
    SELECT TotalComments
    FROM dbo.itvfnGetRecentComment(3, DEFAULT)
)
           THEN 'YES'
           ELSE 'NO'
       END AS InRepLimit
FROM dbo.Users u
GROUP BY u.Location;

As shown above, when we interact with iTVFs, we can:

  1. Return the full column list
  2. Return specified columns
  3. Use default values by passing the keyword, DEFAULT
  4. Pass the iTVF in the WHERE clause
  5. use the iTVF in the HAVING clause
  6. use the iTVF in the SELECT clause

To put this into perspective, typically with a stored procedure, one would insert the result set into a temporary table and then interact with the temporary table to do the above. With an iTVF, there is no requirement to interact with other objects outside of the function's definition for cases similar to the 6 we've just seen.

iTVFs in nested functions

Although nesting of functions can bring about its own problems if done poorly, it is something we can do within an iTVF. Conversely, when used appropriately, it can be very useful in hiding complexity.

In our contrived example, the function, dbo.itvfnGetRecentCommentByRep, returns a user's latest comment, total comments etc., adding an additional filter for the reputation.

SELECT DisplayName, 
       LatestCommentDate, 
       Reputation
FROM [dbo].[itvfnGetRecentCommentByRep](3, DEFAULT, DEFAULT);
GO

When we return the last comment date, this is via the call to the other function, dbo.itvfnGetRecentComment.

CREATE OR ALTER FUNCTION [dbo].[itvfnGetRecentCommentByRep] (@UserId int, @Reputation int = 100, @CommentDate datetime = '2008-01-01T00:00:00.000')
RETURNS TABLE 
WITH SCHEMABINDING AS 
RETURN (
  SELECT u.DisplayName, 
       u.Reputation, 
       c.LatestCommentDate, 
       c.TotalComments
FROM dbo.Users u
     OUTER APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c
WHERE u.Id = @UserId
      AND u.Reputation >= @Reputation
);
GO

In the example, we can see that even with nesting of functions, the underlying definition gets inlined and the objects that we interact with are the clustered indexes, dbo.Users and dbo.Comments.

In the plan, without a decent index to support our predicates, a clustered index scan of the Comments table is performed, as well as going parallel where we end up number of executions as 8. We filter for UserId 3 from the Users table so we get a seek on the Users table and then join to the Comments table after the GROUP BY (stream aggregate) for the latest comment date.

iTVF Nested Function

Using the APPLY operator

The APPLY operator offers so many creative uses, and it's certainly an option we can apply (pun intended) to interacting with an iTVF. For more on the APPLY operator, see the following documents: https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2008-r2/ms175156(v=sql.105) and http://www.sqlservercentral.com/articles/Stairway+Series/121318/

In the following example we want to call the iTVF for all users and return their LatestCommentDate, Total Comments, Id and Reputation since 01 Jan 2008. To do this we use the APPLY operator, where execute for each row of the entire Users table, passing in the Id from the Users table and the CommentDate. In doing so, we retain the encapsulation benefits of using iTVF, applying our logic to all rows in the Users table.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       u.Reputation, 
       c.LatestCommentDate, 
       c.TotalComments
FROM dbo.Users u
     OUTER APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c;
GO

iTVFs in JOINs

Similar to joining 2 or more tables, we can involve iTVFs in joins with other tables. Here, we want the DisplayName from the Users table and the UserId, BadgeName and BadgeDate from the iTVF, for users who obtained the badge of 'Student' on or after 01 Jan 2008.

DECLARE @BadgeName varchar(40)= 'Student', @BadgeDate datetime= '2008-01-01T00:00:00.000';
SELECT u.DisplayName, 
       b.UserId, 
       b.BadgeName, 
       b.BadgeDate
FROM dbo.Users u
     INNER JOIN [dbo].[itvfnGetBadge](@BadgeName, @BadgeDate) b ON b.UserId = u.Id;
GO

Alternative to other UDFs

Earlier in this article we mentioned that a reason to use iTVFs was to the avoid performance penalties that other types of UDFs such as scalar and MSTVFs suffer from. This section will demonstrate how we can replace scalar and MSTVFs with iTVFs and the following part will look at why we may choose to do so.

Scalar UDF alternative

Before we carry on, let’s run the following query to create the scalar UDF,  dbo.sfnGetRecentComment that we'll use shortly.

USE [StackOverflow2010]
GO
CREATE FUNCTION [dbo].[sfnGetRecentComment] (@UserId int, @CommentDate date = '2006-01-01T00:00:00.000')
RETURNS datetime
WITH  
SCHEMABINDING, 
RETURNS NULL ON NULL INPUT
AS
BEGIN
    DECLARE @LatestCreationDate datetime;
    SELECT @LatestCreationDate = MAX(CreationDate)
    FROM dbo.Comments
    WHERE UserId = @UserId
          AND CreationDate >= @CommentDate
    GROUP BY UserId;
    RETURN @LatestCreationDate;
END;
GO

After creating the scalar function, run the following query. Here we want to use the use the logic encapsulated in the scalar function to query the entire Users table and return the Id and the latest comment date of each user since 01 Jan 2008. We do this by adding the function in the SELECT clause where for parameter values, we will pass in the Id from the Users table and the comment date which we've just set at 01 Jan 2008.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       [dbo].[sfnGetRecentComment](u.Id, @CommentDate) AS LatestCommentDate
FROM dbo.Users u;

A similar query can be written using an iTVF, which is shown in the following query. Here we want to get the Id and latest comment date since 01 Jan 2008 using the iTVF and we achieve this by using the APPLY operator against all rows of the Users table. For each row we pass in the user Id and comment date.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       c.LatestCommentDate
FROM dbo.Users u
     OUTER APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c;
GO

MSTVF alternative

Run the following query to create the MSTVF dbo.mstvfnGetRecentComment, that we'll use shortly.

USE [StackOverflow2010]
GO
CREATE FUNCTION [dbo].[mstvfnGetRecentComment] (@UserId int, @CommentDate datetime = '2008-01-01T00:00:00.000')
RETURNS @RecentComment TABLE
(
    [UserId] int NULL,
    [LatestCommentDate] date NULL,
    [TotalComments] int NULL
)
WITH SCHEMABINDING AS
BEGIN
    INSERT INTO @RecentComment
           SELECT c.UserId,
                  MAX(c.CreationDate) AS LatestCommentDate,
                  COUNT(*) AS TotalComments
           FROM dbo.Comments c
           WHERE c.UserId = @UserId
                 AND c.CreationDate >= @CommentDate
           GROUP BY c.UserId;
    RETURN;
END;
GO

Now that we have created the MSTVF, we’ll use it to return the latest comment date for all rows in the Users table since 01 Jan 2008. Like the iTVF example with the APPLY operator in the earlier part of this article, we will “apply” our function to all rows in the Users table by using the APPLY operator and pass in the user Id for each row, as well as the comment date we set.

DECLARE @CommentDate date= '2008-01-01T00:00:00.000';
SELECT u.Id,
       c.LatestCommentDate
FROM [dbo].Users u
     OUTER APPLY [dbo].[mstvfnGetRecentComment](u.Id, @CommentDate) c;
GO

When we replace the MSTVF in our example with an iTVF, the result is very similar, expect for the name of the function that is called. In our case we will call itvfnGetRecentComment  in place of the MSTVF in the previous example, retaining the columns in the SELECT clause. Doing so allows us to get the Id and latest comment date since 01 Jan 2008 using the iTVF for all rows of the Users table. This is shown in the query that follows.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id,
       c.LatestCommentDate
FROM dbo.Users u
     OUTER APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c;
GO

A Preference for iTVFs

A reason to choose an iTVF over a scalar function or MSTVF is performance.

We will shortly capture the performance of the 3 functions based on the example queries we ran in the previous section and not go into much detail, other than to compare performance across metrics such as execution time and execution count for the functions. The code for performance stats that we'll use throughout this demo, is from Listing 9.3 in the code samples of the 3rd edition of SQL Server Execution Plans by Grant Fritchey https://www.red-gate.com/simple-talk/books/sql-server-execution-plans-third-edition-by-grant-fritchey/

Scalar performance

Here is the previous query we ran invoking the scalar function.

DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id,
       [dbo].[sfnGetRecentComment](u.Id, @CommentDate) AS LatestCommentDate
FROM dbo.Users u;

It's well known that scalar functions inhibit parallelism and that a statement with a scalar function is invoked once for every row. If your table has 10 rows, you might be OK. A few thousand rows, and you may be surprised (or not) by the drastic performance drop. Let’s see this in action by running the following code snippet.

What we want to do first is clear (in an isolated environment) the SQL Server cache and existing plan that could be reused(FREEPROCCACHE), ensuring that we obtain data from disk first (DROPPCLEANBUFFERS) to begin with a cold cache. Loosely speaking, we're starting from a clean slate by doing this.

After this, we run the SELECT query on the Users table where we return the Id and LatestCommentDate using the scalar function. As parameter values, we will pass in the Id from the Users table and the comment date which we've just set to 01 Jan 2008. An example is as follows.

-- run in an isolated environment
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO
DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id,
       [dbo].[sfnGetRecentComment](u.Id, @CommentDate) AS LatestCommentDate
FROM dbo.Users u;

Although the SELECT query where we utilise the scalar function may appear to be simple in structure, after 4 minutes on my machine, it was still executing.

Recall that for scalar functions, they execute once for every row. The problem with this is that row by row operations are very rarely the most efficient way of retrieving data in SQL Server compared to single set based operations.

To observe repeated executions, we can run a query for some performance stats. These are collected by SQL Server when we run our queries and here, we want to get the query statement, creation time, execution count of the statement as well as the query plan. The following query is an example of how we can get the performance stats.

SELECT SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, (CASE deqs.statement_end_offset
                                                                        WHEN-1
                                                                        THEN DATALENGTH(dest.text)
                                                                        ELSE deqs.statement_end_offset - deqs.statement_start_offset
                                                                    END) / 2 + 1) AS QueryStatement,
       deqs.creation_time,
       deqs.execution_count,
       deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
     CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
     CROSS APPLY sys.dm_exec_sql_text(deqs.plan_handle) AS dest
WHERE deqp.objectid = OBJECT_ID('dbo.sfnGetRecentComment')
ORDER BY deqs.last_execution_time DESC;
GO

The resulting execution of the performance stats can be seen in the following image. The query statement shows the SELECT statement used by the scalar function, the execution count of 1788 has captured the repeated execution count (once for each row) and the plan that is used by the function can be viewed by clicking on the query plan XML message.

iTVF vs Scalar DMV Comparison

An additional way of seeing important performance metrics is by using extended events. For more on extended events, click here but for the purposes of our demo, run the following query to create an extended events session.

CREATE EVENT SESSION QuerySession ON SERVER 
  ADD EVENT sqlserver.module_end(
      WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(??)))), -- enter your Session Id
  ADD EVENT sqlserver.sp_statement_completed(
     WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(??)))),-- enter your Session Id
  ADD EVENT sqlserver.sql_batch_completed(
     WHERE ([package0].[equal_uint64]([sqlserver].[session_id],(??)))); -- enter your Session Id
GO
ALTER EVENT SESSION QuerySession ON SERVER STATE = START;
GO

Once the extended events session has been created, go to Management in Object Explorer, expand sessions and open the context menu for "QuerySession" which we've just created. Select "Watch live data".

In the newly open window, observe the multiple statement lines of the scalar function as evidence of the repeated executions.

Click on the red square to "Stop data feed" for now.

MSTVF performance

For MSTVFs, there can also be performance penalties as a result of their use and this is linked to SQL Server choosing sub optimal execution plans due to inaccurate statistics. SQL Server relies on internally captured statistics to understand how many rows of data a query will return so it can produce an optimal plan. This process of is known as cardinality estimation.

Prior to SQL Server 2014, the cardinality estimation for table variables was set at 1 row. From SQL Server 2014 onwards this is set at 100 rows, regardless of whether we're querying against 100 rows or 1 million rows. MSTVFs return data via table variables that are declared at the beginning of the function’s definition. The reality is that the optimiser will not seek the best plan it can find ever regardless of time and it must maintain a balance there (put very simply). With MSTVFs, the lack of accurate statistics (caused by inaccurate cardinality estimation of table variables within their definition) can inhibit performance by having the optimiser produce sub optimal plans.

In practice, replacing MSTVFs with iTVFs can bring about better performance as iTVFs will use the statistics of the underlying tables.

The following code snippet uses the APPLY operator to execute the MSTVF against the entire Users table. We return the Id from the Users table and the LatestCommentDate from the function for every row in the Users table, passing in the user's Id and comment date that's set for 01 Jan 2008. Similar to the scalar function run, we will do this from a cold cache. With this test, also have the Actual execution plan selected in SSMS, by clicking Ctrl + M. Run the query.

CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO
DECLARE @CommentDate date= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       c.LatestCommentDate
FROM [dbo].Users u
     OUTER APPLY [dbo].[mstvfnGetRecentComment](u.Id, @CommentDate) c;
GO

The resulting execution plan is shown in the following image and in the yellow properties box, notice the values for the estimated number of rows vs actual number of rows.

You should see that the estimated number of rows from the table valued function is set at 100 but the actual number of rows is set at 151,482. Recall in our set up, we chose the SQL Server 2016 (130) compatibility mode and that from SQL Server 2014 onwards, cardinality estimation is set at 100 for table variables.

Such wildly different values for estimated number of rows vs actual number of rows typically hurts performance and may prevent us getting more optimal plans and ultimately better running queries.

MSTVF execution plan comparison

Similar to the scalar function we ran previously, we will run a query against SQL Server's performance stats. Here we also want to obtain the query statement, the creation time, the statement's execution count and the query plan used and filter for the MSTVF in the WHERE clause.

SELECT SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, (CASE deqs.statement_end_offset
                                                                        WHEN-1
                                                                        THEN DATALENGTH(dest.text)
                                                                        ELSE deqs.statement_end_offset - deqs.statement_start_offset
                                                                    END) / 2 + 1) AS QueryStatement, 
       deqs.creation_time, 
       deqs.execution_count, 
       deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
     OUTER APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
     OUTER APPLY sys.dm_exec_sql_text(deqs.plan_handle) AS dest
WHERE deqp.objectid = OBJECT_ID('dbo.mstvfnGetRecentComment');

The execution plan we initially observed for the MSTVF is misleading as it does not reflect the actual work done by the function. Using the performance stats we just ran and selecting the Showplan XML, we should see a different query plan that shows the work that is being done by the MSTVF. An example of the different plan to what we initially saw is as follows. Here we can now see that the MSTVF inserts data into a table variable called @RecentComment. The data in the table variable @RecentComment is what is then returned by the MSTVF to return the latest comment date.

MSTVF Query Plan from DMVs

Another aspect of the performance metrics we captured is the execution count. Here we want to be wary of multiple execution counts as these are typically bad for performance as mentioned when looking at the scalar performance.

What we see in the following image for execution count, is a value of 76 which is evidence of multiple executions of our MSTVF.

MSTVF Multiple Executions

Let us go back to Management in Object Explorer, expand sessions and Select "Watch live data" for the "QuerySessions" session we created earlier when we looked at the scalar performance. Click on the "Start data feed" green triangle in SSMS if the session was stopped previously and we should see for our example, multiple statement lines of the MSTVF which represent the repeated executions of the statement.

Feel free to cancel the query at this point if it hasn't completed yet.

iTVF performance

When we replace either the scalar or multi statement functions with an iTVF, we want to do so to avoid the performance penalties we’ve just seen e.g. execution time, multiple executions. Using the previous example which follows, we will query against the Users table to get the Id and the Latest Comment Date from the iTVF since 01 Jan 2008. using the APPLY operator. We will also do so from a cold cache by clearing the SQL Server cache and obtaining data from disk to begin with.

--run in an isolated environment 
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO
DECLARE @CommentDate DATE= '2008-01-01T00:00:00.000';
SELECT u.Id, 
       c.LatestCommentDate
FROM dbo.Users u
     OUTER APPLY [dbo].[itvfnGetRecentComment](u.Id, @CommentDate) c;
GO

The following query consists of the performance stats that help us to see the statement that is being executed, the execution count and the query plan for that statement.

-- Query stats and execution plan
SELECT SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, (CASE deqs.statement_end_offset
                                                                        WHEN-1
                                                                        THEN DATALENGTH(dest.text)
                                                                        ELSE deqs.statement_end_offset - deqs.statement_start_offset
                                                                    END) / 2 + 1) AS QueryStatement,
       deqs.creation_time,
       deqs.execution_count,
       deqp.query_plan
FROM sys.dm_exec_query_stats AS deqs
     CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
     CROSS APPLY sys.dm_exec_sql_text(deqs.plan_handle) AS dest
WHERE deqp.dbid = DB_ID('StackOverflow2010')
ORDER BY deqs.last_execution_time DESC;

The example query for the iTVF we just ran completed in just over 4 secs and the performance stats that we queried for should show an execution count of 1 for the iTVF as shown in the following image. This is a marked improvement in performance compared to our earlier examples with scalar and MSTFV attempts.

iTVF DMV Stats

Comparing scalar, MSTVF and iTVF performance

The following table captures the comparison of the 3 UDFs we just ran to show how they performed in terms of execution time and execution count.

Comparing scalar, MSTVF and iTVF performance
Scalar MSTVF iTVF
Execution time > 5 minutes > 5 minutes 4 seconds
Execution count > 10,000 >10,000 1

The difference in execution time and execution count by using the iTVF is immediately apparent. In our example, the iTVF returned the same data faster and with only 1 execution. In environments where there are heavy workloads or for scenarios where logic is used in multiple reports that are slow due to scalar or MSTVF usage, a change to an iTVF may help alleviate serious performance issues.

Limitations

iTVFs really come into play when you need to reuse logic without sacrificing performance but they are not perfect. They too have limitations.

For a list of limitations and restrictions of user defined functions in general you can visit Books Online here. Outside of that list and specifically for inline table valued functions, the following are general points to ponder when evaluating whether they're the right tool for a problem you may have.

Single SELECT statement

As the definition of an iTVF is restricted to 1 SELECT statement, solutions that require logic beyond that 1 SELECT statement where perhaps variable assignment, conditional logic or temporary table usage is required, are probably not suited to implementing in an iTVF alone.

Error handling

This is a restriction that affects all user defined functions where constructs such as TRY CATCH are not permitted.

Calling stored procedures

You cannot call stored procedures from within the body of the iTVF unless it's an extended stored procedure. This is linked to a tenet of SQL Server UDF design where it cannot alter the state of the database, which if you used a stored procedure, you possibly could.

Conclusion

This article began with the intention of revisiting inline table-valued functions (iTVFs) as they currently are in 2016, showing the variety of ways they can be used.

iTVFs have a key benefit in that they don't sacrifice performance unlike other types of user defined functions (UDFs) and may encourage good development practices with their reuse and encapsulation capabilities.

With recent features in SQL Server, firstly in SQL Server 2017 and in the upcoming SQL Server 2019, performance of either of scalar and MSTVFs has some extra help. For more details, check out interleaved execution which looks to provide an accurate cardinality estimate for MSTVFs rather than the fixed numbers or either 1 or 100 and scalar UDF inlining. With scalar UDF inlining in 2019, the focus is on addressing the performance weakness of scalar UDFs.

For your solutions, be sure to test and be aware of their limitations. When used well, they're one of the better tools we could have in our SQL armoury.

Thanks for reading.

Resources

Rate

4.55 (11)

Share

Share

Rate

4.55 (11)