(Be sure to checkout the FREE SQLpassion Performance Tuning Training Plan - you get a weekly email packed with all the essential knowledge you need to know about performance tuning on SQL Server.)
One of the coolest new performance improvements in SQL Server 2019 is a better support for Scalar Valued Functions. As you (hopefully) know, Scalar Valued Functions always introduced a huge amount of performance problems – based on their implementation in SQL Server.
Scalar Valued Functions prior SQL Server 2019
I don’t know a database which doesn’t use Scalar Valued Functions. And very often their use introduces a lot of huge performance problems. Why they are so bad?
- Scalar Valued Functions are executed and called for each returned record within the execution plan. The more records your query returns, the more often SQL Server has to call the Scalar Valued Function. This can get really dangerous if you are doing data access in your Scalar Valued Function.
- The Query Optimizer doesn’t assign realistic costs to the Compute Scalar operator in the execution plan, which calls the Scalar Valued Function. SQL Server only assigns a few CPU cost units the Compute Scalar operator, but this doesn’t reflect in any way the reality. Therefore, the costs of calling and executing a Scalar Valued Function is always underestimated.
- As soon as you reference a Scalar Valued Function within a query, the Query Optimizer can’t generate a parallel execution plan anymore, which hurts the performance on larger systems. I have seen once a Data Warehousing system, where each analytics query was calling a Scalar Valued Function. Therefore, every query was running single-threaded – as slow as possible!
Let’s have a look on a concrete example how bad Scalar Valued Functions are. The following listing show a simple definition of a function, which returns the number of sales for a given CustomerID.
CREATE FUNCTION GetSalesCountForCustomer ( @CustomerID INT ) RETURNS INT AS BEGIN DECLARE @salesCount INT SELECT @salesCount = COUNT(CustomerID) FROM Sales.SalesOrderHeader WHERE CustomerID = @CustomerID RETURN @salesCount END GO
As you can see, the Scalar Valued Function just counts how many rows are available in the table Sales.SalesOrderHeader for a specific customer. Let’s use that Scalar Valued Function in a query:
SELECT c.*, dbo.GetSalesCountForCustomer(c.CustomerID) AS 'SalesCount' FROM Sales.Customer c GO
When I execute that query on my system, I takes around 1.3 seconds for execution. The following picture shows the generated execution plan.
As you can see, the Scalar Valued Function itself is nowhere within the execution plan. You can only see a Compute Scalar operator with a minor CPU overhead of 2%, but that’s it. But SQL Server spends around 1.12 seconds in this operator. As you can already see here quite nicely, the costing of the Compute Scalar operator doesn’t reflect reality in any way! The query itself produces almost 40000 logical reads (verified by my good old, depreciated friend SQL Server Profiler…). But when you look at the output of STATISTICS IO you can only see 123 logical reads against the table Sales.Customer.
The problem is that STATISTICS IO doesn’t show you the logical reads that are happening with the Scalar Valued Function. In our case the access to the table Sales.Customer is done through a Non-Clustered Index Seek operation that needs 2 logical reads for each execution. In our case we call the Scalar Valued Function for each returned record – 19820 times – over and over again. Therefore, we produce in sum 39763 logical reads (2 x 19820 + 123).
Imagine that the Scalar Valued Function would have no supporting Non-Clustered Index. In that case, SQL Server would have to scan the whole Clustered Index, and would produce for each execution 689 logical reads! If you do now the math again, the whole query execution would generate 13656103 logical reads! These are more than 100 GB of data that are processed during query execution. If I run the query in that way, it needs around 30 seconds until it is finished. And that’s again the execution plan:
Let’s have look now at how SQL Server 2019 deals with that problem.
Scalar Valued Functions in SQL Server 2019
Beginning with SQL Server 2019, the Query Optimizer is finally able to compile your Scalar Valued Function and transform it into relation constructs that can be embedded into the final generated execution plan. To make use of this optimization, you have to do only one thing: setting your database into the compatibility mode 150 – SQL Server 2019:
ALTER DATABASE AdventureWorks2014 SET COMPATIBILITY_LEVEL = 150 GO
When you now run the same, identical query with the same, identical Scalar Valued Function again, it finishes in around 250 milliseconds. And here is now the generated execution plan:
As you can see from the execution plan, the Query Optimizer was now smart enough the transform the iterative call to the Scalar Valued Function into a Nested Loop Join that references through a Non-Clustered Index on the inner side the table Sales.SalesOrderHeader. And the Compute Scalar operator is completely gone! How many logical reads do we have now? Let’s check it:
That’s now interesting! We have more logical reads as previously – around 42000, but we are much faster. The problem here is that the Query Optimizer transformed the call of the Scalar Valued Function into a Nested Loop Join. Therefore, the inner side of the Nested Loop has to be executed for each record that is returned from the outer side. In our case, the outer side returns 19820 rows from the table Sales.Customers, and therefore the Non-Clustered Index Seek is also executed 19820 times – over and over again.
The query is faster, but produces more logical reads Are we able to be smarter than the Query Optimizer? Let’s try it. In the past (prior SQL Server 2019) I have always recommended to avoid Scalar Valued Functions or to rewrite them as Table Valued Functions. Because a Table Valued Function returns a set, and therefore the Query Optimizer can make its transformations with that set of records. Let’s do that and rewrite the Scalar Valued Function as a Table Valued Function.
CREATE FUNCTION GetSalesCountForCustomerTVF ( @CustomerID INT ) RETURNS TABLE AS RETURN ( SELECT COUNT(CustomerID) AS 'SalesCount' FROM Sales.SalesOrderHeader WHERE CustomerID = @CustomerID ) GO
Now we can reference the Table Valued Function within our query:
SELECT c.*, ( SELECT SalesCount FROM dbo.GetSalesCountForCustomerTVF(c.CustomerID) ) SalesCount FROM Sales.Customer c ORDER BY SalesCount DESC GO
When I have executed that query, it took around 160 milliseconds, and only produced 180 logical reads – as you can see from the output of STATISTICS IO:
To be honest, this is much faster than the transformed Scalar Valued Function. Why? Let’s have a look again at the execution plan:
With the Table Valued Function in place, the Query Optimizer has chosen a physical Merge Join operator. The benefit of the Merge Join operator is that SQL Server has to access both sides of the join only once! Therefore, you have only one Non-Clustered Index Scan against the table Sales.SalesOrderHeader and one Clustered Index Scan against the table Sales.Customer. And this makes everything much faster as the previous execution plan with the Nested Loop operator.
Scalar Valued Functions are evil! Yes, they are, but Microsoft has made huge improvements in SQL Server 2019 to make them faster. But as you have seen in this blog posting, your exact mileage can vary. For that specific example I was able to outperform the Query Optimizer. But for other scenarios it is also the case that the Query Optimizer will outperform you. As always, it depends… There is also a great whitepaper available from Microsoft that describes how the Query Optimizer can transform Scalar Valued Functions.
If you are interested more in SQL Server Query Tuning, I highly suggest to checkout my Online Training about it that will run next Monday.
Thanks for your time,