SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

DATEDIFF vs. DATEADD

Let’s talk about the DATEDIFF vs. DATEADD functions in SQL Server. Each one performs a different task. The former calculates the differences between two given date & time values based on some date part (day, minute, second, etc.). The latter calculates a date & time value by adding a number of date parts to a given date & time value. You can use both of them to achieve the same results, but performance will be very different.

 

DATEDIFF vs. DATEADD

 

Once in a while I encounter a query that looks like the following:

SELECT
	*
FROM
	Sales.Orders
WHERE
	DATEDIFF (DAY , OrderDateTime , SYSDATETIME ()) <= 90;

This query is supposed to retrieve orders in the last 90 days. But there is a problem with this query, and I’m not talking about the use of SELECT *, which is not a good practice for so many reasons. I’m talking about applying the predicate on an expression rather than on a column alone. In this case the expression involves the DATEDIFF function, but it can be any other expression. If we don’t apply the predicate on the column alone, then the optimizer can’t use an index on that column, even if such an index exists. And if the table is very large, and the predicate filters out most of the table, then instead of getting a quick and efficient index seek, we will get a long and heavy table scan.

So we need to rewrite the predicate to look like the following:

SELECT
	*
FROM
	Sales.Orders
WHERE
	OrderDateTime {Operator} {Constant Expression};

Where {Operator} is “<=” or “>=” or something similar, and {Constant Expression} is some expression, which does not depend on any column and therefore can be calculated once for the query. This way the optimizer will be able to estimate the number of rows based on the constant expression and the column statistics, and if the estimated number of rows is low enough, it can leverage the index on the “OrderDateTime” column.

So if you think about it, instead of asking for “the orders for which the number of days between them and now is 90 or less”, we can ask for “the orders that were created on or after a certain point in time, which is now less 90 days”. And the query can be rewritten like this:

SELECT
	*
FROM
	Sales.Orders
WHERE
	OrderDateTime >= DATEADD (DAY , -90 , SYSDATETIME ());

I tested both queries on a sample table, which has millions of rows, and only around 500 rows in the last 90 days. The first query produced a table scan, while the second query produced an index seek. Of course, the execution time of the second query was much lower than the first query.

Both queries were supposed to return the orders in the last 90 days, but the first query returned 523 rows, and the second query returned 497 rows. So what’s going on?

The answer has to do with the way DATEDIFF works. This function returns the number of date parts (days, years, seconds, etc.) between two date & time values. It does that by first rounding down each one of the date & time values to the nearest date part value, and then counting the number of date parts between them.

For example, let’s calculate the number of years between the following two date & time values: “2016-12-31 23:59:59” and “2017-01-01 00:00:00”. The time between these two values is one second, but the result of applying the DATEDIFF function on these values is one year:

DATEDIFF (YEAR , '2016-12-31 23:59:59' , '2017-01-01 00:00:00') = 1

This is because each one of the values is first rounded down to the nearest whole year, like this:

DATEDIFF (YEAR , '2016-01-01 00:00:00' , '2017-01-01 00:00:00') = 1

The DATEADD function, on the other hand, doesn’t need to round anything. It just adds (or subtracts) a given number of date parts to a date & time value. So this is why we get different results from the two queries, although both of them do the same thing conceptually – they both return the orders in the last 90 days.

In many cases, the accuracy is not important, so as long as the query answers the business question (“give me the orders in the last 90 days”), then it’s good enough. Fortunately, the second query (using DATEADD) is not only much more efficient, thanks to the ability to perform an index seek, but it’s also more accurate, because no rounding is involved.

But you need to be careful. If users are already used to a report that runs the DATEDIFF query behind the scenes, and you change it to the DATEADD query, then the users will be very grateful for the performance improvement, but as soon as they realize that the results of the report are not consistent with what they were used to, then they might get very angry at you. Sometimes consistency is more important than accuracy, and you need to make the right choice. If you need to maintain consistency, but you still want to be able to utilize the index and get good performance, then you can mimic the behavior of the DATEDIFF function by rounding down the values yourself, like this:

SELECT
	*
FROM
	Sales.Orders
WHERE
	OrderDateTime >= DATEADD (DAY , -90 , DATEADD (DAY ,
	DATEDIFF (DAY , '2000-01-01' , SYSDATETIME ()) , '2000-01-01'));

The expression beginning with the second DATEADD rounds down the value of SYSDATETIME () to the nearest whole day, which is today at midnight. There is no need to round down the value of “OrderDateTime”, because if the rounded value is greater than the constant expression, then the value itself is also greater than the constant expression.

This query looks ugly, but it works. It uses an index seek, and it is consistent with the result of the DATEDIFF query.

 

The post DATEDIFF vs. DATEADD appeared first on Madeira Data Solutions.

Guy Glantser

Guy Glantser, Data Platform MVP, is the leader of the Israeli PASS chapter and also the CEO and founder of Madeira Data Solutions. His career has been focused on the Microsoft Data Platform for the past 20 years, performing various database roles as either an on-site DBA, an external consultant or a speaker. Guy is involved in many activities in the Microsoft Data Platform community. He occasionally speaks at community events, such as PASS Summit, SQLBits, SQL Saturdays and user groups around the world. He also co-hosts the SQL Server Radio podcast.

Comments

Leave a comment on the original post [www.madeiradata.com, opens in a new window]

Loading comments...