Comparison of Dates in SQL
Posted Tuesday, April 28, 2009 3:14 AM


Hall of Fame

The following code shows that the index is used whether or not a function is applied to the column...

Create Table #t1 (mid int,gdate datetime)

CREATE CLUSTERED INDEX [IX_ALIEN_MISSILE] ON [#t1]
(
[gdate] ASC
) ON [PRIMARY]

Insert into #t1 Values ( 1,'01-01-2009 06:00:00' )
Insert into #t1 Values ( 2,'01-01-2009 07:00:00' )
Insert into #t1 Values ( 3,'01-01-2009 04:00:00' )
Insert into #t1 Values ( 4,'01-01-2009 03:00:00' )
Insert into #t1 Values ( 5,'01-02-2009 04:00:00' )
Insert into #t1 Values ( 6,'01-02-2009 05:00:00' )
Insert into #t1 Values ( 7,'01-02-2009 06:00:00' )
Insert into #t1 Values ( 8,'01-02-2009 06:00:00' )
Insert into #t1 Values ( 9,'01-03-2009 04:00:00' )
Insert into #t1 Values ( 10,'01-03-2009 04:00:00' )
Insert into #t1 Values ( 11,'01-03-2009 05:00:00' )
Insert into #t1 Values ( 12,'01-03-2009 06:00:00' )
Insert into #t1 Values ( 13,'01-03-2009 07:00:00' )
Insert into #t1 Values ( 14,'01-04-2009 08:00:00' )
Insert into #t1 Values ( 15,'01-04-2009 09:00:00' )
Insert into #t1 Values ( 16,'01-04-2009 10:00:00' )
Insert into #t1 Values ( 17,'01-04-2009 10:00:00' )

-- Applies DATEDIFF to the column
Select * from #t1 where Datediff(day, gdate, Convert(datetime, '2009-01-01 00:00:00.000')) = 0

-- Converts the column to varchar and back to datetime
Select * from #t1 where Convert(datetime, Convert(varchar(20), gdate, 110)) = Convert(datetime, '2009-01-01 00:00:00.000')

-- Compares the column directly, with no function applied
Select * from #t1 where gdate = Convert(datetime, '2009-01-01 00:00:00.000')

drop table #t1


Checking the execution plans shows that an Index Scan is used for the DATEDIFF and CONVERT queries, and an Index Seek is used when no function is applied to the column. In either case, the index is used...


----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Sometimes, winning is not an issue but trying.

You can check my BLOG here

Post #705617
Posted Tuesday, April 28, 2009 3:17 AM
Ten Centuries

This is not something I had ever heard of:

Paul White (4/28/2009)
Though the example given may not result in a good plan either, because SQL Server cannot sniff the local variable parameters.


Are you saying that the following statement would result in a better query plan? (or more reliably result in a good query plan?)

SELECT *
FROM SomeTable
WHERE (SomeColumn >= '2009-04-01' AND SomeColumn < '2009-04-16' )

If so, why would that be? Because in one case SQL Server can read the values and choose an appropriate query plan based on statistics / known value distributions, and in the other case it cannot?

In that case, does adding a function call inline break it?

SELECT *
FROM SomeTable
WHERE (SomeColumn >= '2009-04-01' AND SomeColumn < DateAdd(Day, 1, '2009-04-15') )


Is there any reference to this issue / mechanism anywhere online that I can look at?


http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Post #705622
Posted Tuesday, April 28, 2009 3:37 AM


SSCrazy Eights

benr (4/28/2009)
When I am working with dates, I always use >= the start date and < the next day. E.g., if you want all records up to and including the end of 31 March 2009, your query should ask for all records prior to 1 April. This method should then be able to leverage indexes.

This is true, and a valid approach, unless either date column can be NULL.

Adding ISNULL or COALESCE on a NULLable column will prevent index usage (subject to the caveats I already posted). Using OR to cope with the NULLs will almost always prevent index usage too. Using a number of UNION ALLs to cope with the NULLs will usually be able to use indexes, but n seeks and n - 1 UNION ALLs will not always be cheaper than one index scan.
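
As a sketch, for a query that wants rows in a date range or with a NULL date (table and column names are placeholders):

DECLARE @StartDate DATETIME, @EndDate DATETIME;
SELECT @StartDate = '20090401', @EndDate = '20090416';

-- Each branch has a simple, sargable predicate and can seek on an index on SomeColumn
SELECT *
FROM dbo.SomeTable
WHERE SomeColumn >= @StartDate
  AND SomeColumn <  @EndDate

UNION ALL

SELECT *
FROM dbo.SomeTable
WHERE SomeColumn IS NULL;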

The functions posted can be improved because they:

1. Use multiple SETs where a single SELECT would suffice
2. Do not specify SCHEMABINDING to mark them as deterministic
3. Do not specify RETURNS NULL ON NULL INPUT to avoid executing for NULL inputs

The function body can be rewritten in one line:

RETURN CONVERT(DATETIME, CONVERT(CHAR(8), @Date, 112), 112)

If the new function is marked with SCHEMABINDING, it is deterministic and can be used in a persisted column. It should also be marked with RETURNS NULL ON NULL INPUT.
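
Put together, a minimal sketch of such a function might look like this (the name dbo.DateOnly and the parameter @Date are placeholders; the original functions are not reproduced here):

CREATE FUNCTION dbo.DateOnly (@Date DATETIME)
RETURNS DATETIME
WITH SCHEMABINDING, RETURNS NULL ON NULL INPUT
AS
BEGIN
    -- Style 112 (yyyymmdd) drops the time portion; converting back gives midnight on the same day
    RETURN CONVERT(DATETIME, CONVERT(CHAR(8), @Date, 112), 112);
END;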

Cheers,

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #705629
Posted Tuesday, April 28, 2009 3:45 AM


SSCrazy Eights

Tao Klerks (4/28/2009)
Interesting approach - we use Convert, relying on the fact that DateTime values are internally stored as floating-point values, with every whole number representing a day

The FLOAT method is imprecise in the sense required for functions to be used in persisted columns. For that reason, I wouldn't use this code as the basis for a scalar UDF.

I don't suppose Microsoft are about to change the relationship between FLOATs and DATETIMEs, but it seems an unnecessary hidden dependency to add to your code.

Some folks also view this trick as a bit of a hack, when a better solution exists.
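
For reference, the FLOAT trick being discussed is roughly this sketch:

DECLARE @Date DATETIME;
SET @Date = GETDATE();

-- Round the FLOAT representation down to a whole number of days, then convert back
SELECT CONVERT(DATETIME, FLOOR(CONVERT(FLOAT, @Date)));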

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #705638
Posted Tuesday, April 28, 2009 3:58 AM


SSCrazy Eights

craig.lovegren (4/28/2009)
I did exactly that. Using the small example, you find that DATEDIFF and DATEADD are nearly identical, indexes or not. So I grabbed a large table (a few million rows) from our systems, indexed it appropriately, and ran a few tests. Comparing my three tests, DATEDIFF ranked at 90%, DATEADD, although a complex plan, took only 9%, and the final test took the remaining 1%.

It sounds as if you are quoting percentage batch costs from the execution plan. If so, please be aware that those percentages are based on the optimizer's cost estimates. To report performance results accurately, you need a reproducible test setup (with code) and, as a minimum, output from SET STATISTICS IO, TIME ON. It is also common to start with a cold cache. A posted actual execution plan is often helpful too.
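
As a minimal sketch of that kind of measurement (reusing placeholder names):

-- Clear caches only on a test server
-- DBCC DROPCLEANBUFFERS;
-- DBCC FREEPROCCACHE;

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT *
FROM dbo.SomeTable
WHERE SomeColumn >= '20090401'
  AND SomeColumn <  '20090402';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;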

craig.lovegren (4/28/2009)

What we typically do for tables that are queried on a datetime field religiously is to create another column with the date portion only. It does require an extra 4 bytes (smalldatetime) per row, but disk space is cheap. Just remember to update the column after the data load in a batch update statement; don't use triggers, computed columns, or calculations in your inserts, as these will slow your data loads down drastically.

Well that all rather depends on your priorities. Computed columns have the distinct advantage of being guaranteed to match the source data at all times. Someone could update one of your date columns and not the other.

I am confused as to how a computed column slows down your data loads 'drastically'. Also, simple scalar calculations in the SELECT part of an INSERT are typically so much cheaper than the INSERT operation that it makes no odds.
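
For illustration, a sketch of a persisted computed column (table and column names are placeholders):

CREATE TABLE dbo.Events
(
    event_id   INT IDENTITY PRIMARY KEY,
    event_time DATETIME NOT NULL,
    -- Deterministic, precise expression: it can be persisted and indexed,
    -- and it always matches event_time
    event_date AS CONVERT(DATETIME, CONVERT(CHAR(8), event_time, 112), 112) PERSISTED
);

CREATE INDEX IX_Events_event_date ON dbo.Events (event_date);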




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #705644
Posted Tuesday, April 28, 2009 4:06 AM


SSCrazy Eights

Tao Klerks (4/28/2009)
Are you saying that the following statement would result in a better query plan? (or more reliably result in a good query plan?)

SELECT *
FROM SomeTable
WHERE (SomeColumn >= '2009-04-01' AND SomeColumn < '2009-04-16' )


Absolutely yes. Though you might need to add an OPTION (RECOMPILE) to prevent re-use of any previous plan, especially if forced parameterization is in effect.
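
For example, something along these lines (reusing the placeholder names from your example):

SELECT *
FROM dbo.SomeTable
WHERE SomeColumn >= '2009-04-01'
  AND SomeColumn <  '2009-04-16'
OPTION (RECOMPILE);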

Tao Klerks (4/28/2009)

If so, why would that be? Because in one case SQL Server can read the values and choose an appropriate query plan based on statistics / known value distributions, and in the other case it cannot?

Yes! See http://blogs.msdn.com/queryoptteam/archive/2006/03/31/565991.aspx

Tao Klerks (4/28/2009)

In that case, does adding a function call inline break it?

Why don't you test it and tell everyone what you find?

Tao Klerks (4/28/2009)

Is there any reference to this issue / mechanism anywhere online that I can look at?

The link above is to a blog entry from the SQL Server Query Optimization Team.
Google 'SQL Server 2005 parameter sniffing' and you will turn up many good articles, including some on this site.

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #705646
Posted Tuesday, April 28, 2009 4:38 AM
Forum Newbie

What if one just manually adds the times to the SQL query?

I have been doing something like this:

" WHERE DATEFIELD BETWEEN " & @DATE1 & " 00:00:00" & " AND " & @DATE2 & " 23:59:59.997"

I know this works (please ignore any typos or syntax errors - I have not checked it), and I know it uses the indexes.
Why would something like this not be advisable?
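
Concatenated out, the query I build ends up roughly like this sketch (DATEFIELD, the table, and the dates are just placeholders):

SELECT *
FROM dbo.SomeTable
WHERE DATEFIELD BETWEEN '2009-04-01 00:00:00' AND '2009-04-15 23:59:59.997';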

regards

Dawid

Post #705661
Posted Tuesday, April 28, 2009 4:40 AM
Forum Newbie

Paul White (4/28/2009)
It sounds as if you are quoting percentage batch costs from the execution plan.
The results were from batch queries, 100 each with random dates for each iteration; I just rounded the results up because I was more interested in the severe differences between what was posted in the article and the other suggestions.

Paul White (4/28/2009)
I am confused as to how a computed column slows down your data loads 'drastically'. Also, simple scalar calculations in the SELECT part of an INSERT are typically so much cheaper than the INSERT operation that it makes no odds.
You're right, I made a gross generalization and didn't really elaborate on what I was thinking. Computed columns have no impact on loads; they're only computed for indexing purposes or during queries, which is the same impact you'd see from bulk updating. In this instance, since truncating a date using FLOOR is deterministic, it would actually be OK to use a computed column with an index, as opposed to computing the value and sticking it in another column.

What I was really aiming for, however, is that if the principal query against your table is going to be on date ranges, and you're looking to avoid the time portion, eliminate it from the equation altogether. Calling UDFs or math functions inside a query, although optimized by SQL Server itself, will still be slower than computing the date portion once and querying against it multiple times. If you have to do the calculation more than once, you're just wasting the server's time by making it repeat itself; compute it and save it, as the disk cost is negligible compared to the performance gains you'll achieve.
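
As a sketch, the compute-once-and-store pattern I mean is roughly this (table and column names are placeholders):

-- Back-fill the date-only column in one set-based pass after the load
UPDATE dbo.LoadedData
SET    load_date = CONVERT(SMALLDATETIME, CONVERT(CHAR(8), load_time, 112), 112)
WHERE  load_date IS NULL;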
Post #705662
Posted Tuesday, April 28, 2009 4:51 AM
Ten Centuries

Thanks Paul!

You've just added another dimension of complexity to SQL performance tuning for me; now I have to go away and think about this. :)

The things that are most unclear for me now are:

1) How are things different (if at all) in SQL 2000? The article targets 2005, but mentions procs in 2000...

2) What is the relationship between query plan caching and parameter sniffing, eg in a stored procedure? Isn't parameter sniffing essentially doing the opposite of query plan caching? Are there circumstances under which "sniffed" parameters will cause a stored procedure to be recompiled? (or can multiple query plans be cached for a single proc, and be selected based on sniffed parameters?)

I don't expect you to "answer" this, I understand it's probably food for hours of investigation and/or discussion; I'll save this for my "cool stuff to look at when I have extra time on my hands (never)" pile.




http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Post #705669
Posted Tuesday, April 28, 2009 5:14 AM


SSCrazy Eights

Tao Klerks (4/28/2009)
You've just added another dimension of complexity to SQL Performance tuning for me, now I have to go away and think about this. :)

The good news is that learning never stops. I start to worry if I don't come across something I didn't know every day...

Tao Klerks (4/28/2009)

1) How are things different (if at all) in SQL 2000? The article targets 2005, but mentions procs in 2000...

SS2K had the same issue; see http://blogs.msdn.com/khen1234/archive/2005/06/02/424228.aspx
There are more options for working around this behaviour in 2K5 and 2K8 (which introduces OPTIMIZE FOR ... UNKNOWN).
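
As a sketch of the 2K8 option (procedure, table, and parameter names are placeholders):

CREATE PROCEDURE dbo.GetRowsByDateRange
    @StartDate DATETIME,
    @EndDate   DATETIME
AS
BEGIN
    SELECT *
    FROM dbo.SomeTable
    WHERE SomeColumn >= @StartDate
      AND SomeColumn <  @EndDate
    OPTION (OPTIMIZE FOR UNKNOWN);  -- optimize from average density rather than sniffed parameter values
END;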

Tao Klerks (4/28/2009)

2) What is the relationship between query plan caching and parameter sniffing, eg in a stored procedure? Isn't parameter sniffing essentially doing the opposite of query plan caching? Are there circumstances under which "sniffed" parameters will cause a stored procedure to be recompiled? (or can multiple query plans be cached for a single proc, and be selected based on sniffed parameters?)

See http://sqlblog.com/blogs/kalen_delaney/archive/2007/07/30/did-you-know-estimated-vs-actual-plans.aspx for information which, with a little thought, will allow you to answer this for yourself.

Enjoy!

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #705676