
A Faster BETWEEN Dates
Posted Monday, November 1, 2010 5:28 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, October 22, 2012 8:55 AM
Points: 124, Visits: 180

We have a number of servers that often run into tight-memory situations, and we have tested the above approach quite exhaustively. The frequent creation and destruction of temporary tables can make a sub-second stored procedure take over 30 seconds while it waits for object memory to become available. If this query were run frequently on a busy server, I'd expect to see it slow down dramatically. If the server is idling, we found this approach can help in certain situations; however, a good index is just as fast.

If your app is running very slowly because of date comparisons, a good solution is to add another int column, index it, and copy the dates into it. Then cast the current date to int and query the new column(s) instead. Everything runs super-fast, especially if the table is transactional, since the index fill factor can be set to 100% to keep it compact and reduce IOs. The increase in table size is a good trade-off: a little more hard-disk usage versus the memory usage of table variables.
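A minimal sketch of the idea, assuming a hypothetical table dbo.WorkTable with a datetime column Work_Start (all names here are invented). A persisted computed column is one way to keep the copied int column in sync automatically:

[code]
-- Hypothetical names throughout; a sketch, not production code.
-- Store the date as a yyyymmdd int via a persisted computed column.
-- CONVERT style 112 is deterministic, so the column is indexable.
ALTER TABLE dbo.WorkTable
    ADD Work_Start_Key AS
        CAST(CONVERT(char(8), Work_Start, 112) AS int) PERSISTED;

CREATE NONCLUSTERED INDEX IX_WorkTable_WorkStartKey
    ON dbo.WorkTable (Work_Start_Key)
    WITH (FILLFACTOR = 100);  -- compact pages, fewer IOs

-- Cast the current date to int once, then seek on the new column.
DECLARE @FromKey int;
SET @FromKey = CAST(CONVERT(char(8), DATEADD(day, -30, GETDATE()), 112) AS int);

SELECT wt.*
FROM dbo.WorkTable AS wt
WHERE wt.Work_Start_Key >= @FromKey;
[/code]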

:)



Post #1013752
Posted Monday, November 1, 2010 7:01 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Today @ 1:08 PM
Points: 174, Visits: 446
Greetings All,

Time constraints have kept me from replying to your messages individually, so I will try to shed some light here.

The calendar and temporary tables were used in this example only to create test data, nothing more. In my case, I am applying this method to a regular data table that has a start date and an end date in it.

The environment is an older (slower) server running SQL 2005. The database and table layout are not optimal, but I did not design them, and changing them is not always an option.

Since I can't simply change things for the better, I have to find more inventive ways to make them work, which is how I came up with this start date check plus a boolean flag for the end date check.
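In rough outline the idea looks like this; the names are placeholders, not my real tables, and this is a simplified reconstruction rather than the exact script from the article:

[code]
-- Placeholder names; a simplified reconstruction of the idea.
-- Join on the start date check alone, and report the end date
-- check as a boolean-style flag instead of a second join predicate.
SELECT wt.WorkID,
       ct.CalDate,
       CASE WHEN ct.CalDate <= wt.Work_End
            THEN 1 ELSE 0 END AS EndDateOkFlag
FROM dbo.WorkTable AS wt
LEFT JOIN dbo.CalTable AS ct
       ON ct.CalDate >= wt.Work_Start;
[/code]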

Run this on a faster server, or on SQL 2008, and yes, these speed gains may disappear. But when you walk into a database situation not of your own design, this may help you.

I had tried different ways of applying an index to the data, but none of them gave me the speed enhancement that this method did. But then, maybe I just did not get the order of the fields correct or something. I will look at the suggestions listed here to see if any of them work too.

I enjoy learning new and different ways of solving a problem. I can share the ways I have solved problems here and learn from others who have better knowledge than I do. I am certainly not too proud to admit that my script may stink, but this trick may still benefit someone, as may the other suggestions given.

Have a good day.

Terry Steadman
Post #1013793
Posted Monday, November 1, 2010 7:47 AM
SSC Journeyman

Group: General Forum Members
Last Login: Yesterday @ 10:53 AM
Points: 80, Visits: 728
terrance.steadman (11/1/2010)
I enjoy learning new and different ways of solving a problem. I can share the ways I have solved problems here and learn from others who have better knowledge than I do. I am certainly not too proud to admit that my script may stink, but this trick may still benefit someone, as may the other suggestions given.


I personally like to compare the author's logic with logic I've used to solve similar situations. SQL Server as a common ground makes for better reading and understanding than, say, reading about or even watching a chess match between two other players (*yawn*). As much as I like to play chess, I don't seem to learn much from watching a match I'm not playing in (probably an attention thing). Put this reply in the "for what it's worth" category.

Thanks for the article.
Post #1013827
Posted Monday, November 1, 2010 8:01 AM
Valued Member

Group: General Forum Members
Last Login: Saturday, November 27, 2010 10:41 PM
Points: 58, Visits: 113
First of all, did you compare the query plans of the two approaches, or flush the caches between runs?

Secondly, why use a boolean flag when the approach below is simpler and should yield the same plan? (I'll try to verify that with your test data.)

[code]
LEFT JOIN @CalTable AS ct
       ON wt.Work_Start <= ct.CalDate
WHERE ct.CalDate <= wt.Work_End
[/code]

Third, as Carlo points out, your approach is not identical to the original query: it partially transforms a left join into an inner join, which may go a long way toward explaining why it ran faster.
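To make that third point concrete (illustrative names only): a predicate on the left-joined table in the WHERE clause throws away the NULL-extended rows, which is exactly inner-join behaviour, whereas keeping it in the ON clause preserves the outer join.

[code]
-- Illustrative names. These two queries are NOT equivalent.

-- (a) True outer join: unmatched wt rows survive with ct.CalDate = NULL
SELECT wt.WorkID, ct.CalDate
FROM dbo.WorkTable AS wt
LEFT JOIN dbo.CalTable AS ct
       ON ct.CalDate >= wt.Work_Start
      AND ct.CalDate <= wt.Work_End;

-- (b) WHERE filter on ct discards the NULL rows: effectively an INNER JOIN
SELECT wt.WorkID, ct.CalDate
FROM dbo.WorkTable AS wt
LEFT JOIN dbo.CalTable AS ct
       ON ct.CalDate >= wt.Work_Start
WHERE ct.CalDate <= wt.Work_End;
[/code]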
Post #1013838
Posted Monday, November 1, 2010 8:01 AM


SSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 12:20 PM
Points: 4,356, Visits: 6,188
1) The fact that table variables are used in the demos likely creates execution plans that will not play out in the real world, where actual tables are used. Table variables usually cause the optimizer to create vastly different plans.

2) Personally, I like dynamic SQL (when available) for widely varying date inputs, because the optimizer always gets the best possible chance to create the optimal plan for each execution. This works even better if the "optimize for ad hoc workloads" option is enabled. A sketch of what I mean follows.
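With invented names (dbo.WorkTable and its columns are placeholders): parameterized dynamic SQL through sp_executesql, plus OPTION (RECOMPILE) so the optimizer compiles against the actual date values each time.

[code]
-- Placeholder names throughout; a sketch, not production code.
DECLARE @sql nvarchar(max);
DECLARE @from datetime, @to datetime;
SET @from = '20100101';
SET @to   = '20101101';

-- Build the statement once; parameters keep it injection-safe.
SET @sql = N'SELECT wt.WorkID, wt.Work_Start, wt.Work_End
             FROM dbo.WorkTable AS wt
             WHERE wt.Work_Start >= @p_from
               AND wt.Work_Start <  @p_to
             OPTION (RECOMPILE);';  -- fresh plan for these exact values

EXEC sp_executesql @sql,
     N'@p_from datetime, @p_to datetime',
     @p_from = @from, @p_to = @to;
[/code]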


Best,

Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru at GMail
Post #1013839
Posted Monday, November 1, 2010 8:40 AM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Today @ 1:08 PM
Points: 174, Visits: 446
Greetings All,

Ok, some of you are complaining about the use of the LEFT JOIN. Please allow me to explain. When I run reports on this data, I am asked to show which records fall into certain date ranges and which fall into others. The final set of records ends up carrying flags for reporting purposes. No record is to be excluded just because it fails one date check, so I can't simply throw rows away with a regular (inner) JOIN. A record that does not match one criterion may match another and so must still be included.

This example used temporary tables and created a calendar table; my regular script does NOT! That was done only to show the logic behind the idea and to create some test data. If you want to test this more thoroughly, you will need to create your own regular table with a start date, an end date, and a ton of records. I also have to be careful what I post for scripts and data. As many of you already know, security and privacy are very important, so I need to show this with completely different tables and names while still showing the logic. I can't show you the real scripts or real data; they have to be completely different, or else I get into serious trouble (losing my job and/or jail time). I really don't want to have to create a whole test database just to show one idea; if I did, it would make the attachments on my articles a lot larger.

I am also dealing with 4 or 5 different tables that are linked together for my reports. These tables use primary keys that range from 4 to 7 fields in size, which is not a lot of fun to write scripts against, in my book. But I inherited this database from the prior DBA when I took this job. I can't just change the layout of the database as I wish; I have to take what exists and make it work. It's like when the street crew has your main highway closed for construction and you have to find a way through the back roads to get to work. Your boss usually doesn't care about the transportation hurdles; he just wants you at your desk on time. You can't use the fast highway you would like, so you have to be inventive with what you have.

This logic was only a small portion of a much larger script for a major report. But it was also the biggest slowdown in that report (the BETWEEN dates). This one change made the report run a lot faster; everything else I had tried, including adding several different indexes, did not. The report had to analyze the data on several levels, so it was not easy to find just the right index. When I did find a way that worked very well, I wanted to share the idea to help others too. Showing the original script would not have been useful, as this one idea would have been lost in the complexity of the entire thing; splitting it out into a smaller script to highlight the idea and adding some test data was hopefully better. If you have ideas on how to write this better, then fine, please share. I will always be a learning DBA and am willing to accept constructive advice.

Have a good day.

Terry Steadman


Post #1013867
Posted Monday, November 1, 2010 8:44 AM
Valued Member

Group: General Forum Members
Last Login: Thursday, July 28, 2011 8:03 AM
Points: 70, Visits: 316
GPO (11/1/2010)

If it were your job to design these tables, would you be tempted to make the end_date_time column NOT NULL and instead substitute a ridiculous future date for all events that have not yet ended (like 22000101)? Are there indexing or other performance advantages to making your end_date_time column NOT NULL?


I would not be the least bit tempted to do that. Fake data is the bane of my existence.
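If the motivation for the sentinel is query performance, one alternative (just a sketch with invented names, not code from the article) is to keep end_date_time NULLable and handle the open-ended rows explicitly in the predicate:

[code]
-- Invented names; sketch only. A NULL end date means "still open".
DECLARE @asof datetime;
SET @asof = GETDATE();

SELECT e.EventID
FROM dbo.Events AS e
WHERE e.start_date_time <= @asof
  AND (e.end_date_time >= @asof OR e.end_date_time IS NULL);
[/code]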
Post #1013872
Posted Monday, November 1, 2010 9:04 AM
Valued Member

Group: General Forum Members
Last Login: Saturday, November 27, 2010 10:41 PM
Points: 58, Visits: 113
terrance.steadman (11/1/2010)
Ok, some of you are complaining about the use of the LEFT JOIN. Please allow me to explain. When I run reports on this data, I am asked to show which records fall into certain date ranges and which fall into others. The final set of records ends up carrying flags for reporting purposes. No record is to be excluded just because it fails one date check, so I can't simply throw rows away with a regular (inner) JOIN.

But you ARE doing a "regular" join in your "new" query. You're left-joining on one end of the date range and inner-joining on the other. That's the entire point: the two queries don't yield identical results in all cases. Are you surprised they have different run times? The entire test (and the story resulting from it) is meaningless.

Does anyone verify these stories before they're posted?



Post #1013883
Posted Monday, November 1, 2010 9:05 AM
SSC Rookie

Group: General Forum Members
Last Login: Friday, June 14, 2013 7:41 AM
Points: 30, Visits: 81
Hi Terry,

Thanks for the article!

In a data warehousing environment with large data volumes, I am always looking for ways to improve query performance. I will have to put your approach to the test and see how it works.

Thanks again.


Moe M
Database Consultant
http://www.cubeangle.com
Post #1013885
Posted Monday, November 1, 2010 9:10 AM
Ten Centuries

Group: General Forum Members
Last Login: Thursday, August 21, 2014 12:08 PM
Points: 1,414, Visits: 4,540
I know it's a big no-no, but I have a few tables for logging/admin duties, and I just created the clustered index on the date column of those. Most of the toughest queries on those tables select the last few days of data, or data between two dates. Even when that's not the case, putting the clustered index on the date column also causes records that are joined between two tables to sit on the same pages, making the join faster as well.

So why not just cluster the date column instead of creating a date table? It doesn't work in all cases, such as when you need to calculate specific weeks and days, but if you need to select a few days of data it's a pretty good solution (roughly the shape sketched below).
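For what it's worth, with invented names:

[code]
-- Invented names. Clustered index on the log date; duplicates are fine
-- (SQL Server adds a hidden uniquifier to non-unique clustered keys).
CREATE CLUSTERED INDEX CIX_EventLog_LogDate
    ON dbo.EventLog (LogDate);

-- "Last few days" queries become a single clustered range scan.
SELECT el.EventID, el.LogDate
FROM dbo.EventLog AS el
WHERE el.LogDate >= DATEADD(day, -3, GETDATE());
[/code]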


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
http://live.xbox.com/en-US/MyXbox/Profile
Post #1013890