August 31, 2009 at 7:43 pm
I have this table with zillions of record in it. There is a field(date_time) which contains data from 2002 till date and I want a subset of it (from 2008-09).
Even for a single day it returns about 30-40,000 rows.
Now, I've created an index to parse thru that data but even after that it take about 3-4 hours to process. Below is the query I am running.
select * into #test
from table a
where date_time between '01/07/2008' and '31/07/2009'
CREATE INDEX t1 ON #test(id)
CREATE INDEX t2 ON #test(DATE_TIME)
select * from #test b
where b.id= (select max(id) from #test t
where t.date_time = b.date_time)
order by date_time
Is there any way to speed up things.
Thanks.
August 31, 2009 at 8:01 pm
Try making the t2 index on DATE_TIME into a clustered index.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
August 31, 2009 at 8:33 pm
For starters, I'd like to know which steps are taking the most time. If you are doing a table scan of a zillion large rows, that could take a while right there. And your #temp table is going to be approximately 20-30% of the size of your primary table, right? (I'm assuming that daily volume has been roughly constant.) If we're averaging 35000 rows a day then 608 * 35000 = 21,280,000 rows to be loaded into the #temp table. Do you have a date_time index on the original table? Does it include the ID column? Does the execution plan for your query show that it's being used?
The following modifications to your code will tell us how many minutes each step is taking.
declare @timer datetime
set @timer = getdate()
select *
into #test
from table a
where date_time between '01/07/2008' and '31/07/2009'
select 'Populate #temp', datediff(minute,@timer,getdate()
set @timer = getdate()
CREATE INDEX t1 ON #test(id)
CREATE INDEX t2 ON #test(DATE_TIME)
select 'Create Indexes', datediff(minute,@timer,getdate()
set @timer = getdate()
select * from #test b
where b.id= (select max(id)
from #test t
where t.date_time = b.date_time)
order by date_time
select 'Run Final Query', datediff(minute,@timer,getdate()
set @timer = getdate()
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 31, 2009 at 8:40 pm
select * from #test b
where b.id= (select max(id) from #test t
where t.date_time = b.date_time)
order by date_time
If I'm reading this correctly, you are simply trying to get all the data for the row with the max(ID) for each date_time. Another way to get this is....
;with cte as (
select *,row_number() over (partition by date_time order by id DESC) as rowSeq
from table a
where date_time between '01/07/2008' and '31/07/2009'
)
select * from cte
where rowSeq = 1
order by date_time
Since you were ordering by date_time anyway, we should get the use of row_number() at very little added cost. You might even try changing your final ORDER BY to date_time, id DESC.
By the way, if you aren't using any of the columns from your original table, be sure to select only those columns in the CTE. (Elementary, I know, but it always bears repeating.)
Give it a try and let us know how it performs.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 31, 2009 at 9:00 pm
Sure hope you're feeling better, Barry. If not now, then soon.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 5 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply