Quick "Does Data Exist for a day" query

  • Greeting!

    I have a table with trend data that is usually collected on 5 second intervals, and it is put into the database by a program that is run once a day.  I wrote a stored procedure to sift through the data and find missing *days*, but it churns way too much for me (each day has about 3k rows of data, and all I want is a "did May 10 happen").

    The report looks like this (200 second later):

    Fairburn 2003-01-01 00:00:00.000 NULL

    Fairburn 2004-02-25 00:00:00.000 1

    Fairburn 2004-05-20 00:00:00.000 NULL

    Here's the code:

    create table #t1 (col int IDENTITY(1,1), dt datetime)

    create table #t2 (col int IDENTITY(2,1), dt datetime)

    create table #t3 (PGID int, StartDate datetime, DaysMissing int)

    insert into #t1 (dt)

    SELECT  distinct cast(convert(varchar(10), point_data_time, 101) as datetime)

     FROM         point_data inner join point_definition on point_data.point_definitionID = point_definition.point_definitionID

     WHERE        point_groupID in (select * from dbo.sublevel(@TopPgid)) order by cast(convert(varchar(10), point_data_time, 101) as datetime)

    insert into #t2 (dt)

    SELECT  dt from #t1 order by dt

    insert into #t3

    select @toppgid, min(#t2.dt) as StartDate, NULL as DaysMissing from #t2

    UNION

    select @toppgid, #t2.dt as startDate, datediff(dd,#t2.dt,#t1.dt)-1 as DaysMissing from #t1 left join #t2 on #t1.col = #t2.col where datediff(dd,#t2.dt,#t1.dt)-1>0

    UNION

    select @toppgid, max(#t2.dt) as StartDate, NULL as DaysMissing from #t2

    delete from #t1

    delete from #t2

    Fetch next from cur into @toppgid

    END

    select Group_Name as Site, StartDate, DaysMissing from #T3 inner join point_groups on #T3.PGID=point_groups.point_groupid

    drop table #t1

    drop table #t2

    drop table #t3

     

    TIA,

    Thor Johnson

     

  • Hello Thor,

    I don't fully understand the structure and the way you want to display missing days, but I'll try. I see there is some cursor used to fetch rows... well, that in itself could slow down the process. I try to avoid cursors whenever possible, are you sure you need it here?

    If the data in #t1 are ordered by dt ASC, it would be possible to skip the #t2, if you use a self join on #t1, something like that :

    select #t1.* , datediff(dd,q.dt, #t1.dt)-1

    from #t1

    JOIN #t1 q ON q.col = #t1.col-1

    where datediff(dd,q.dt, #t1.dt) > 1

    Otherwise, I guess that it should be possible to find a better way... First thing I would try is to create a reference table (e.g. dbo.calendar) containing all possible dates, and then make a left join from it to #t1... or maybe directly to your source data table? That depends, try which is better for you.

    SELECT cal.dt

    FROM dbo.calendar cal

    LEFT JOIN #t1 ON cal.dt = #t1.dt

    WHERE #t1.col IS NULL

    Hope that will help a bit, although I'm not sure I correctly understood everything in your description... I just tried to hint, in which direction to look, not to solve the problem - I would need more info even to attempt that . Cheers, Vladan

  • Hi Thor.

    Vladan makes a good point about not understanding your requirements here. I don't understand them either. It's helpful to all of us when you are posting a question like this to make your problem and requirements as clear as possible; and it's also helpful to include not only the table definitions related to the question, but also the corresponding inserts that load test data into the tables. That makes all of our jobs faster and easier.

    Hope to see a clarification in reply so we can help you.

    Eaton

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply