List Problem...

  • Hi

    "Half the battle is defining the problem!"

    I'm running MS SQL 8.00.2039 .

    I have a table that amongst other things contains dates.

    What I need to do is produce a map of the table such that I can locate "gaps" in the dates in each record. For example:

    The records in the table may be

     1 Jan 2004

     2 Jan 2004

     3 Jan 2004

     6 Jan 2004

     7 Jan 2004

     8 Jan 2004

    10 Jan 2004

    What I need to know is that 4, 5 & 9 Jan are not in the table. In real life the table contains about 5 million records.

    Can anyone think of a query / method that will tell me what I need to know?

    Thanks in advance.

     

     

  • On way of doing this would be to create a temp table with every date in your date range in it - so in your case:

    RefTable:

    1 Jan 2004

    2 Jan 2004

    3 Jan 2004

    4 Jan 2004

    5 Jan 2004

    6 Jan 2004

    7 Jan 2004

    8 Jan 2004

    9 Jan 2004

    10 Jan 2004

     

    The do somthing like:

    SELECT Reftable.Date FROM RefTable LEFT JOIN yourTable ON Reftable.Date = yourTable.Date WHERE yourtable.Date = NULL

    It might be better to distinct first (especially if you have an index on your date col):

    IE

    SELECT RefTable.Date

    FROM RefTable

    LEFT JOIN

    (SELECT DISTINCT Date FROM yourTable) [dates]

    ON RefTable.Date = dates.Date WHERE dates.Date IS NULL

    Hope this helps,

    - James

    --
    James Moore
    Red Gate Software Ltd

  • To go along with James Moore's suggestion about creating a temp table of all the dates in the range...

    There are a lot of folks that keep an "auxilliary calendar table" handy... good for lot's of different types of calculations having to do with dates and this would be one of them. Google "auxilliary calendar table" or do a search on this forum for lot's of great scripts to make one.  Michael Valentine Jones has a great one.

    A lot of my "customers" have a difficult time accepting the bit of disk space (it really doesn't take that much) for such a table.  For some reason though, I'm usually able to convince them that a "Tally" table (a table with a single column of well indexed numbers) is worth something (lot's of different uses including this one) and is similar to the temp table James mentioned but is a permanent table.  Here's how to make one...

    --===== Create and populate the Tally table on the fly

     SELECT TOP 11000 --equates to more than 30 years of dates

            IDENTITY(INT,1,1) AS N

       INTO dbo.Tally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.Tally

            ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)

    --===== Allow the general public to use it

      GRANT SELECT ON dbo.Tally TO PUBLIC

    Ok... what's that got to do with dates?  Try this...

    DECLARE @MyMinDate DATETIME

    DECLARE @MyMaxDate DATETIME

        SET @MyMinDate = '1997-06-01'

        SET @MyMaxDate = '2007-05-31'

     SELECT @MyMinDate + (N-1)

       FROM dbo.Tally

      WHERE N<= @MyMaxDate-@MyMinDate+1

    Viola!  Instant range of dates.  And, since the Tally table has 11,000 prefabricated numbers from 1 to 11,000, you get more than 30 years worth of possible dates in any date range.  If that's not good enough, consider that 300 years worth of dates would only take about 110,000 numbers... a Tally table with that number of entries takes less than 1.5 mega-bytes to store and only a second to build...

    --===== Create and populate the Tally table on the fly

     SELECT TOP 110000 --equates to more than 300 years of dates

            IDENTITY(INT,1,1) AS N

       INTO dbo.Tally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

      ALTER TABLE dbo.Tally

            ADD CONSTRAINT PK_Tally_N PRIMARY KEY CLUSTERED (N)

    --===== Allow the general public to use it

      GRANT SELECT ON dbo.Tally TO PUBLIC

    EXEC dbo.sp_SpaceUsed Tally

    If you can't even get a Tally table in, MVJ also has some dandy functions that are lightning quick that can be used in a FROM clause... but I'll let him tout his own wares   Hopefully, he'll see this and chime-in.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok Jeff, with an invitation like that, I have to post.

    You can use function F_TABLE_DATE from the link below.  You can use it directly, or use it to load a permanent calendar table.  If you have a lot of need to use a calendar table, I recommend making it a permanent table for performance reasons.

    Date Table Function F_TABLE_DATE:

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=61519

    This link has links to a lot of other datetime scripts and other information about dates in SQL Server.

    Date/Time Info and Script Links

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=64762

     

    As you can see, the query to find the missing date is a fairly simple left join.

    -- Find Missing Dates in table MyTable
    select
     a.Date
    from
     -- Dates for range 2000-01-01 to 2010-12-31
     -- Adjust range as needed for your application
     dbo.F_TABLE_DATE ( '20000101','20101231' ) a
     left join
     MyTable b
     on a.Date = b.MyDate
    where
     b.MyDate is null
    order by
     a.Date
     
     
     
     
  • Excellent, I''ll give that a try. Many thanks to you both.

     

  • Thanks for the feedback, Therion... it means a lot to us that someone is actually looking at the answers.  Appreciate it much!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply