Adding empty rows for items which do not exist

  • Hi Chris

    Do you have a tally (numbers) table? If so, try this:

    [font="Courier New"]DECLARE @StartDate DATETIME, @EndDate DATETIME

    SET @StartDate = 'Nov 01 2008'

    SET @EndDate = 'Dec 01 2008'

    SELECT DATEADD(DAY, n.number-1, @StartDate) AS CalcLogDate, l.LogDate

    FROM Numbers n

    LEFT JOIN #LogEntry l ON l.LogDate = DATEADD(DAY, n.number-1, @StartDate)

    WHERE DATEADD(DAY, n.number-1, @StartDate) < @EndDate

    [/font]

    If not, read this:

    http://www.sqlservercentral.com/articles/TSQL/62867/

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Chris, thanks for the help!

    My table looks like this (simplified):

    CREATE TABLE [dbo].[LogEntry](

    [ID] [bigint] IDENTITY(1,1) NOT NULL,

    [LogDate] [smalldatetime] NOT NULL,

    [Message] [text] NOT NULL,

    [ProcessID] [nvarchar](50) NULL

    )

    Some test data:

    SET IDENTITY_INSERT LogEntry ON

    INSERT INTO LogEntry (ID, LogDate, [Message], ProcessID)

    SELECT '669', '2008-11-19 10:21:00', 'LogMessage', 'a3d655dd-9470-4de4-944e-a862edece434' UNION ALL

    SELECT '670', '2008-11-19 10:21:00', 'LogMessage', 'a3d655dd-9470-4de4-944e-a862edece434' UNION ALL

    SELECT '671', '2008-11-19 10:21:01', 'LogMessage', 'a3d655dd-9470-4de4-944e-a862edece434' UNION ALL

    SELECT '672', '2008-11-19 10:21:01', 'LogMessage', 'a3d655dd-9470-4de4-944e-a862edece434' UNION ALL

    SELECT '673', '2008-11-19 10:21:01', 'LogMessage', 'a3d655dd-9470-4de4-944e-a862edece434' UNION ALL

    SELECT '674', '2008-11-19 11:15:00', 'LogMessage', 'd7f997ae-a8d3-43f2-aab6-d99e11d54889' UNION ALL

    SELECT '675', '2008-11-19 11:15:00', 'LogMessage', 'd7f997ae-a8d3-43f2-aab6-d99e11d54889' UNION ALL

    SELECT '675', '2008-11-19 11:15:01', 'LogMessage', 'd7f997ae-a8d3-43f2-aab6-d99e11d54889' UNION ALL

    SELECT '675', '2008-11-19 11:15:01', 'LogMessage', 'd7f997ae-a8d3-43f2-aab6-d99e11d54889'

    SET IDENTITY_INSERT LogEntry ON

    I have a new tally table called Tally with and column called N

    Heres my newly edited query:

    DECLARE @StartDate DATETIME, @EndDate DATETIME

    SET @StartDate = 'Nov 01 2008'

    SET @EndDate = 'Dec 01 2008'

    SELECT

    DATEADD(DAY, n.N-1, @StartDate) AS CalcLogDate,

    Count(DISTINCT(ProcessID)) As ProcessCount

    FROM Tally n

    LEFT JOIN LogEntry l ON l.LogDate = DATEADD(DAY, n.N-1, @StartDate)

    WHERE DATEADD(DAY, n.N-1, @StartDate) < @EndDate

    GROUP BY DATEADD(DAY, n.N-1, @StartDate)

    I get a record for each day, but zero in the ProcessCount column for every record, where I should have 2 in the record for the 19th.

    Any ideas?

    Thanks

  • Hi Chris

    It's because there's a time component in LogDate. Using a range takes care of this:

    [font="Courier New"]SELECT DATEADD(DAY, n.number-1, @StartDate) AS CalcLogDate,

            COUNT(DISTINCT(ProcessID)) AS ProcessCount

    FROM Numbers n

    LEFT JOIN #LogEntry l ON l.LogDate > DATEADD(DAY, n.number-1, @StartDate) AND l.LogDate < DATEADD(DAY, n.number, @StartDate)

    WHERE DATEADD(DAY, n.number-1, @StartDate) < @EndDate

    GROUP BY DATEADD(DAY, n.number-1, @StartDate)

    ORDER BY DATEADD(DAY, n.number-1, @StartDate)

    [/font]

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks very much, thats sopt on!

    For info, a collegue of mine came up with this alternative solution:

    declare @days as int

    declare @startDate as datetime

    declare @endDate as datetime

    set @startDate = '2008-11-01'

    set @endDate = '2008-12-01'

    set @days = datediff(dd, @startDate, @endDate)

    --print @days

    create table #tempVals

    (

    logdate varchar(11),

    numProcessID int

    )

    declare @i as int

    set @i = 0

    --temp table

    while(@i < @days)

    begin

    insert into #tempVals values

    (substring(cast(dateadd(dd, @i, @startDate) as varchar), 1, 11), 0);

    set @i = @i + 1;

    end

    SELECT

    count(distinct l.ProcessID) numProcesses,

    Substring(Cast(t.LogDate as nvarchar), 1, 11)

    FROM LogEntry l right outer join #tempVals t on

    Substring(Cast(l.LogDate as nvarchar), 1, 11) = t.logdate

    GROUP BY Substring(Cast(t.LogDate as nvarchar), 1, 11)

    drop table #tempVals;

    I'm going to run them both side-by-side for a few days until I have more entries in the LogEntry table and I can get a meaningfull timing out, although I'm pretty sure your version will be faster.

    Thanks very much for the help

  • Your colleagues version is exactly what I would have done pre tally-tables.

    Incidentally, this...

    Substring(Cast(l.LogDate as nvarchar), 1, 11) = t.logdate

    could be a performance killer, SQL Server won't know if a logdate value in l will match a value in t, or not, without first converting it.

    Here's how to set up the calendar table using a tally table:

    SELECT substring(cast(dateadd(dd, number, @startDate) as varchar), 1, 11) AS logdate,

    CAST(0 AS INT) AS numProcessID

    INTO #tempVals

    FROM Numbers n

    WHERE number <= @days

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 5 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply