SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Generating Missing Dates and Numbers


Generating Missing Dates and Numbers

Author
Message
jacob sebastian
jacob sebastian
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2688 Visits: 2523
GSquared,
I agree that a number table should be the first choice. The approach presented in this article is useful for cases when you do not want to use a number table.

Thanks for sharing the performance data. That was really helpful.

.
jacob sebastian
jacob sebastian
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2688 Visits: 2523
Great tip Jeff. Thank you for sharing this. I remember this discussion is coming up for the second time. I appreciate your comments and am glad to learn something new everytime I read your posts Smile

.
Mike DiRenzo
Mike DiRenzo
SSChasing Mays
SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)

Group: General Forum Members
Points: 601 Visits: 210
Agreed
Mike DiRenzo
Mike DiRenzo
SSChasing Mays
SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)SSChasing Mays (601 reputation)

Group: General Forum Members
Points: 601 Visits: 210
Superaltive article. Thanks.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)SSC Guru (215K reputation)

Group: General Forum Members
Points: 215851 Visits: 41981
GSquared (2/6/2008)
Yet again, I have to say this is clever, but a Numbers table is better.

Yes, if you for some reason need to create a list of sequential numbers on the fly, this CTE is probably the fastest means of doing so. It certainly is at least A fast means of doing so. But having an actual Numbers table, with a clustered index on it, is MUCH better.

(Yes, this is the third or fourth time I've written on this exact subject. But for whatever reason, it keeps coming up.)


Spot on, G... didn't see your post as I was writting mine.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
GSquared
GSquared
SSC Guru
SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)

Group: General Forum Members
Points: 57995 Visits: 9730
You can also find missing dates with a Numbers table more easily than is outlined in the article.

To set up the test:


create table MissingDates (
Date datetime primary key)
go
insert into dbo.missingdates(date)
select dateadd(second, subtime.number, subdate.date)
from
(select dateadd(day, number, '1/1/08') as Date
from common.dbo.Numbers
where number <= 100) SubDate
cross join
(select number
from common.dbo.BigNumbers
where number <= (3600 * 24) - 1) SubTime
go
delete from dbo.missingdates
where date between '1/12/08' and '1/13/08'
go
delete from dbo.missingdates
where date between '1/30/08' and '1/31/08'



This time, I tried my usual Numbers table method:


select dateadd(day, number, '1/1/08')
from common.dbo.numbers
left outer join dbo.missingdates
on date >= dateadd(day, number, '1/1/08')
and date < dateadd(day, number + 1, '1/1/08')
where date is null
number between 0 and 31



------------------
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.

(2 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MissingDates'. Scan count 32, logical reads 5638, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Numbers'. Scan count 3, logical reads 52, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 688 ms, elapsed time = 687 ms.
------------------

Then I tried the method recommended in the article:


SELECT
CAST('2008-01-01' AS DATETIME) + Number-1
FROM dbo.GetNumbers(1, 30)
where CAST('2008-01-01' AS DATETIME) + Number-1 not in
(select cast(convert(varchar(100), date, 101) as datetime)
from dbo.missingdates)



(The cast-convert is necessary because without that, it will simply detect missing midnights, not whole missing days.)

I killed that process after it had been running for 3 minutes. Not sure how long it would have taken in all.

Converted it to:


select dateadd(day, number, '1/1/08')
from dbo.getnumbers(1,31)
left outer join dbo.missingdates
on date >= dateadd(day, number-1, '1/1/08')
and date < dateadd(day, number, '1/1/08')
where date is null



Which is pretty much identical to the query used with the Numbers table.

----------------
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.

(2 row(s) affected)
Table '#28D10FF3'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MissingDates'. Scan count 31, logical reads 5418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 673 ms, elapsed time = 366 ms.
-----------------

This time, the total elapsed time was shorter, but the CPU time was nearly identical. One less scan count on the MissingDates table, and the server was able to split the process across more than 1 CPU (the computer I'm testing this on is a dual-core Pentium D).

Some advantage to the CTE, but in a loaded up server, it won't make much difference.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Jerry Hung
Jerry Hung
Hall of Fame
Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)

Group: General Forum Members
Points: 3468 Visits: 1208
Weird, your CTE code took 16 seconds (twice) on my SQL Server
that is not VERY VERY fast. Anyway, we don't have the need for it, I was just curious

16 seconds on a SQL Server on VMWare, and 16 seconds on a physical SQL Server (8 cpu, 24GB RAM)


WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L4 AS B),--4294967296 rows
num AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
SELECT N FROM NUM WHERE N <= 1000000;



SQLServerNewbie

MCITP: Database Administrator SQL Server 2005
Dennis D. Allen
Dennis D. Allen
SSC Veteran
SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)SSC Veteran (201 reputation)

Group: General Forum Members
Points: 201 Visits: 163
What do folks think about adding directly to a datetime and allowing implicit conversions to take place?


CREATE FUNCTION dbo.DateRange( @start DATETIME, @length INT ) 
RETURNS @range TABLE (
dateOf datetime PRIMARY KEY
)
AS
BEGIN

-- Populate the output table of dates using a number table
INSERT INTO @range ( dateOf )
SELECT
@start + n AS dateOf
FROM (
SELECT (n1.n +n10.n +n100.n +n1000.n) AS n
FROM (SELECT 0 AS n UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) AS n1
CROSS JOIN (SELECT 0 AS n UNION SELECT 10 UNION SELECT 20 UNION SELECT 30 UNION SELECT 40 UNION SELECT 50 UNION SELECT 60 UNION SELECT 70 UNION SELECT 80 UNION SELECT 90) AS n10
CROSS JOIN (SELECT 0 AS n UNION SELECT 100 UNION SELECT 200 UNION SELECT 300 UNION SELECT 400 UNION SELECT 500 UNION SELECT 600 UNION SELECT 700 UNION SELECT 800 UNION SELECT 900) AS n100
CROSS JOIN (SELECT 0 AS n UNION SELECT 1000 UNION SELECT 2000 UNION SELECT 3000 UNION SELECT 4000 UNION SELECT 5000 UNION SELECT 6000 UNION SELECT 7000 UNION SELECT 8000 UNION SELECT 9000) AS n1000
WHERE (n1.n +n10.n +n100.n +n1000.n) BETWEEN 0 AND @length - 1
) AS numbers
ORDER BY
n ASC;


RETURN;
END


Tony McGarry
Tony McGarry
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 25
I prefer this simple and elequent way to generate numbers.
I doesn't generate unnecessary numbers and is easy to understand


CREATE FUNCTION dbo.GetSequence
(
@Start BIGINT,
@End BIGINT,
@Increment BIGINT
)
RETURNS @ret TABLE(Number BIGINT)
AS

BEGIN

WITH
seq(num)
as
(
select @Start
union all
select num + @Increment from seq
where num + @Increment <= @End
)
INSERT INTO @ret(Number)
Select * From Seq

END

Matt Miller (4)
Matt Miller (4)
One Orange Chip
One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)One Orange Chip (29K reputation)

Group: General Forum Members
Points: 29607 Visits: 19006
Jeff/Jacob:

first - Jacob - very nice. Good solid method for handling a common request.

Jeff - I'll chalk this up to being the forever contrarian... But the comparison gets more interesting when you "right-size" the Itzik method. Notice what one tiny little change does (since we KNOW how many results we want....)


SET STATISTICS TIME ON
GO
DECLARE @BitBucket INT
--=============================================================================
PRINT REPLICATE('=',100)
PRINT 'Itzek''s method:'
;WITH
L0 AS (SELECT 1 AS C UNION ALL SELECT 1), --2 rows
L1 AS (SELECT 1 AS C FROM L0 AS A, L0 AS B),--4 rows
L2 AS (SELECT 1 AS C FROM L1 AS A, L1 AS B),--16 rows
L3 AS (SELECT 1 AS C FROM L2 AS A, L2 AS B),--256 rows
L4 AS (SELECT 1 AS C FROM L3 AS A, L3 AS B),--65536 rows
L5 AS (SELECT 1 AS C FROM L4 AS A, L0 AS B),--131072 rows
num AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS N FROM L5)
SELECT @BitBucket = N FROM NUM WHERE N <= 1000000;

--=============================================================================
PRINT REPLICATE('=',100)
PRINT 'Jeff Moden''s Method'
; WITH cTally AS
(-----------------------------------------------------------------------------
--==== High performance CTE equivalent of a Tally or Numbers table
SELECT TOP (1000000)
ROW_NUMBER() OVER (ORDER BY t1.ID) AS N
FROM Master.sys.SysColumns t1
CROSS JOIN Master.sys.SysColumns t2
)-----------------------------------------------------------------------------
SELECT @BitBucket = N FROM cTally --Do your outer join with table being checked here
PRINT REPLICATE('=',100)


Can you spot the difference (it's in bold....hehe)?

Of course - we could debate why we're fighting so hard to reclaim 200ms or less, but - what's the fun in that?

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search