• Jeff Moden - Friday, October 20, 2017 7:47 PM

    While we're at it... when all this first started and while I was waiting for your Tally function, I cranked this out to generate test data.  I normally resort to the Cross Join type of pseudo-cursor for forum work so that people don't have to worry about having a Tally function or table.  It's what I've been using to test with.


    --===== Test data control variables and presets
    DECLARE  @LoDT DATETIME = '2000' --Inclusive
            ,@HiDT DATETIME = '2030' --Exclusive
            ,@MaxSpan INT   = 90 --Max number of days in period
            ,@Rows INT      = 1000000
            ,@Days INT

    ;
     SELECT @Days = DATEDIFF(dd,@LoDT,@HiDT)-@MaxSpan
    ;
    --===== Create the test table (added a column just to see)
         IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
       DROP TABLE #TestData
    ;
     CREATE TABLE #TestData
            (
             RN     INT         NOT NULL
            ,beg_dt DATETIME    NOT NULL
            ,end_dt DATETIME    NOT NULL
            ,Span   AS CONVERT(FLOAT,end_dt-beg_dt) PERSISTED
             PRIMARY KEY CLUSTERED (beg_dt, end_dt)
                WITH (IGNORE_DUP_KEY = ON)
            )
    ;
    --===== Use "Minimal Logging" to populate the table with random "begin" dates
         -- and a random span to create the random "end" dates.
       WITH
    cteGenDate AS
    (
     SELECT TOP (@Rows)
            beg_dt = RAND(CHECKSUM(NEWID()))*@Days+@LoDT --Gotta love direct date math.
       FROM      sys.all_columns ac1
      CROSS JOIN sys.all_columns ac2
    )
     INSERT INTO #TestData WITH(TABLOCK) --For Minimal Logging
            (RN, beg_dt, end_dt)   
     SELECT  RN     = ROW_NUMBER() OVER (ORDER BY beg_dt)
            ,beg_dt
            ,end_dt = RAND(CHECKSUM(NEWID()))*@MaxSpan+beg_dt
       FROM cteGenDate
      ORDER BY beg_dt, end_dt   --For Minimal Logging with Clustered Index
     OPTION (RECOMPILE)         --For Minimal Logging
    ;

    Now this has some interesting stuff going on... You definitely took a more measured approach than I did.
    I just picked a date close to middle on the Calendar table, added two "CHECKSUM(NEWID()) % 999999" and simply added or subtracted 9's until it produced legitimate looking ranges.
    This... "beg_dt = RAND(CHECKSUM(NEWID()))*@Days+@LoDT  " ...  This took me a few minutes...I don't ever use RAND() and either didn't know or completely forgot that generates a float <= 1.
    For a second I thought you were using voodoo to constrain the output range... I'll have to spend some time using it to see if I like it as much as the % method (thank you for that one too BTW)
    Normally if I need "believable" test data I usually do something like this...
    SELECT TOP 1000
        ROW_NUMBER() OVER (ORDER BY c.object_id),
        DATEADD(DAY, ld.low_day, '20100101'),
        DATEADD(DAY, hd.high_day, '20100101')
    FROM
        sys.columns c    --< this is a lie... IRL it would be a tally...
        CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % 1460) ) ld (low_day)                    -- 0-4 years
        CROSS APPLY ( VALUES (ABS(CHECKSUM(NEWID())) % (1825-ld.low_day) ) ) hd (high_day);    -- low-day-5 years

    You don't have to sell me on the direct date math or the INT math... there have been a few occasions that the INT math had me feeling like a 4 yr old at a magic show.
    Sadly, for whatever reason, they aren't solutions that usually hit me right off the bat.

    While on the topic of cool tricks, In the other test harness... I noticed you have a  CHECKPOINT... 
    It makes me suspect that you're grabbing test results from the transaction logs... If so, what's your verdict?
    I've been kicking around the idea of using extended events to capture test results, hopefully with less "observer effect" than plan capture & SET STATISTICS IO,TIME ON;
    https://docs.microsoft.com/en-us/sql/relational-databases/extended-events/quick-start-extended-events-in-sql-server

    Once again, thank you!