• berzat.museski (6/13/2011)


    No need for uncle Google in this case, spt_values is quite enough, unless you are querying some large data warehouse.

    However, I agree that GetNums is a cool trick to generate large order numbers.

    I was thinking of the following solution:

    I first add one computed column in the fact table (I assume that it is ok to do that) in order to keep the query SARGable. The new column will have the sales datetime rounded up to hour percision.

    ALTER TABLE #SalesTestData

    ADD SaleTimeHour AS CONVERT(DATETIME, CONVERT(VARCHAR,SaleTime,112)+' '+CAST(DATEPART(HOUR,SaleTime) AS VARCHAR(2))+':00:00.000',112) PERSISTED

    The report will then be formed by this single query:

    DECLARE @StartDate DATETIME, @EndDate DATETIME

    SET @StartDate='2011-01-01 00:00:00.000'

    SET @EndDate='2011-12-01 00:00:00.000'

    SELECT CAST(AllDates.ReportDayHour AS DATE) AS ReportDay, CAST(AllDates.ReportDayHour AS TIME) AS ReportHour,

    ISNULL(SUM(s.SaleQty),0) AS TotalSaleQty

    FROM (

    SELECT DATEADD(hh,h.number,DATEADD(dd,d.number,DATEADD(mm,m.number,@StartDate))) AS ReportDayHour

    FROM master..spt_values m

    CROSS JOIN master..spt_values d

    CROSS JOIN master..spt_values h

    WHEREm.type='p' AND m.number BETWEEN 0 AND DATEDIFF(mm,@StartDate,@EndDate)

    AND d.type='p' AND d.number BETWEEN 0 AND 30

    AND h.type='p' AND h.number BETWEEN 0 AND 23

    ) AS AllDates

    LEFT JOIN #SalesTestData s ON s.SaleTimeHour=AllDates.ReportDayHour

    GROUP BY AllDates.ReportDayHour

    ORDER BY AllDates.ReportDayHour

    The parameters are the start and end moments of the reports, as any manager would want 🙂

    Hi, Berzat,

    That's a very clever solution but, comparatively speaking, it's a bit slow compared to the methods offered in the article. After adding your computed column to the test table (as you did before), run the following code and see the difference in the time it takes...

    --===== Identify the run

    SET STATISTICS TIME OFF;

    PRINT REPLICATE('=',80);

    PRINT 'Method similar to article';

    PRINT REPLICATE('=',80);

    SET STATISTICS TIME ON;

    --===== Declare some obviously named variables. The ones with the @p

    -- prefix could be parameters in a stored procedure or function

    DECLARE @pStartMonth DATETIME,

    @pEndMonth DATETIME

    ;

    --===== Assign values to the input parameters

    SELECT @pStartMonth = '2011-01-15',

    @pEndMonth = 'Dec 2011'

    ;

    --===== "Normalize" the dates

    SELECT @pStartMonth = DATEADD(mm,DATEDIFF(mm,0,@pStartMonth),0),

    @pEndMonth = DATEADD(mm,DATEDIFF(mm,0,@pEndMonth)+1,0)

    ;

    --===== Create and populate the "hours" table

    DECLARE @Hours TABLE (HourStart DATETIME, HourNext DATETIME)

    ;

    INSERT INTO @Hours

    (HourStart, HourNext)

    SELECT TOP (DATEDIFF(hh, @pStartMonth, @pEndMonth))

    HourStart = DATEADD(hh,ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1,@pStartMonth),

    HourNext = DATEADD(hh,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) ,@pStartMonth)

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    ;

    SELECT ReportDay = CONVERT(CHAR(10),MAX(bin.HourStart),120),

    ReportHour = DATEPART(hh,MAX(bin.HourStart)),

    TotalSales = SUM(ISNULL(sale.SaleQty,0))

    FROM @Hours bin

    LEFT OUTER JOIN #SalesTestData sale

    ON sale.SaleTime >= bin.HourStart

    AND sale.SaleTime < bin.HourNext

    GROUP BY bin.HourStart

    ORDER BY bin.HourStart

    ;

    GO

    -------------------------------------------------------------------------------

    --===== Identify the run

    SET STATISTICS TIME OFF;

    PRINT REPLICATE('=',80);

    PRINT 'Berzat''s method';

    PRINT REPLICATE('=',80);

    SET STATISTICS TIME ON;

    DECLARE @StartDate DATETIME, @EndDate DATETIME

    SET @StartDate='2011-01-01 00:00:00.000'

    SET @EndDate='2011-12-01 00:00:00.000'

    SELECT CAST(AllDates.ReportDayHour AS DATE) AS ReportDay, CAST(AllDates.ReportDayHour AS TIME) AS ReportHour,

    ISNULL(SUM(s.SaleQty),0) AS TotalSaleQty

    FROM (

    SELECT DATEADD(hh,h.number,DATEADD(dd,d.number,DATEADD(mm,m.number,@StartDate))) AS ReportDayHour

    FROM master..spt_values m

    CROSS JOIN master..spt_values d

    CROSS JOIN master..spt_values h

    WHERE m.type='p' AND m.number BETWEEN 0 AND DATEDIFF(mm,@StartDate,@EndDate)

    AND d.type='p' AND d.number BETWEEN 0 AND 30

    AND h.type='p' AND h.number BETWEEN 0 AND 23

    ) AS AllDates

    LEFT JOIN #SalesTestData s ON s.SaleTimeHour=AllDates.ReportDayHour

    GROUP BY AllDates.ReportDayHour

    ORDER BY AllDates.ReportDayHour

    ;

    SET STATISTICS TIME OFF;

    Here's the output on my ol' 9 year old, single CPU box. The method from the article is more than 12 times faster.

    ================================================================================

    Method similar to article

    ================================================================================

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 106 ms.

    (8760 row(s) affected)

    (8760 row(s) affected)

    SQL Server Execution Times:

    [font="Arial Black"] CPU time = 390 ms, elapsed time = 731 ms.[/font]

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 1 ms.

    ================================================================================

    Berzat's method

    ================================================================================

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.

    (8760 row(s) affected)

    SQL Server Execution Times:

    [font="Arial Black"] CPU time = 9328 ms, elapsed time = 12354 ms.[/font]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)