Display differences between cumulative values

  • I have been gathering IO stall data every 15 minutes for several months now.

    I am wanting to use the data in a report to display the average IO stalls for each hour, for each day of the week, for a specified period.

    So far I can only do this by week number, which is too broad, but demonstrates what I'm trying to achieve.

    SELECTS1.DriveLetter,

    (S2.AvgReadStalls - S1.AvgReadStalls) AvgReadStalls,

    (S2.AvgWriteStalls - S1.AvgWriteStalls) AvgWriteStalls,

    (S2.AvgStalls - S1.AvgStalls) AvgStalls

    FROM(

    SELECTLEFT(Physical_name, 1) DriveLetter,

    AVG(avg_read_stall_ms) AvgReadStalls,

    AVG(avg_write_stall_ms) AvgWriteStalls,

    AVG(avg_io_stall_ms) AvgStalls,

    DATEPART(WK, SampleTime) WeekNr

    FROMStallsHistory

    WHERESampleTime >= '2014-01-01'

    GROUPBY LEFT(Physical_name, 1),

    DATEPART(WK, SampleTime)

    ) S1

    INNER JOIN

    (

    SELECTLEFT(Physical_name, 1) DriveLetter,

    AVG(avg_read_stall_ms) AvgReadStalls,

    AVG(avg_write_stall_ms) AvgWriteStalls,

    AVG(avg_io_stall_ms) AvgStalls,

    DATEPART(WK, SampleTime) WeekNr

    FROMStallsHistory

    WHERESampleTime >= '2014-01-01'

    GROUPBY LEFT(Physical_name, 1),

    DATEPART(WK, SampleTime)

    ) S2

    ONS1.DriveLetter = S2.DriveLetter

    AND S1.WeekNr = (S2.WeekNr - 1)

    ORDERBY S1.DriveLetter, S1.WeekNr

    -- I want to replace

    DATEPART(WK, SampleTime) WeekNr

    -- with

    DATEPART(DW, SampleTime) DayNr,

    DATEPART(HH, SampleTime) HourNr

    I think my approach is all wrong, but can't think of the required technique(s). (Without using temporary tables.)

    At the moment I am just looking for the different techniques I can use so that I can research and play around with them.

    Would be nice to know my options for SQL 2005, 2008.

    If I had 2012 I would use LEAD or LAG

    If really needed, I'll post same sample data.

    Cheers



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

  • I believe this might get you started.

    WITH stalls

    AS ( SELECT LEFT(Physical_name, 1) DriveLetter ,

    avg_read_stall_ms AvgReadStalls ,

    avg_write_stall_ms AvgWriteStalls ,

    avg_io_stall_ms AvgStalls ,

    DATEPART(WK, SampleTime) WeekNr ,

    DATEPART(DAY, SampleTime) DayNr ,

    DATEPART(HOUR, SampleTime) HourNr

    FROM StallsHistory

    WHERE SampleTime >= '2014-01-01'

    )

    SELECT DriveLetter ,

    AVG(AvgReadStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgReadStalls ,

    AVG(AvgReadStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgReadStalls ,

    AVG(AvgReadStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgReadStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgWriteStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgWriteStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgWriteStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgStalls ,

    WeekNr ,

    DayNr ,

    HourNr

    FROM stalls AS S

  • Thanks for your response Jack.

    Your query works fine, but does not return the results i am looking for.

    I'm want to see the difference between each row record. i.e. Row2value - Row1Value.

    I pumped some sample data into a new table in 2012 Express and can produce the wanted results with LAG().

    SELECTDriveLetter Drive,

    AvgStalls,

    AvgStalls - (LAG(AvgStalls, 1, AvgStalls) OVER (PARTITION BY DriveLetter ORDER BY DriveLetter)) CumLagDif,

    DATEPART(DW, DateVal) DowNr,

    DayName,

    HourNr

    FROMIOStalls

    Which returns:

    DriveAvgStallsCumLagDifDowNrDayNameHourNr

    F7.7090900.0000003Thu13

    F7.7090900.0000003Thu14

    F7.7090900.0000003Thu15

    F7.704545-0.0045453Thu16

    F7.700000-0.0045453Thu17

    F7.7000000.0000003Thu18

    F7.7000000.0000003Thu19

    I can then average the column CumLagDif to see how IO is doing during each hour of each day of the week.



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

  • I added an Identity(1,1) to the StallsHistory table and made it the cluster PK.

    The join now uses S1.SurroID = (S2.SurroID - 1) AND S1.Drive = S2.Drive

    This looks pretty close to what I'm wanting. Not 100% sure though.

    It think this will skip the first row of each Drive chunk of rows.

    SET DATEFIRST 1

    SELECTS1.Drive,

    AVG(S2.AvgReadStalls - S1.AvgReadStalls) AvgReadStalls,

    AVG(S2.AvgWriteStalls - S1.AvgWriteStalls) AvgWriteStalls,

    AVG(S2.AvgStalls - S1.AvgStalls) AvgStalls,

    S1.WeekNr,

    S1.DayNr,

    S1.HourNr

    FROM(

    SELECTSurroID,

    LEFT(Physical_name, 1) Drive,

    avg_read_stall_ms AvgReadStalls,

    avg_write_stall_ms AvgWriteStalls,

    avg_io_stall_ms AvgStalls,

    DATEPART(WK, SampleTime) WeekNr,

    DATEPART(DW, SampleTime) DayNr,

    DATEPART(HH, SampleTime) HourNr

    FROMStallsHistory

    WHERESampleTime >= '2014-01-01'

    ) S1

    INNER JOIN

    (

    SELECTSurroID,

    LEFT(Physical_name, 1) Drive,

    avg_read_stall_ms AvgReadStalls,

    avg_write_stall_ms AvgWriteStalls,

    avg_io_stall_ms AvgStalls

    FROMStallsHistory

    WHERESampleTime >= '2014-01-01'

    ) S2

    ON S1.Drive = S2.Drive

    AND S1.SurroID = (S2.SurroID - 1)

    GROUPBY S1.Drive,

    S1.WeekNr,

    S1.DayNr,

    S1.HourNr



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

  • Adapting my query to do what you want isn't that difficult. Something like this will do it (I'm only showing weekly, but you could do all of them either with multiple CTE's UNION'd at the end or in one query with a more complex JOIN)

    WITH stalls

    AS ( SELECT LEFT(Physical_name, 1) DriveLetter ,

    avg_read_stall_ms AvgReadStalls ,

    avg_write_stall_ms AvgWriteStalls ,

    avg_io_stall_ms AvgStalls ,

    DATEPART(WK, SampleTime) WeekNr ,

    DATEPART(DAY, SampleTime) DayNr ,

    DATEPART(HOUR, SampleTime) HourNr

    FROM StallsHistory

    WHERE SampleTime >= '2014-01-01'

    ),

    stallsOrder

    AS ( SELECT DriveLetter ,

    AVG(AvgReadStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgReadStalls ,

    AVG(AvgReadStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgReadStalls ,

    AVG(AvgReadStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgReadStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgWriteStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgWriteStalls ,

    AVG(AvgWriteStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgWriteStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY WeekNr ) AS WeekLyAvgStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY DayNr ) AS DailyAvgStalls ,

    AVG(AvgStalls) OVER ( PARTITION BY DayNr, HourNr ) AS HourlyAvgStalls ,

    ROW_NUMBER() OVER ( PARTITION BY WeekNr ORDER BY WeekNr ) AS WeekRow ,

    ROW_NUMBER() OVER ( PARTITION BY DayNr ORDER BY DayNr ) AS DayRow ,

    ROW_NUMBER() OVER ( PARTITION BY HourNr ORDER BY HourNr ) AS HourRow ,

    WeekNr ,

    DayNr ,

    HourNr

    FROM stalls AS S

    )

    SELECT A.DriveLetter ,

    A.WeekNr ,

    A.WeekLyAvgReadStalls ,

    B.WeekLyAvgReadStalls ,

    A.WeekLyAvgReadStalls - B.WeekLyAvgReadStalls AS readDIff ,

    A.WeekLyAvgWriteStalls ,

    B.WeekLyAvgWriteStalls ,

    A.WeekLyAvgWriteStalls - B.WeekLyAvgWriteStalls AS WriteDiff ,

    A.WeekLyAvgStalls ,

    B.WeekLyAvgStalls ,

    A.WeekLyAvgStalls - B.WeekLyAvgStalls AS totalDiff

    FROM stallsOrder AS A

    LEFT JOIN stallsOrder AS B ON A.DriveLetter = B.DriveLetter

    AND A.WeekRow = B.WeekRow - 1

  • The result of each the queries differ.

    Here is some sample data. 10 rows to easily check the results.

    USE TempDB

    GO

    IF OBJECT_ID('#CumDiffs') IS NOT NULL

    DROP TABLE #CumDiffs

    CREATE TABLE #CumDiffs

    (

    IDSmallInt IDENTITY,

    SampleDateTImeSmallDateTime,

    DriveLetterChar(1),

    SampleValueSmallInt

    )

    INSERT#CumDiffs

    (SampleDateTIme, DriveLetter, SampleValue)

    SELECTTOP 10 CAST(DATEADD(Mi, ROW_NUMBER() OVER (ORDER BY c1.[object_id]), GETDATE()) AS SmallDateTime),

    'D',--UPPER(CHAR(ABS(CHECKSUM(NEWID())) % 3 + 100)), -- This generates random letters D, E, F

    100 + (ABS(CHECKSUM(NEWID())) % 100 * CASE ABS(CHECKSUM(NEWID())) % 3 WHEN 0 THEN -1 ELSE 1 END)

    FROMsys.all_columns c1,

    sys.all_columns c2

    SQL 2012 code that gives the correct results.

    SELECTTOP 10

    DriveLetter, WeekNr, DoWNr, HourNr, AVG(ValDiff) AvgValDiff

    FROM(

    SELECTDriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    SampleValue Val_A,

    1.0*LAG(SampleValue, 1, SampleValue) OVER (PARTITION BY DriveLetter ORDER BY DriveLetter) val_B,

    1.0*(LAG(SampleValue, 1, SampleValue) OVER (PARTITION BY DriveLetter ORDER BY DriveLetter)) - SampleValue ValDiff

    FROM#CumDiffs

    ) Vals

    GROUPBY DriveLetter, WeekNr, DoWNr, HourNr

    ORDERBY DriveLetter, WeekNr, DoWNr, HourNr

    My inital 2005 code.

    SELECTTOP 10 S1.DriveLetter,

    S1.WeekNr,

    S1.DoWNr,

    S1.HourNr,

    AVG(S1.SampleValue) Val_A,

    AVG(S2.SampleValue) Val_B,

    AVG((S2.SampleValue) - (S1.SampleValue)) ValDiff

    FROM(

    SELECTID,

    DriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    1.0*SampleValue SampleValue

    FROM#CumDiffs

    ) S1

    LEFT JOIN

    (

    SELECTID,

    DriveLetter,

    1.0*SampleValue SampleValue

    FROM#CumDiffs

    ) S2

    ON S1.DriveLetter = S2.DriveLetter

    AND S1.ID = S2.ID - 1

    GROUPBY S1.DriveLetter, S1.WeekNr, S1.DoWNr, S1.HourNr

    ORDERBY DriveLetter, WeekNr, DoWNr, HourNr

    Your code. Differs alot from the previous 2 queries.

    ;WITH

    Vals AS

    (

    SELECTDriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    SampleValue

    FROM#CumDiffs

    ),

    ValOrder AS

    (

    SELECTDriveLetter,

    AVG(SampleValue) OVER (PARTITION BY WeekNr) WeekAvgVal,

    AVG(SampleValue) OVER (PARTITION BY DoWNr) DoWAvgVal,

    AVG(SampleValue) OVER (PARTITION BY DoWNr, HourNr) HourAvgVal,

    ROW_NUMBER() OVER (PARTITION BY WeekNr ORDER BY WeekNr) WeekRow,

    ROW_NUMBER() OVER (PARTITION BY DoWNr ORDER BY DoWNr) DoWRow,

    ROW_NUMBER() OVER (PARTITION BY DowNr, HourNr ORDER BY HourNr) HourRow,

    WeekNr,

    DoWNr,

    HourNr

    FROMVals V

    )

    SELECTTOP 10

    S1.DriveLetter,

    S1.WeekNr,

    S1.DoWNr,

    S1.HourNr,

    S1.HourAvgVal Val_A,

    S2.HourAvgVal Val_B,

    S2.HourAvgVal - S1.HourAvgVal ValDiff

    FROMValOrder S1

    LEFT JOIN ValOrder S2

    ON S1.DriveLetter = S2.DriveLetter

    AND S1.HourRow = S2.HourRow - 1

    ORDERBY DriveLetter, WeekNr, DoWNr, HourNr



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

  • Bingo!

    Here's the full code. (Sample data, 2012 query, 2005 sub queries, 2005 cte).

    Is this also possible using some advanced APPLY tricks?

    USE TempDB

    GO

    IF OBJECT_ID('#CumDiffs') IS NOT NULL

    DROP TABLE #CumDiffs

    CREATE TABLE #CumDiffs

    (

    IDSmallInt IDENTITY,

    SampleDateTImeSmallDateTime,

    DriveLetterChar(1),

    SampleValueSmallInt

    )

    INSERT#CumDiffs

    (SampleDateTIme, DriveLetter, SampleValue)

    SELECTTOP 10000 CAST(DATEADD(Mi, ROW_NUMBER() OVER (ORDER BY c1.[object_id]), GETDATE()) AS SmallDateTime),

    'D',--UPPER(CHAR(ABS(CHECKSUM(NEWID())) % 3 + 100)), -- This generates random letters D, E, F

    100 + (ABS(CHECKSUM(NEWID())) % 100 * CASE ABS(CHECKSUM(NEWID())) % 3 WHEN 0 THEN -1 ELSE 1 END)

    FROMsys.all_columns c1,

    sys.all_columns c2

    UNION ALL

    SELECTTOP 10000 CAST(DATEADD(Mi, ROW_NUMBER() OVER (ORDER BY c1.[object_id]), GETDATE()) AS SmallDateTime),

    'E',

    10 + ((CHECKSUM(NEWID())) % 30 * CASE ABS(CHECKSUM(NEWID())) % 3 WHEN 0 THEN -1 ELSE 1 END)

    FROMsys.all_columns c1,

    sys.all_columns c2

    UNION ALL

    SELECTTOP 10000 CAST(DATEADD(Mi, ROW_NUMBER() OVER (ORDER BY c1.[object_id]), GETDATE()) AS SmallDateTime),

    'F',

    150 + (ABS(CHECKSUM(NEWID())) % 500)

    FROMsys.all_columns c1,

    sys.all_columns c2

    -- SQL 2012+

    SELECTDriveLetter, WeekNr, DoWNr, HourNr, AVG(ValDiff) AvgValDiff

    FROM(

    SELECTDriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    SampleValue Val_A,

    1.0*LAG(SampleValue, 1, SampleValue) OVER (PARTITION BY DriveLetter ORDER BY DriveLetter) val_B,

    1.0*(LAG(SampleValue, 1, SampleValue) OVER (PARTITION BY DriveLetter ORDER BY DriveLetter)) - SampleValue ValDiff

    FROM#CumDiffs

    ) Vals

    GROUPBY DriveLetter, WeekNr, DoWNr, HourNr

    ORDERBY DriveLetter, WeekNr, DoWNr, HourNr

    --SQL 2005+

    SELECTA.DriveLetter,

    A.WeekNr,

    A.DoWNr,

    A.HourNr,

    AVG(ISNULL(B.SampleValue, A.SampleValue) - A.SampleValue) ValDiff

    FROM(

    SELECTDriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    1.0*SampleValue SampleValue,

    ROW_NUMBER() OVER (PARTITION BY DriveLetter ORDER BY SampleDateTime) RowNr

    FROM#CumDiffs A

    ) A

    LEFT JOIN

    (

    SELECTDriveLetter,

    SampleValue,

    ROW_NUMBER() OVER (PARTITION BY DriveLetter ORDER BY SampleDateTime) RowNr

    FROM#CumDiffs

    ) B

    ONA.DriveLetter = B.DriveLetter

    AND A.RowNr = B.RowNr + 1

    GROUPBY A.DriveLetter, A.WeekNr, A.DoWNr, A.HourNr

    ORDERBY A.DriveLetter, A.WeekNr, A.DoWNr, A.HourNr

    --SQL 2005+

    ;WITH

    Vals AS

    (

    SELECTDriveLetter,

    DATEPART(WEEK, SampleDateTime) WeekNr,

    DATEPART(WEEKDAY, SampleDateTime) DoWNr,

    DATEPART(HOUR, SampleDateTime) HourNr,

    SampleValue,

    SampleDateTime

    FROM#CumDiffs

    ),

    ValOrder AS

    (

    SELECTDriveLetter,

    SampleValue,

    ROW_NUMBER() OVER (PARTITION BY DriveLetter ORDER BY SampleDateTime) RowNr,

    WeekNr,

    DoWNr,

    HourNr

    FROMVals V

    )

    SELECTS1.DriveLetter,

    S1.WeekNr,

    S1.DoWNr,

    S1.HourNr,

    AVG(ISNULL(1.0*S2.SampleValue, 1.0*S1.SampleValue) - 1.0*S1.SampleValue) ValDiff

    FROMValOrder S1

    LEFT JOIN ValOrder S2

    ON S1.DriveLetter = S2.DriveLetter

    AND S1.RowNr = S2.RowNr + 1

    GROUPBY S1.DriveLetter, S1.WeekNr, S1.DoWNr, S1.HourNr

    ORDERBY DriveLetter, WeekNr, DoWNr, HourNr



    For better, quicker answers on T-SQL questions, read Jeff Moden's suggestions.[/url]

    "Million-to-one chances crop up nine times out of ten." ― Terry Pratchett, Mort

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply