Aggregate data with convert rows into column

  • I have following sample table for reference.

    CREATE TABLE [dbo].[ProcessLog](

    [LogId] [int] IDENTITY(1,1) NOT NULL,

    [TableId] [int] NOT NULL,

    [LogDate] [datetime2](7) NOT NULL,

    [LogStatusId] [tinyint] NOT NULL,

    [RowCnt] [int] NULL

    CONSTRAINT [PK_ProcessLog] PRIMARY KEY CLUSTERED

    (

    [LogId] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    INSERT INTO [dbo].[ProcessLog] ([TableId],[LogDate],[LogStatusId],[RowCnt])

    select 1, '2019-05-16 11:48:26.4400000', 1, 0 union

    select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union

    select 2, '2019-05-16 11:48:28.2400000', 1, 0 union

    select 2, '2019-05-16 11:48:32.4733333', 2, 358422 union

    GO

    I want to this generate the report in two rows like

    TableId  -   difference between two rows dates with logStatusId 1 and 2   - RowCount

    1 - datediff(row1 and row2 group by tableId(1)) - 5875

    2- datediff(row3 and row4 group by table(2)) - 358422

     

  • So just a couple tips for the next time you post.  Make sure you put any DDL statements in a code block so its easier to read.  Most people don't want to to create tables in their own environments so it's usually best to create statements using temp tables or table variables.

    So something like this:

    DECLARE @ProcessLog TABLE
    (
    [LogId] INT IDENTITY(1, 1) NOT NULL,
    [TableId] INT NOT NULL,
    [LogDate] DATETIME2(7) NOT NULL,
    [LogStatusId] TINYINT NOT NULL,
    [RowCnt] INT NULL
    );

    INSERT INTO @ProcessLog
    (
    TableId,
    LogDate,
    LogStatusId,
    RowCnt
    )
    select 1, '2019-05-16 11:48:26.4400000', 1, 0 union
    select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union
    select 2, '2019-05-16 11:48:28.2400000', 1, 0 union
    select 2, '2019-05-16 11:48:32.4733333', 2, 358422;

    SELECT * FROM @ProcessLog

    Lastly, sample results really help people understand exactly what you are looking for.  Since you didn't I took a guess at what you meant.  You mentioned aggregating so I'm assuming maybe you wanted to sum the rowcnt?  As for the datediff...I don't know how many possible log statuses or values are possible.  Therefore, I provided an example of getting the difference between the earliest date and the latest date within a single tableId.  I suspect this is not quite what you're looking for but again I'm just taking a guess and trying to give you some ideas/tips while I'm at it.

    SELECT DISTINCT
    TableId,
    SUM(RowCnt) OVER (PARTITION BY TableId) AS RowCntTotal,
    DATEDIFF(SECOND,
    FIRST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
    LAST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
    ) AS DiffSeconds

    FROM
    @ProcessLog

    • This reply was modified 4 years, 11 months ago by  Y.B.. Reason: Small edit to code
    • This reply was modified 4 years, 11 months ago by  Y.B..
    • This reply was modified 4 years, 11 months ago by  Y.B.. Reason: Updated Code


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • another version assuming that you need diff in ms , there are two records per table (status 1 and 2)  and the amount of records is needed of the second record:

     


    WITH data
    AS (
    SELECT TableId
    ,datediff(ms, LogDate, LEAD(LogDate, 1, NULL) OVER (PARTITION BY TableId ORDER BY LogDate,LogStatusId)) [datediff_ms]
    ,LEAD(RowCnt, 1, RowCnt) OVER (PARTITION BY TableId ORDER BY LogDate,LogStatusId) [RowCount]
    FROM ProcessLog
    WHERE LogStatusId IN (1,2)
    )
    SELECT *
    FROM [data]
    WHERE [datediff_ms] IS NOT NULL

     

     

  • Y.B. wrote:

    So just a couple tips for the next time you post.  Make sure you put any DDL statements in a code block so its easier to read.  Most people don't want to to create tables in their own environments so it's usually best to create statements using temp tables or table variables. So something like this:

    DECLARE @ProcessLog TABLE
    (
    [LogId] INT IDENTITY(1, 1) NOT NULL,
    [TableId] INT NOT NULL,
    [LogDate] DATETIME2(7) NOT NULL,
    [LogStatusId] TINYINT NOT NULL,
    [RowCnt] INT NULL
    );

    INSERT INTO @ProcessLog
    (
    TableId,
    LogDate,
    LogStatusId,
    RowCnt
    )
    select 1, '2019-05-16 11:48:26.4400000', 1, 0 union
    select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union
    select 2, '2019-05-16 11:48:28.2400000', 1, 0 union
    select 2, '2019-05-16 11:48:32.4733333', 2, 358422;

    SELECT * FROM @ProcessLog

    Lastly, sample results really help people understand exactly what you are looking for.  Since you didn't I took a guess at what you meant.  You mentioned aggregating so I'm assuming maybe you wanted to sum the rowcnt?  As for the datediff...I don't know how many possible log statuses or values are possible.  Therefore, I provided an example of getting the difference between the earliest date and the latest date within a single tableId.  I suspect this is not quite what you're looking for but again I'm just taking a guess and trying to give you some ideas/tips while I'm at it.

    SELECT DISTINCT
    TableId,
    SUM(RowCnt) OVER (PARTITION BY TableId) AS RowCntTotal,
    DATEDIFF(SECOND,
    FIRST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ),
    LAST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
    ) AS DiffSeconds

    FROM
    @ProcessLog

    You are better off using ROWS rather than RANGE.  I don't understand the complexities of it, but the upshot is that ROWS will never write to disk while RANGE will always write to disk.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks for the tip Drew, I've made the change to the code.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Y.B. wrote:

    Thanks for the tip Drew, I've made the change to the code.

    I see you updated the LAST_VALUE(), but not the FIRST_VALUE().  The default isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen wrote:

    Y.B. wrote:

    Thanks for the tip Drew, I've made the change to the code.

    I see you updated the LAST_VALUE(), but not the FIRST_VALUE().  The default isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Drew

    Done


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply