Aggregate data with convert rows into column

Question

Aggregate data with convert rows into column

shamshad.ali

Hall of Fame

Points: 3850
More actions
May 16, 2019 at 11:18 am

#3644147

I have following sample table for reference.
CREATE TABLE [dbo].[ProcessLog](
[LogId] [int] IDENTITY(1,1) NOT NULL,
[TableId] [int] NOT NULL,
[LogDate] [datetime2](7) NOT NULL,
[LogStatusId] [tinyint] NOT NULL,
[RowCnt] [int] NULL
CONSTRAINT [PK_ProcessLog] PRIMARY KEY CLUSTERED
(
[LogId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[ProcessLog] ([TableId],[LogDate],[LogStatusId],[RowCnt])
select 1, '2019-05-16 11:48:26.4400000', 1, 0 union
select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union
select 2, '2019-05-16 11:48:28.2400000', 1, 0 union
select 2, '2019-05-16 11:48:32.4733333', 2, 358422 union
GO
I want to this generate the report in two rows like
TableId - difference between two rows dates with logStatusId 1 and 2 - RowCount
1 - datediff(row1 and row2 group by tableId(1)) - 5875
2- datediff(row3 and row4 group by table(2)) - 358422

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

Y.B. SSChampion Points: 11652 More actions · Answer 1

So just a couple tips for the next time you post. Make sure you put any DDL statements in a code block so its easier to read. Most people don't want to to create tables in their own environments so it's usually best to create statements using temp tables or table variables.

So something like this:

DECLARE @ProcessLog TABLE
(
    [LogId] INT IDENTITY(1, 1) NOT NULL,
    [TableId] INT NOT NULL,
    [LogDate] DATETIME2(7) NOT NULL,
    [LogStatusId] TINYINT NOT NULL,
    [RowCnt] INT NULL
);

INSERT INTO @ProcessLog
(
    TableId,
    LogDate,
    LogStatusId,
    RowCnt
)
select 1, '2019-05-16 11:48:26.4400000', 1, 0 union
select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union
select 2, '2019-05-16 11:48:28.2400000', 1, 0 union
select 2, '2019-05-16 11:48:32.4733333', 2, 358422;

SELECT * FROM @ProcessLog

Lastly, sample results really help people understand exactly what you are looking for. Since you didn't I took a guess at what you meant. You mentioned aggregating so I'm assuming maybe you wanted to sum the rowcnt? As for the datediff...I don't know how many possible log statuses or values are possible. Therefore, I provided an example of getting the difference between the earliest date and the latest date within a single tableId. I suspect this is not quite what you're looking for but again I'm just taking a guess and trying to give you some ideas/tips while I'm at it.

SELECT DISTINCT
TableId,
SUM(RowCnt) OVER (PARTITION BY TableId) AS RowCntTotal,
DATEDIFF(SECOND, 
FIRST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
LAST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
) AS DiffSeconds

FROM
@ProcessLog

This reply was modified 6 years, 9 months ago by Y.B.. Reason: Small edit to code
This reply was modified 6 years, 9 months ago by Y.B..
This reply was modified 6 years, 9 months ago by Y.B.. Reason: Updated Code

SELECT quote FROM brain WHERE original = 1
0 rows returned

Andrey SSC Eights! Points: 840 More actions · Answer 2

another version assuming that you need diff in ms , there are two records per table (status 1 and 2) and the amount of records is needed of the second record:


WITH data
AS (
SELECT TableId
,datediff(ms, LogDate, LEAD(LogDate, 1, NULL) OVER (PARTITION BY TableId ORDER BY LogDate,LogStatusId)) [datediff_ms]
,LEAD(RowCnt, 1, RowCnt) OVER (PARTITION BY TableId ORDER BY LogDate,LogStatusId) [RowCount]
FROM ProcessLog
WHERE LogStatusId IN (1,2)
)
SELECT *
FROM [data]
WHERE [datediff_ms] IS NOT NULL

drew.allen SSC Guru Points: 77014 More actions · Answer 3

Y.B. wrote:

So just a couple tips for the next time you post. Make sure you put any DDL statements in a code block so its easier to read. Most people don't want to to create tables in their own environments so it's usually best to create statements using temp tables or table variables. So something like this:
DECLARE @ProcessLog TABLE
(
    [LogId] INT IDENTITY(1, 1) NOT NULL,
    [TableId] INT NOT NULL,
    [LogDate] DATETIME2(7) NOT NULL,
    [LogStatusId] TINYINT NOT NULL,
    [RowCnt] INT NULL
);

INSERT INTO @ProcessLog
(
    TableId,
    LogDate,
    LogStatusId,
    RowCnt
)
select 1, '2019-05-16 11:48:26.4400000', 1, 0 union
select 1, '2019-05-16 11:48:28.1900000', 2, 5875 union
select 2, '2019-05-16 11:48:28.2400000', 1, 0 union
select 2, '2019-05-16 11:48:32.4733333', 2, 358422;

SELECT * FROM @ProcessLog
Lastly, sample results really help people understand exactly what you are looking for. Since you didn't I took a guess at what you meant. You mentioned aggregating so I'm assuming maybe you wanted to sum the rowcnt? As for the datediff...I don't know how many possible log statuses or values are possible. Therefore, I provided an example of getting the difference between the earliest date and the latest date within a single tableId. I suspect this is not quite what you're looking for but again I'm just taking a guess and trying to give you some ideas/tips while I'm at it.
SELECT DISTINCT
TableId,
SUM(RowCnt) OVER (PARTITION BY TableId) AS RowCntTotal,
DATEDIFF(SECOND, 
FIRST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate ),
LAST_VALUE(LogDate) OVER (PARTITION BY TableId ORDER BY LogDate RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
) AS DiffSeconds

FROM
@ProcessLog

You are better off using ROWS rather than RANGE. I don't understand the complexities of it, but the upshot is that ROWS will never write to disk while RANGE will always write to disk.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Y.B. SSChampion Points: 11652 More actions · Answer 4

Thanks for the tip Drew, I've made the change to the code.

SELECT quote FROM brain WHERE original = 1
0 rows returned

drew.allen SSC Guru Points: 77014 More actions · Answer 5

Y.B. wrote:

Thanks for the tip Drew, I've made the change to the code.

I see you updated the LAST_VALUE(), but not the FIRST_VALUE(). The default isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Drew

J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA

Y.B. SSChampion Points: 11652 More actions · Answer 6

drew.allen wrote:

Y.B. wrote:
Thanks for the tip Drew, I've made the change to the code.
I see you updated the LAST_VALUE(), but not the FIRST_VALUE(). The default isRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Drew

Done

SELECT quote FROM brain WHERE original = 1
0 rows returned