Row_Number, partion and order by

  • chris.stuart

    SSCarpal Tunnel

    Points: 4965

    Hi all

    I have the following SQL

    SELECT 
    Account,
    LogDateTime,
    RowId,
    Cell,
    lag(LogDateTime) over (order by LogDateTime),
    ROW_NUMBER() OVER(partition by Cell ORDER BY Cell, LogDateTime) as RowNrOldest
    FROM Database
    where RowId = 1
    and Account = 1234
    order by LogDateTime

    I get the following output

    1234	2011-01-13 1	0.....944	NULL	      1
    1234 2011-05-13 1 0.....944 2011-01-13 2
    1234 2011-08-29 1 0.....944 2011-05-13 3
    1234 2011-08-29 1 0.....944 2011-08-29 4
    1234 2012-05-11 1 0.....944 2011-08-29 5
    1234 2013-02-02 1 0.....944 2012-05-11 6
    1234 2013-02-05 1 0.....944 2013-02-02 7
    1234 2013-02-05 1 0.....944 2013-02-05 8
    1234 2013-05-23 1 0.....944 2013-02-05 9
    1234 2014-05-09 1 0.....944 2013-05-23 10
    1234 2014-09-02 1 0.....944 2014-05-09 11
    1234 2015-05-14 1 0.....944 2014-09-02 12
    1234 2016-04-29 1 0.....598 2015-05-14 1
    1234 2016-05-11 1 0.....944 2016-04-29 13
    1234 2016-05-19 1 0.....944 2016-05-11 14
    1234 2016-05-26 1 0.....944 2016-05-19 15
    1234 2017-05-08 1 0.....944 2016-05-26 16
    1234 2017-06-17 1 0.....944 2017-05-08 17
    1234 2017-08-30 1 0.....944 2017-06-17 18
    1234 2017-08-30 1 0.....944 2017-08-30 19

    The second time the cell number resets to 0.....944, I dont want the row count to go onto 13, I want it to reset back to 1.

    Im looking only for the records where the Row_Number is 1 in order to build up a history of when last the cell number was reset for a user.

    I know Im missing something very small but for the life of me, I cant figure it out.

    Many thanks

  • chris.stuart

    SSCarpal Tunnel

    Points: 4965

    Hi

    I see the following, its a bit more involved and I dont know how its going to perform with my half a billion records but its worth a try

     

    https://stackoverflow.com/questions/27680999/resetting-row-number-according-to-record-data-change

     

  • Thom A

    SSC Guru

    Points: 98272

    This is more of a gaps and islands question. With functions like ROW_NUMBER it doesn't care about "gaps" in the data, if the value of the column (Cell in this case) is the same, it's in the same group. One way to achieve what you're after would be like the below:

    WITH VTE AS(
    SELECT *
    FROM (VALUES (1234,'20110113',1,'0.....944'),
    (1234,'20110513',1,'0.....944'),
    (1234,'20110829',1,'0.....944'),
    (1234,'20110829',1,'0.....944'),
    (1234,'20120511',1,'0.....944'),
    (1234,'20130202',1,'0.....944'),
    (1234,'20130205',1,'0.....944'),
    (1234,'20130205',1,'0.....944'),
    (1234,'20130523',1,'0.....944'),
    (1234,'20140509',1,'0.....944'),
    (1234,'20140902',1,'0.....944'),
    (1234,'20150514',1,'0.....944'),
    (1234,'20160429',1,'0.....598'),
    (1234,'20160511',1,'0.....944'),
    (1234,'20160519',1,'0.....944'),
    (1234,'20160526',1,'0.....944'),
    (1234,'20170508',1,'0.....944'),
    (1234,'20170617',1,'0.....944'),
    (1234,'20170830',1,'0.....944'),
    (1234,'20170830',1,'0.....944')) V(Account,LogDateTime, RowID,Cell)),
    Indicator AS(
    SELECT Account,
    LogDateTime,
    RowID,
    Cell,
    CASE LAG(Cell) OVER (ORDER BY LogDateTime) WHEN Cell THEN 0 ELSE 1 END AS GrpIndicator
    FROM VTE),
    --You can't use SUM on a window function, hence why we need 2 CTEs to achieve this.
    Grps AS(
    SELECT Account,
    LogDateTime,
    RowID,
    Cell,
    SUM(GrpIndicator) OVER (ORDER BY LogDateTime ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
    FROM Indicator)
    SELECT Account,
    LogDateTime,
    RowID,
    Cell,
    LAG(LogDateTime) OVER (ORDER BY LogDateTime) AS PrevLogDateTime,
    ROW_NUMBER() OVER (PARTITION BY Grp ORDER BY LogDateTime) AS RN
    FROM Grps;

    Or, alternatively, you can use ROW_NUMBER twice to create the groups, and then again for your numbering (some find this version a little harder to understand):

    --WITH VTE....
    Grps AS(
    SELECT Account,
    LogDateTime,
    RowID,
    Cell,
    ROW_NUMBER() OVER (ORDER BY LogDateTime) -
    ROW_NUMBER() OVER (PARTITION BY Cell ORDER BY LogDateTime) AS Grp
    FROM VTE)
    SELECT Account,
    LogDateTime,
    RowID,
    Cell,
    LAG(LogDateTime) OVER (ORDER BY LogDateTime) AS PrevLogDateTime,
    ROW_NUMBER() OVER (PARTITION BY Grp ORDER BY LogDateTime) AS RN
    FROM Grps
    ORDER BY LogDateTime;

    • This reply was modified 4 months ago by  Thom A. Reason: Alternative solution

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply