To compare previous rows then update

  • I am using SQL Server 2012 and fairly new to writing queries.  I am stuck with this one for a number of days and have done a lot of research in different forums, but I still cannot find an answer to my question.  So any help or feedback is appreciated.

    DECLARE @hrly_chg TABLE([emp] char(1),[hourly] numeric (15, 6), [date] DATETIME, [seq] INT);
    INSERT INTO @hrly_chg VALUES ('A', 46.23,'03/05/2017',0);
    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/6/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/5/2016',1);
    INSERT INTO @hrly_chg VALUES ('A',42.37,'6/5/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',42.37,'1/10/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',41.54,'8/16/2015',1);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'8/16/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'2/8/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'1/20/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/10/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/9/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/8/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'12/14/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',1);
    INSERT INTO @hrly_chg VALUES ('A',26.73,'1/1/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',25.22,'8/19/2013',0);

    SELECT * FROM @hrly_chg ORDER by emp, date, seq

    I have used LAG, subqueries, time variables...etc but none of them gets to what I wanted.  The "previous row" order would be by date and seq.  This is the desirable output I am trying to get to:

    empeffdtseqhourlylast change dateprevious hourly
    A8/19/20130$25.22  
    A1/1/20140$26.738/19/2013$25.22
    A8/19/20140$31.201/1/2014$26.73
    A8/19/20141$31.201/1/2014$26.73
    A12/14/20140$31.201/1/2014$26.73
    A1/8/20150$31.201/1/2014$26.73
    A1/9/20150$31.201/1/2014$26.73
    A1/10/20150$31.201/1/2014$26.73
    A1/20/20150$38.338/19/2014$31.20
    A2/8/20150$38.338/19/2014$31.20
    A8/16/20150$38.338/19/2014$31.20
    A8/16/20151$41.541/20/2015$38.33
    A1/10/20160$42.378/16/2015$41.54
    A6/5/20160$42.378/16/2015$41.54
    A6/5/20161$45.551/10/2016$42.37
    A6/6/20160$45.551/10/2016$42.37
    A3/5/20170$46.236/5/2016$45.55
  • Are you sure you actually tried LAG?

    SELECT
        emp
    ,    date AS effdt
    ,    seq
    ,    hourly
    ,    LAG(date) OVER (PARTITION BY emp ORDER BY date, seq) AS last_change_date
    ,    LAG(hourly) OVER (PARTITION BY emp ORDER BY date, seq) AS previous_hourly
    FROM @hrly_chg
    ORDER BY emp, date, seq

    John

  • faylc - Thursday, April 20, 2017 10:12 AM

    I am using SQL Server 2012 and fairly new to writing queries.  I am stuck with this one for a number of days and have done a lot of research in different forums, but I still cannot find an answer to my question.  So any help or feedback is appreciated.

    DECLARE @hrly_chg TABLE([emp] char(1),[hourly] numeric (15, 6), [date] DATETIME, [seq] INT);
    INSERT INTO @hrly_chg VALUES ('A', 46.23,'03/05/2017',0);
    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/6/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/5/2016',1);
    INSERT INTO @hrly_chg VALUES ('A',42.37,'6/5/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',42.37,'1/10/2016',0);
    INSERT INTO @hrly_chg VALUES ('A',41.54,'8/16/2015',1);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'8/16/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'2/8/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',38.33,'1/20/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/10/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/9/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/8/2015',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'12/14/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',1);
    INSERT INTO @hrly_chg VALUES ('A',26.73,'1/1/2014',0);
    INSERT INTO @hrly_chg VALUES ('A',25.22,'8/19/2013',0);

    SELECT * FROM @hrly_chg ORDER by emp, date, seq

    I have used LAG, subqueries, time variables...etc but none of them gets to what I wanted.  The "previous row" order would be by date and seq.  This is the desirable output I am trying to get to:

    empeffdtseqhourlylast change dateprevious hourly
    A8/19/20130$25.22  
    A1/1/20140$26.738/19/2013$25.22
    A8/19/20140$31.201/1/2014$26.73
    A8/19/20141$31.201/1/2014$26.73
    A12/14/20140$31.201/1/2014$26.73
    A1/8/20150$31.201/1/2014$26.73
    A1/9/20150$31.201/1/2014$26.73
    A1/10/20150$31.201/1/2014$26.73
    A1/20/20150$38.338/19/2014$31.20
    A2/8/20150$38.338/19/2014$31.20
    A8/16/20150$38.338/19/2014$31.20
    A8/16/20151$41.541/20/2015$38.33
    A1/10/20160$42.378/16/2015$41.54
    A6/5/20160$42.378/16/2015$41.54
    A6/5/20161$45.551/10/2016$42.37
    A6/6/20160$45.551/10/2016$42.37
    A3/5/20170$46.236/5/2016$45.55

    If you're willing to think outside of the box, this can be achieved by a method called "quirky update". It has several rules to ensure the correctness, but it should get you what you need. You can read about it in here: http://www.sqlservercentral.com/articles/T-SQL/68467/
    Post any questions that you have after reading the article.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • John,

    That doesn't produce the desired results.

    faylc,

    I don't see any description of what those desired results represent, in any portion of your post.   I'm not at all sure what rules you have in place that determine what the last change date and previous hourly values are supposed to be.   I had come up with the exact same code and John, but then noticed that's not the result you're after.   Please explain in ANAL detail what the rules are for determining those new values for any given input record.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • try this:

    DECLARE @hrly_chg TABLE([emp] char(1),[hourly] numeric (15, 6), [date] DATETIME, [seq] INT);

    INSERT INTO @hrly_chg VALUES ('A', 46.23,'03/05/2017',0);

    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/6/2016',0);

    INSERT INTO @hrly_chg VALUES ('A',45.55,'6/5/2016',1);

    INSERT INTO @hrly_chg VALUES ('A',42.37,'6/5/2016',0);

    INSERT INTO @hrly_chg VALUES ('A',42.37,'1/10/2016',0);

    INSERT INTO @hrly_chg VALUES ('A',41.54,'8/16/2015',1);

    INSERT INTO @hrly_chg VALUES ('A',38.33,'8/16/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',38.33,'2/8/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',38.33,'1/20/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/10/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/9/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'1/8/2015',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'12/14/2014',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',0);

    INSERT INTO @hrly_chg VALUES ('A',31.20,'8/19/2014',1);

    INSERT INTO @hrly_chg VALUES ('A',26.73,'1/1/2014',0);

    INSERT INTO @hrly_chg VALUES ('A',25.22,'8/19/2013',0);

    ;with t0 as (

    select emp,

    hourly,

    max(date) [maxdt],

    row_number() over (order by max(date)) [seq]

    from @hrly_chg

    group by emp,hourly)

    , t1 as (

    select top 1 emp, NULL [hourly], NULL [maxdt], 0 [seq] from @hrly_chg

    union

    select emp,

    hourly,

    max(date) [maxdt],

    row_number() over (order by hourly) [seq]

    from @hrly_chg a

    group by emp,hourly)

    , t2 as (

    SELECT

    a.emp

    , a.date AS effdt

    , a.seq

    , a.hourly

    ,b.seq as joinseq

    --,c.maxdt, c.hourly

    FROM @hrly_chg a

    join t0 b

    on a.emp=b.emp

    and a.hourly=b.hourly )

    SELECT

    a.emp

    , a.effdt

    , a.seq

    , a.hourly

    ,b.maxdt

    , b.hourly

    FROM t2 a

    left join t1 b

    on a.emp = b.emp

    and a.joinseq-1=b.seq

    ORDER BY a.emp, a.effdt, a.seq

  • Thanks for the replies.

    John, I have tried that but it did not get to the results I wanted because the LAG function did not go back far enough to the first effective date (effdt) of the change in hourly.  For example, on 6/5/2016, the employee's hourly rate was increased $42.37 to $45.55 on the same date (distinguish by seq 0 and 1).  For the seq 1 row, I want to display the last increase was 1/10/2016 because that is when $42.37 first became effective.  So the LAG would only go back 1 row , which is not far enough.

    Steve, I will try to explain this in plain English, so please bear with me.  This is a salary increase table driven by the date (basically an effective date) and the sequence number.  When an employee gets a salary increase, it would insert a data row in there.  The tricky part (for me anyway) is, an employee could have multiple salary increases in the same day, and that is when the sequence number (0, 1, 2...etc) would come in to rank the order.  In other words, this table's is sorted in chronological order by date, and by sequence number, whenever there is a change, as evident in the example I posted.

    The solution I am trying to find is, let's take the last  (if sorted ascending by 'effdt' and 'seq') few rows as an example.

    empeffdtseqhourlylast change dateprevious hourly

    A    8/16/20151$41.541/20/2015$38.33
    A1/10/20160$42.378/16/2015$41.54
    A6/5/20160$42.378/16/2015$41.54
    A6/5/20161$45.551/10/2016$42.37
    A6/6/20160$45.551/10/2016$42.37
    A3/5/20170$46.236/5/2016$45.55

    So on 3/5/2017, this employee had an hourly rate changed from $45.55 to $46.23.  So in that row, I'd like to display the previous hourly rate ($45.55) and its minimum effective date (6/5/2016 seq 1).  For the 6/6/2016 row, where the hourly rate is $45.55, I'd like to display the last increase date from the previous hourly rate ($42.37) and the minimum effective date (1/10/2016 seq 0).

    I have tried to different ways like grouping the effdt, seq, and hourly in order to get the minimum effective date, but I could not get those data values to update correctly.

    So any help is appreciated.  Thanks again.

  • Hi Chenthor, I have tried your query and it returned the "previous 1 row" when the hourly was changed, but not the minimum date that I am trying to get at.  Here are some of the results from your query:

    A    2016-06-05 00:00:00.000    1    45.550000    2016-06-05 00:00:00.000    42.370000
    A    2016-06-06 00:00:00.000    0    45.550000    2016-06-05 00:00:00.000    42.370000
    A    2017-03-05 00:00:00.000    0    46.230000    2016-06-06 00:00:00.000    45.550000

    So for the 2017-03-05 row, I'd to display $45.55 as the previous hourly rate (in which is in your query result), and 2016-06-05 (seq 1) as the minimum effective date/row of the previous hourly ($42.37).

  • no worries, change the MAX(date) in both t0 and t1 queries to MIN(date) and that will get the first instance of the previous amount.

    emp effdt seq hourly maxdt hourly
    A 2016-06-05 00:00:00.000 0 42.370000 2015-08-16 00:00:00.000 41.540000
    A 2016-06-05 00:00:00.000 1 45.550000 2016-01-10 00:00:00.000 42.370000
    A 2016-06-06 00:00:00.000 0 45.550000 2016-01-10 00:00:00.000 42.370000
    A 2017-03-05 00:00:00.000 0 46.230000 2016-06-05 00:00:00.000 45.550000

  • I believe that this gives the correct results.  It's also requires far fewer reads than Chanthor's version.

    ;
    WITH starts AS
    (
        SELECT *,
            CASE WHEN hc.hourly <> LAG(hc.hourly, 1, 0) OVER(PARTITION BY hc.emp ORDER BY hc.date, hc.seq) THEN 1 ELSE 0 END AS is_start
        FROM @hrly_chg AS hc
    )
    , groups AS
    (
        SELECT *, SUM(s.is_start) OVER(PARTITION BY s.emp ORDER BY s.date, s.seq ROWS UNBOUNDED PRECEDING) AS grp
        FROM starts s
    )
    , prev_vals AS
    (
        SELECT *,
            CASE WHEN is_start = 1 THEN LAG(g.date) OVER(PARTITION BY emp, is_start ORDER BY grp) END AS last_change_date,
            CASE WHEN is_start = 1 THEN LAG(g.hourly) OVER(PARTITION BY emp, is_start ORDER BY grp) END AS previous_hourly
        FROM groups g
    )
    SELECT pv.emp, pv.date, pv.seq, pv.hourly,
        MAX(pv.last_change_date) OVER(PARTITION BY pv.emp, pv.grp) AS last_change_dt,
        MAX(pv.previous_hourly) OVER(PARTITION BY pv.emp, pv.grp) AS previous_hourly
    FROM prev_vals pv

    Part of the trick was to only include values from records where there was a change in the hourly wage.  That is why I included is_start in the partition.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi Chentor and Drew, thank you for the suggestions. The main server is down for maintenance and when it is up again, I will try both ways for efficiency and speed. Thanks again.

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply