Difficult delete question

  • Hi,

    I have the following table and I want to remove alle entries except the 3 last ones per customer. (the last 3 records in time)

    Is this possible using an SQL statement?

    Timestamp Customer data

    3/03/2016 Aaaaaa

    1/02/2016 Abbbbbb

    1/01/2016 Acccccccc

    1/12/2015 Aaaaaa<< to be removed

    31/10/2015Bbbbbbb

    30/09/2015Bcccccccc

    30/08/2015Baaaaa

    30/07/2015Bbbbbbb<< to be removed

    29/06/2015Bcccccccc<< to be removed

    ....

    ....

  • We don't know anything about your table, so here is something that works on my environment

    ;WITH cteBaseData AS (

    SELECT [Timestamp], Customer

    , seq = ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY [Timestamp] DESC)

    FROM #Customer

    )

    DELETE src

    FROM #Customer AS src

    INNER JOIN cteBaseData AS cte

    ON src.Customer = cte.Customer

    AND src.[Timestamp] = cte.[Timestamp]

    WHERE cte.seq > 3;

  • There's no need to join back to the table to do the DELETE. You can delete from the table by deleting from the CTE.

    Here's the OPs original data in a readily consumable format...

    --===== Create and populate a test table on-the-fly

    SET DATEFORMAT DMY

    ;

    SELECT [Timestamp] = CONVERT(DATE ,d.[Timestamp])

    ,Customer = CONVERT(CHAR(1) ,d.Customer)

    ,Data = CONVERT(VARCHAR(10),d.Data)

    INTO #TestTable -- DROP TABLE #TestTable

    FROM (

    SELECT '3/03/2016','A','aaaaa' UNION ALL

    SELECT '1/02/2016','A','bbbbbb' UNION ALL

    SELECT '1/01/2016','A','cccccccc' UNION ALL

    SELECT '1/12/2015','A','aaaaa' UNION ALL

    SELECT '31/10/2015','B','bbbbbb' UNION ALL

    SELECT '30/09/2015','B','cccccccc' UNION ALL

    SELECT '30/08/2015','B','aaaaa' UNION ALL

    SELECT '30/07/2015','B','bbbbbb' UNION ALL

    SELECT '29/06/2015','B','cccccccc'

    ) d ([Timestamp],Customer,Data)

    ;

    Here's the DELETE through the cte. Try it. It works just fine.

    --===== Delete row in the table through the cte

    WITH cteDelete AS

    (

    SELECT Customer

    ,RowNum = ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY [TimeStamp] DESC)

    FROM #TestTable

    )

    DELETE FROM cteDelete

    WHERE RowNum > 3

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/29/2016)


    There's no need to join back to the table to do the DELETE. You can delete from the table by deleting from the CTE.

    [/code]

    Bringing this point up is most relevant. Even though I already knew about it, I keep forgetting about it. While trying to evaluate the proposed solution in my head, I failed to see the join was superfluous.

    Got to admit, it is far from intuitive that acting on a CTE reflects directly on the table it is based on. So easy to use CTE's it is also easy to overlook "side effects" (not a side effect really but if you forget about the implications, then, an apparent side effect).

    Adding the extra joins offers the advantage of stating the intent clearly instead on relying on an "implicit" behaviour. Unless of course performance of such a construct becomes a performance bottleneck, which I doubt.

    Thanks

  • j-1064772 (5/30/2016)


    Adding the extra joins offers the advantage of stating the intent clearly instead on relying on an "implicit" behaviour. Unless of course performance of such a construct becomes a performance bottleneck, which I doubt.

    Although adding the extra join does add a bit of clarity, it also uses more resources and takes extra time. Although its not much, why use extra resources when it can be avoided? If you think other people won't understand, add a simple comment for clarity would make the code quite clear and not use any extra resources in the process. If you always write code that uses the fewest resources possible, then you like won't ever have to worry about such things becoming a bottleneck.

    And it's not "implicit" behavior. It's stated in Books Online... not as clearly as one would expect but it's there. It works similar to a view for these types of things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I agree with all your points. It's just that off-hand, even though it is in BOL, the net result SEEMS like a side effect. (By "implicit" I meant not overtly explicit.)

    You're right, a comment would remove the "missing" bit of info.

  • j-1064772 (5/30/2016)


    Jeff Moden (5/29/2016)


    There's no need to join back to the table to do the DELETE. You can delete from the table by deleting from the CTE.

    [/code]

    Bringing this point up is most relevant. Even though I already knew about it, I keep forgetting about it. While trying to evaluate the proposed solution in my head, I failed to see the join was superfluous.

    Got to admit, it is far from intuitive that acting on a CTE reflects directly on the table it is based on. So easy to use CTE's it is also easy to overlook "side effects" (not a side effect really but if you forget about the implications, then, an apparent side effect).

    Adding the extra joins offers the advantage of stating the intent clearly instead on relying on an "implicit" behaviour. Unless of course performance of such a construct becomes a performance bottleneck, which I doubt.

    Thanks

    I'm right there with you...this seems to be one I always forget about but it would really come in handy at times.


    SELECT quote FROM brain WHERE original = 1
    0 rows returned

  • Thanks Jeff, great! 🙂

  • yvesql (5/31/2016)


    Thanks Jeff, great! 🙂

    You're welcome.

    Since it came up, I need to check... you understand how it works? If not, please post back and I'll explain more. It's a powerful tool (I've necessarily done it through 3 cascading CTEs in the past to keep things simple, efficient, and fast) and very worthwhile knowing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I think your windowing function needs a DESC, like DesNorton's solution has. I feel like a jerk writing this, since your Jedi SQL skills far surpass mine, but since ROW_NUMBER() can be confusing for people new to these functions, I thought I should mention it.

    --=Chuck

  • Not on the original topic, but relevant: The day I discovered I could Merge into a CTE, and thus limit the target rows with whatever Where clause I felt like, was a very good day.

    CTEs, used appropriately, are a great way to simplify otherwise complex data manipulation tasks.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (5/31/2016)


    CTEs, used appropriately, are a great way to simplify otherwise complex data manipulation tasks.

    Just finishing K. Kellenberger's "Expert T-SQL Window functions in SQL Server". CTE's are used extensively. In chapter 5, she uses CTE's to present an SS2K12 extremely readable, compact way of producing a report showing new monthly subscriptions and cancelled monthly subscriptions. The performance is astonishing. She then refers to a Speed Phreakery competition dating back to the SS2K8 times where people with a gazillion bragging points (rightfully deserved) proposed performing solutions - solutions I would have NEVER figured out on my own. She even wrote a follow-up piece to explain the solutions.

    SS2K8 vs SS2K12 (why don't we just write SS2012 ?): because SS2K8 did not provide for the [font="Courier New"]ORDER BY[/font] component in the [font="Courier New"]OVER[/font] clause SUM () OVER ( ORDER BY n.TheMonth )

    Already extremely useful by themselves, CTE's combined with the new SS2K12 windowing functions are spectacular.

  • chuck.forbes (5/31/2016)


    Jeff,

    I think your windowing function needs a DESC, like DesNorton's solution has. I feel like a jerk writing this, since your Jedi SQL skills far surpass mine, but since ROW_NUMBER() can be confusing for people new to these functions, I thought I should mention it.

    --=Chuck

    Nah... not a jerk. In this case, I'm the jerk because I flat forgot to add it (DESC). Thanks for the catch, Chuck. I'll add it to the original code when I get home. I'd do it now but can't because my work connection has a script blocker on it and it won't allow me to submit any code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • First, thanks again to chuck.forbes for picking up on me forgetting the DESC sort order. I've corrected the script and tested it. Looks like it's good to go.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • yvesql (5/31/2016)


    Thanks Jeff, great! 🙂

    I did, however, forget to add DESC to the ROW_NUMBER() sort. I've corrected the script. My apologies for my oversight.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply