Best way to find certain records.

  • I have to find certain data from a table based on date range & and activity.

    So, if a records is 60 days old for a certain activity and do not fall within 90 & 120 days for other activity and do not have anything less than 60 days as create date, we should delete it.

    The script is to crete table & data is as below.

    So from the given example we have to delete only client id 5.

    Create table

    TestPurgeClient(

    ClientId int,

    ActivityId int,

    CreateDate Datetime)

    --select * from TestPurgeClient

    --All

    insert into TestPurgeClient values(1,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(1,33, '2013-02-16 00:00:00.000')

    insert into TestPurgeClient values(1,34,'2013-01-17 00:00:00.000')

    -- 60 day + Current

    insert into TestPurgeClient values(2,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(2,72,GETDATE())

    --60 + 90 day

    insert into TestPurgeClient values(3,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(3,33, '2013-02-16 00:00:00.000')

    --All + current

    insert into TestPurgeClient values(4,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(4,33, '2013-02-16 00:00:00.000')

    insert into TestPurgeClient values(4,34,'2013-01-17 00:00:00.000')

    insert into TestPurgeClient values(4,34,GETDATE())

    --60 Day Only

    insert into TestPurgeClient values(5,72, '2013-03-18 00:00:00.000')

  • as a start...try this

    SELECT ClientId

    FROM TestPurgeClient

    GROUP BY ClientId

    HAVING (SUM(CASE WHEN DATEDIFF(d, CreateDate, GETDATE()) > 90 THEN 1 ELSE 0 END) = 0)

    AND (SUM(CASE WHEN DATEDIFF(d, CreateDate, GETDATE()) <= 60 THEN 1 ELSE 0 END) = 0)

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Thank you very much for your kind response. For this case it is working fine. But in this SQL activity ID is not taken into consideration at all and that's my mistake, because I haven't explicitly mentioned that.

    So in the above example, we have to select records if:-

    Activity ID = 72 & CreateDate is older than 60 days.

    Activity ID = 33 & CreateDate is older than 90 days.

    Activity ID = 34 & CreateDate is older than 120 days.

    So if we create one more record, this query fails.

    insert into TestPurgeClient values(5,34, '2013-03-18 00:00:00.000')

    But again, thank you very much again for taking take to look into this.

  • New Person (5/18/2013)


    Thank you very much for your kind response. For this case it is working fine. But in this SQL activity ID is not taken into consideration at all and that's my mistake, because I haven't explicitly mentioned that.

    So in the above example, we have to select records if:-

    Activity ID = 72 & CreateDate is older than 60 days.

    Activity ID = 33 & CreateDate is older than 90 days.

    Activity ID = 34 & CreateDate is older than 120 days.

    So if we create one more record, this query fails.

    insert into TestPurgeClient values(5,34, '2013-03-18 00:00:00.000')

    But again, thank you very much again for taking take to look into this.

    I am assuming that you will want to delete all records for clients where the above rules apply.??

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Yes. The issue I am facing that one client may have multiple create date & multiple activities, and I have to check that even if client falls with in that criteria, is not having any recent create date. Like the records with getdate().

  • can we start again with a bigger data set that demonstrates all your possibilities and the expected outcome.

    sorry...but i'm obviously not thinking clearly 😛

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I am very sorry, for not being clear.

    Let me try again..

    --All

    insert into TestPurgeClient values(1,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(1,33, '2013-02-16 00:00:00.000')

    insert into TestPurgeClient values(1,34,'2013-01-17 00:00:00.000')

    -- 60 day + Current

    insert into TestPurgeClient values(2,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(2,72,GETDATE())

    --60 + 90 day

    insert into TestPurgeClient values(3,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(3,33, '2013-02-16 00:00:00.000')

    --All + current

    insert into TestPurgeClient values(4,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(4,33, '2013-02-16 00:00:00.000')

    insert into TestPurgeClient values(4,34,'2013-01-17 00:00:00.000')

    insert into TestPurgeClient values(4,34,GETDATE())

    --60 Day Only + 90 day

    insert into TestPurgeClient values(5,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(5,34, '2013-03-18 00:00:00.000')

    --60 Day Only + all activity

    insert into TestPurgeClient values(6,72, '2013-03-18 00:00:00.000')

    insert into TestPurgeClient values(6,33, '2013-01-18 00:00:00.000')

    insert into TestPurgeClient values(6,34,'2012-01-17 00:00:00.000')

    --90 Day Only

    insert into TestPurgeClient values(7,33, '2013-02-16 00:00:00.000')

    insert into TestPurgeClient values(7,34, '2012-02-16 00:00:00.000')

    --120 Day only

    insert into TestPurgeClient values(8,34,'2013-01-17 00:00:00.000')

    The above mentioned records contains all the possibilities.

    In the above example only client 6, 7, & 8 are candidates for delete. So even if client Id 6 has multiple

    Create date with multiple activities, it is older than 60 days for activity 72 and older than 90 days

    for Activity 33 & older than 120 days for Activity 34.

    And it does not have any current activity, means less than 60 days.

    Same case with Client 7, the create date is older than 90 days for activity 33 and 120 days for activity

    34

    Same for Client 8, which is older than 120 days and do not fall in any activity less than 120 days.

  • And thank you very much again for taking time on Saturday morning for helping me out..

  • why doesn't ClientId "1" qualify?

    SELECT ClientId, ActivityId, CreateDate, DATEDIFF(d, CreateDate, GETDATE()) AS datediff

    FROM TestPurgeClient

    WHERE (ClientId = 1)

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • I am very sorry. You are right . Client Id 1 qualifies all the criteria.

  • New Person (5/18/2013)


    I am very sorry. You are right . Client Id 1 qualifies all the criteria.

    so can I assume ClientId 3 qualifies as well?

    SELECT ClientId, ActivityId, CreateDate, DATEDIFF(d, CreateDate, GETDATE()) AS datediff

    FROM TestPurgeClient

    WHERE (ClientId = 3)

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Yes.

  • ok..try this...far from elegant, and performance may be poor dependent upon number of rows and indexes

    hopefully some else will chime in with a better idea.....

    SELECT DISTINCT ClientId

    FROM TestPurgeClient

    WHERE ClientId

    NOT IN (SELECT DISTINCT ClientId

    FROM TestPurgeClient

    WHERE (ActivityId = 72) AND (DATEDIFF ( d , CreateDate , GETDATE ( )) < 60) )

    AND ClientId

    NOT IN (SELECT DISTINCT ClientId

    FROM TestPurgeClient

    WHERE (ActivityId = 33) AND (DATEDIFF ( d , CreateDate , GETDATE ( )) < 90) )

    AND ClientId

    NOT IN (SELECT DISTINCT ClientId

    FROM TestPurgeClient

    WHERE (ActivityId = 34) AND (DATEDIFF ( d , CreateDate , GETDATE ( )) < 120) );

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Perfect. Thank you very much. Looks good. These columns are part of a very big table but fortunately they are indexed.

    I checked this in our DB and it took 6 seconds, which I will consider not bad, since I have to run this as a part of nightly job, which will run off hours.

    Again thanks very much.. you are a gem.

  • I've not checked JLS's solution but, if that's the one that works, then you might be able to get a wee bit more performance out of it by making a tweek to the criteria so that it's 100% SARGable. Like this...

    SELECT DISTINCT ClientId

    FROM dbo.TestPurgeClient

    WHERE ClientId NOT IN (SELECT DISTINCT ClientId

    FROM dbo.TestPurgeClient

    WHERE ActivityId = 72 AND CreateDate > GETDATE()-60)

    AND ClientId NOT IN (SELECT DISTINCT ClientId

    FROM dbo.TestPurgeClient

    WHERE ActivityId = 33 AND CreateDate > GETDATE()-90)

    AND ClientId NOT IN (SELECT DISTINCT ClientId

    FROM dbo.TestPurgeClient

    WHERE ActivityId = 34 AND CreateDate > GETDATE()-120)

    ;

    The key is to not put columns into functions so that indexes can actually use them as part of a seek.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply