Records with 2 dates 48 hours from each other

  • Hi folks and thanks in advance for any help.

    This is boggling me a bit. I have a table that is to do with fault callouts for a company. If there is a fault in a shop store, an Engineer is called out and this is recorded in the table.

    I've been asked to identify when a callout for the same store has been made with 48 hours of the last one. This is to show if an engineer has not corrected the problem and has had to come back out to fix it. This will evaluate Engineers performance.

    Here is sample data that contains the storeID, the callout dates and always different faultIDs

    FAultID StoreID Callout Date

    6279170142222010-04-26 00:00:00.000

    6418740142222010-06-28 00:00:00.000

    10002780142222010-06-28 16:35:00.000

    10003010142222010-07-01 17:08:00.000

    10003950142222010-07-05 12:00:00.000

    10003220142222010-07-06 07:00:00.000

    10003980142222010-07-06 08:44:00.000

    As you can see, there are three sets of dates that are within 48 hours of each other:

    FaultIDs 641874 and 1000278 are 16.35 hours from each other.

    faultIDs 1000395, 1000322 are around 19 hours from each other.

    Also, 1000322 and 1000398 are aound 1.44 hours from each other.

    These five records would be shown with the callout notes to determine if an engineer has not corrected the fault properly and has had to come back out.

    It should look like this

    FAultID StoreID Callout Date

    6418740142222010-06-28 00:00:00.000

    10002780142222010-06-28 16:35:00.000

    10003950142222010-07-05 12:00:00.000

    10003220142222010-07-06 07:00:00.000

    10003980142222010-07-06 08:44:00.000

    FaultID 627917 and 1000301 are not included in this report because they're outwith 48 hours of any other dates in the column for that particular store number.

    I'm very unsure how to do grab only records that fit the above criteria. Is there anyone that can help me?

    Many thanks

  • I haven't checked performance on this but it should work:

    CREATE TABLE #temp

    (

    faultID INT,

    storeID VARCHAR(6),

    calloutDate DATETIME

    )

    INSERT INTO #temp

    SELECT 627917, '014222', '04/26/10'

    UNION ALL

    SELECT 641874, '014222','06/28/10'

    UNION ALL

    SELECT 1000278, '014222', '06/28/10 16:35'

    UNION ALL

    SELECT 1000307, '014222','07/01/10 17:08'

    UNION ALL

    SELECT 1000395, '014222','07/05/10 12:00'

    UNION ALL

    SELECT 1000322, '014222','07/06/10 7:00'

    UNION ALL

    SELECT 1000398, '014222','07/06/10 8:44'

    SELECT * FROM #temp T

    WHERE EXISTS (SELECT * FROM #temp T2

    WHERE t.storeID = t2.storeID

    AND t2.calloutDate > t.calloutDate

    AND t2.calloutDate < DATEADD(hh,48,t.calloutDate))

    union

    SELECT * FROM #temp T

    WHERE EXISTS (SELECT * FROM #temp T2

    WHERE t.storeID = t2.storeID

    AND t2.calloutDate < t.calloutDate

    AND t2.calloutDate > DATEADD(hh,-48,t.calloutDate))

    ORDER BY T.calloutDate

    DROP TABLE #temp

  • Hey Matt, very smart, thanks a lot for this. It seems to work and performance seems ok

    🙂

  • This form of a query is known as a "triangular join". You might want to check out this article[/url] for clues on why this query just might be the worse thing you've ever done to your server. If you are running this on a large table, you can quite literally bring the server to it's knees.

    Edit: Just an FYI: I ran this query against those 7 rows of sample data... it had to touch 98 rows to get the answer.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Okay, so now that I blasted the previous query, here's one that will perform better.

    Remarks are within the code.

    if OBJECT_ID('tempdb..#temp') IS NOT NULL DROP TABLE #temp;

    CREATE TABLE #temp

    (-- need to sequential number the rows, and have a clustered index on it

    RowID INT IDENTITY PRIMARY KEY CLUSTERED,

    faultID INT,

    storeID VARCHAR(6),

    calloutDate DATETIME);

    INSERT INTO #temp

    SELECT 627917, '014222', '04/26/10'

    UNION ALL

    SELECT 641874, '014222','06/28/10'

    UNION ALL

    SELECT 1000278, '014222', '06/28/10 16:35'

    UNION ALL

    SELECT 1000307, '014222','07/01/10 17:08'

    UNION ALL

    SELECT 1000395, '014222','07/05/10 12:00'

    UNION ALL

    SELECT 1000322, '014222','07/06/10 7:00'

    UNION ALL

    SELECT 1000398, '014222','07/06/10 8:44'

    -- get the results

    -- this first query gets the first record if it's within 48 hours of the next

    SELECT t1.*

    FROM #temp t1

    JOIN #temp t2

    ON t2.RowID = t1.RowID + 1

    AND DATEDIFF(hour, t1.calloutDate, t2.calloutDate) <= 48

    UNION -- to eliminate any duplicates, do not use UNION ALL

    -- this second query gets the second record if it's within 48 hours of the previous

    SELECT t1.*

    FROM #temp t1

    JOIN #temp t2

    ON t2.RowID +1= t1.RowID

    AND DATEDIFF(hour, t2.calloutDate, t1.calloutDate) <= 48

    For a comparison, this has to touch just 20 records. Almost 20% of the other query.

    From the IO stats: 2 scans vs 4.

    Edit: I forgot to note that there is probably an even more efficient way to do this, but I can't quite figure it out right now.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Wayne,

    Thanks for pointing out the triangular join I had going on. I never see them until someone points them out. I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.

  • This recent article:

    Linking To Previous Row

    If you're in 2k5 or greater, this is an elegant solution to your self referencing issues.

    [edit] Sorry, wrong section for that solution. [/edit]


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • matt6288 (9/1/2010)


    Wayne,

    Thanks for pointing out the triangular join I had going on. I never see them until someone points them out.

    No problem. (Just look for < or >)

    I can definitely see the benefits you show in your query but it looks like that assumes order in the table and only 1 storeID exisiting. Since this is in the SQL 7 or 2000 section would the next thing to use would be a numbers table to join to the previous/next row.

    The code I used should be SQL 7/2000 compatible. However, you are right - I don't think I considered different storeIDs. You guarantee the order by the identity column in the temp table - you just need to ensure that the select that populates it is ordered by storeID/Date.

    I first came up with a "quirky update" solution, but then realized that this was only getting me the rows that were 48 hours after the previous - it wasn't getting the previous row unless it also was within 48 hours of it's previous row. And this did handle multiple storeIDs. Looks like I might have to revisit it... I'll be back with a better version shortly.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.

    create table#Sample

    (

    FaultID INT NOT NULL,

    StoreID INT NOT NULL,

    CallOut SMALLDATETIME NOT NULL

    )

    INSERT#Sample

    SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL

    SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL

    SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL

    SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL

    SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL

    SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL

    SELECT1000398, 14222, '2010-07-06 08:44'

    -- Solution here

    SELECT DISTINCTx.FaultID,

    x.StoreID,

    x.CallOut

    FROM(

    SELECTFaultID,

    StoreID,

    CallOut,

    DATEADD(HOUR, -48, CallOut) AS FromTime,

    DATEADD(HOUR, 48, CallOut) AS ToTime

    FROM#Sample

    ) AS s

    INNER JOIN#Sample AS x ON x.StoreID = s.StoreID

    WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime

    AND x.FaultID <> s.FaultID

  • cidr (9/2/2010)


    Just to let you know. I tried this one from another forum and it touched 56 rows. I've not tried your method yet Wayne, but I wil let you know how I get on.

    create table#Sample

    (

    FaultID INT NOT NULL,

    StoreID INT NOT NULL,

    CallOut SMALLDATETIME NOT NULL

    )

    INSERT#Sample

    SELECT 627917, 14222, '2010-04-26 00:00' UNION ALL

    SELECT 641874, 14222, '2010-06-28 00:00' UNION ALL

    SELECT1000278, 14222, '2010-06-28 16:35' UNION ALL

    SELECT1000301, 14222, '2010-07-01 17:08' UNION ALL

    SELECT1000395, 14222, '2010-07-05 12:00' UNION ALL

    SELECT1000322, 14222, '2010-07-06 07:00' UNION ALL

    SELECT1000398, 14222, '2010-07-06 08:44'

    -- Solution here

    SELECT DISTINCTx.FaultID,

    x.StoreID,

    x.CallOut

    FROM(

    SELECTFaultID,

    StoreID,

    CallOut,

    DATEADD(HOUR, -48, CallOut) AS FromTime,

    DATEADD(HOUR, 48, CallOut) AS ToTime

    FROM#Sample

    ) AS s

    INNER JOIN#Sample AS x ON x.StoreID = s.StoreID

    WHEREx.CallOut BETWEEN s.FromTime AND s.ToTime

    AND x.FaultID <> s.FaultID

    This may be slightly better

    SELECT x.FaultID,

    x.StoreID,

    x.CallOut

    FROM #Sample x

    WHERE EXISTS (SELECT * FROM #Sample s

    WHERE x.FaultID <> s.FaultID

    AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.CallOut) AND DATEADD(HOUR, 48, s.CallOut))

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • mark-101232 02/09/10

    This may be slightly better

    SELECT x.FaultID,

    x.StoreID,

    x.CallOut

    FROM #Sample x

    WHERE EXISTS (SELECT * FROM #Sample s

    WHERE x.FaultID <> s.FaultID

    AND x.CallOut BETWEEN DATEADD(HOUR, -48, s.Call

    I tried this with the whole table which ouputs 5233 rows and it uses a lazy spool. I'm not really sure about lazy spools and will have to look it up but it seems Actual number of rows was 10726980 amonst other activity.

  • Repeat post mistake

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply