Duplicates After Self Join

  • I have a table that has a record of all of the work orders that have been performed. there are two types of orders. Installs and Trouble Calls. My query is to find all of the trouble calls that have taken place within 30 days of an install and match that trouble call (TC) to the proper Install (IN). So the Trouble Call date has to happen after the install but no more than 30 days after. Additionally if there are two installs and two trouble calls for the same account all within 30 days and they happen in order the results have to reflect that. The problem I am having is I am getting an Install order matching to two different Trouble Calls (TC) and a Trouble Call(TC) that is matching to two different Installs(IN)

    In the example on SQL Fiddle pay close attention to the install order number 1234567810 and the Trouble Call order number 1234567890 and you will see the issue I am having.

    SQL Fiddle of the example

    For those that don't want to review what is on the SQL Fiddle here is the example

    create table workorders

    (

    WorkOrderNumber varchar(10) not null,

    AccountNumber varchar(10) not null,

    JobType varchar(2) not null,

    ScheduledDate date not null

    )

    insert into workorders values ('1234567890','666','TC','12/12/2015')

    insert into workorders values ('1234567820','666','IN','12/11/2015')

    insert into workorders values ('1234567810','666','IN','12/01/2015')

    insert into workorders values ('1234567830','666','TC','12/05/2015')

    insert into workorders values ('1234567822','777','IN','11/11/2015')

    insert into workorders values ('1234567444','777','TC','10/20/2015')

    insert into workorders values ('1234567333','777','TC','08/15/2015')

    insert into workorders values ('1234567777','555','TC','04/21/2015')

    insert into workorders values ('1234568888','555','IN','06/12/2015')

    insert into workorders values ('1234566666','555','IN','06/14/2015')

    insert into workorders values ('1234544444','555','IN','07/15/2015')

    insert into workorders values ('1234999999','555','TC','06/13/2015')

    insert into workorders values ('1234562222','555','TC','12/12/2015')

    select b.accountnumber,MAX(b.scheduleddate) as OriginalDate,b.workordernumber as OriginalOrder,b.jobtype as OriginalType,MIN(a.scheduleddate) as NewDate,a.workordernumber as NewOrder,a.jobtype as NewType from (

    select workordernumber,accountnumber,jobtype,scheduleddate from workorders where jobtype = 'TC'

    ) a join

    (select workordernumber,accountnumber,jobtype,scheduleddate from workorders where jobtype = 'IN'

    ) b

    on a.accountnumber = b.accountnumber

    group by b.accountnumber,b.scheduleddate,b.workordernumber,b.jobtype,a.accountnumber,a.scheduleddate,a.workordernumber,a.jobtype

    having MIN(a.scheduleddate) > MAX(b.scheduleddate) and DATEDIFF(day,MAX(b.scheduleddate),MIN(a.scheduleddate)) < 31

  • This should get you started:

    SELECT *

    FROM workorders i

    CROSS APPLY (

    SELECT *

    FROM workorders tc

    WHERE tc.JobType = 'TC'

    AND tc.AccountNumber = i.AccountNumber

    AND tc.ScheduledDate BETWEEN i.ScheduledDate AND DATEADD(DAY,30,i.ScheduledDate)

    ) x

    WHERE i.JobType = 'IN'

    AND NOT EXISTS (

    SELECT 1

    FROM workorders wi

    WHERE wi.AccountNumber = i.AccountNumber

    AND wi.JobType = 'IN'

    AND wi.ScheduledDate > i.ScheduledDate

    AND x.ScheduledDate BETWEEN wi.ScheduledDate AND DATEADD(DAY,30,wi.ScheduledDate)

    )

    ORDER BY i.AccountNumber

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Initial testing this appears to be the answer. I am waiting on my testing server to catch up to me, once I perform some tests I will respond on the results.

    Thank you

  • Here's an alternative solution, but with some additions to the sample data it gives different results than Chris'.

    What is the expected result with the following rows added?

    INSERT INTO workorders VALUES ('1234562221','111','IN','12/11/2015');

    INSERT INTO workorders VALUES ('1234562222','111','IN','12/12/2015');

    INSERT INTO workorders VALUES ('1234562223','111','TC','12/13/2015');

    INSERT INTO workorders VALUES ('1234562224','111','TC','12/14/2015');

    --My try at a solution

    WITH installs_matched_to_trouble AS

    (

    SELECT install.AccountNumber AS account_number,

    install.WorkOrderNumber AS install_work_order,

    install.ScheduledDate AS install_schedule_date,

    trouble.WorkOrderNumber AS trouble_work_order,

    trouble.ScheduledDate AS trouble_schedule_date,

    DENSE_RANK() OVER (PARTITION BY install.AccountNumber ORDER BY install.ScheduledDate ASC) AS install_dr,

    DENSE_RANK() OVER (PARTITION BY install.AccountNumber ORDER BY trouble.ScheduledDate ASC) AS trouble_dr

    FROM workorders AS install

    INNER JOIN (

    SELECT *

    FROM workorders

    WHERE jobtype='TC'

    ) trouble

    ON install.accountnumber=trouble.accountnumber

    AND

    DATEDIFF(dd,install.scheduleddate,trouble.scheduleddate) BETWEEN 1 AND 30

    WHERE install.jobtype='IN'

    )

    SELECT account_number,

    install_work_order,

    install_schedule_date,

    trouble_work_order,

    trouble_schedule_date

    FROM installs_matched_to_trouble

    WHERE instalL_dr=trouble_dr;

  • ChrisM@Work although with the initial semi clean data I provided it appeared your query did the trick but once placed into the live environment it did not catch all of the issues. and ended up providing duplicates, some of which I didn't even think would be an issue ended up being. but then again that is what this is all about trial and error.

    Jacob Wilkins: Your query along with my tweaking due to the live data is not as clean as what was provided. Is showing huge promise. So far it appears that it is not producing any of the duplicate issues I have been experiencing but tomorrow I will be testing to make sure that it is not leaving out records it shouldn't be.

    as far as understanding of the querys go. Jacob, yours was a lot easier to understand. Not to take anything away from Chris I am just not a SQL guru and have never dealt with CROSS APPLY before. The concept is not easy for me to grasp from what I have read online.

    as said before I am by far not a SQL guy what so ever, I write SQL out of necessity only.

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply