Need to display a column with deleted overlapping dates

  • I want to get rid of the following overlapping records from the table and display the below output. Also, I want to check if these records exists in other fact tables.

    select pcode,pkey, RowEffectiveDate ,rowenddate from dimPwhere pcode='KO18'select pcode,pkey, RowEffectiveDate ,rowenddate from dimPwhere pcode='KO18'
    Current Output:
    Pcode Pkey     RowEffectiveDate           rowenddate
    KO18  502    1900-01-01 00:00:00.000   2017-12-06 23:59:59.997
    KO18  5431   2017-12-07 00:00:00.000   2018-01-29 23:59:59.997
    KO18  6852   2018-01-30 00:00:00.000   2018-02-09 23:59:59.997
    KO18  8861   2018-02-10 00:00:00.000   2018-04-24 23:59:59.997
    KO18  8862   2018-02-10 00:00:00.000   2018-04-25 23:59:59.997
    KO18  11764  2018-04-25 00:00:00.000   2018-04-25 23:59:59.997
    KO18  11797  2018-04-26 00:00:00.000   9999-12-31 00:00:00.000

    Expected output:
    Pcode PKey   RowEffectiveDate      RowEndDate
    KO18  502   1900-01-01 00:00:00.000  2017-12-06 23:59:59.997
    KO18  5431   2017-12-07 00:00:00.000  2018-01-29 23:59:59.997
    KO18  6852   2018-01-30 00:00:00.000  2018-02-09 23:59:59.997
    KO18  8861   2018-02-10 00:00:00.000  2018-04-24 23:59:59.997
    KO18  11764   2018-04-25 00:00:00.000  2018-04-25 23:59:59.997
    KO18  11797   2018-04-26 00:00:00.000  9999-12-31 00:00:00.000

    P.S: RowEffectiveDate should be the next day of RowEndDate

  • It really depends on what kinds of overlaps you have.  It's rarely as simple as two or more records having the same start dates, but I have coded something that assumes that you only have that most simple case.

    WITH CTE AS
    (

    SELECT pcode,pkey, RowEffectiveDate , ROW_NUMBER() OVER(ORDER BY RowEffectiveDate, PKey) AS rn
        FROM dimP
        WHERE pcode='KO18'
    )
    SELECT pcode, pkey, RowEffectiveDate, DATEADD(MICROSECOND, -3, LEAD(RowEffectiveDate) OVER(ORDER BY RowEffectiveDate)) AS RowEndDate
    FROM CTE
    WHERE rn = 1

    You should also be using half-closed intervals instead of (fully) closed intervals.  Specifically, you should not be guesstimating that your end time will be 23:59:59.997.  If someone changes the data type to datetime2, then those estimates will no longer work.  You should be using midnight for both ends of the range, but have the beginning compared using GREATER THAN OR EQUAL TO and the end using LESS THAN.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Hi Allen, thanks for your solution. But the query does not provide me with the solution i need. I get the following output when i run your query. While i am expecting the above one, I do understand that your query might be more effective. 
    Could you please help?
    Also, I am looking forward to using this query to identify related records(to be deleted) in other fact tables too.

    pcode    pkey    RowEffectiveDate        RowEndDate
    KO18     502   1900-01-01 00:00:00.000      NULL

  • drew.allen - Wednesday, June 27, 2018 2:44 PM

    It really depends on what kinds of overlaps you have.  It's rarely as simple as two or more records having the same start dates, but I have coded something that assumes that you only have that most simple case.

    WITH CTE AS
    (

    SELECT pcode,pkey, RowEffectiveDate , ROW_NUMBER() OVER(ORDER BY RowEffectiveDate, PKey) AS rn
        FROM dimP
        WHERE pcode='KO18'
    )
    SELECT pcode, pkey, RowEffectiveDate, DATEADD(MICROSECOND, -3, LEAD(RowEffectiveDate) OVER(ORDER BY RowEffectiveDate)) AS RowEndDate
    FROM CTE
    WHERE rn = 1

    You should also be using half-closed intervals instead of (fully) closed intervals.  Specifically, you should not be guesstimating that your end time will be 23:59:59.997.  If someone changes the data type to datetime2, then those estimates will no longer work.  You should be using midnight for both ends of the range, but have the beginning compared using GREATER THAN OR EQUAL TO and the end using LESS THAN.

    Drew

    I agree with Drew with regard to the nature of your intervals...   as relying on the specific data type might not be a good long-term idea.   However, as your desired output disagrees with Drew's code result, I could take the same approach and a relatively minor change would solve the problem for this specific case of data.   The question is whether the specific circumstance it will solve for is the only kind of overlap your data contains.

    EDIT: Code revised when I realized what was wrong with it:
    CREATE TABLE #dimP (
        pcode char(4),
        pkey int,
        RowEffectiveDate datetime,
        rowenddate datetime
    );
    INSERT INTO #dimP (pcode, pkey, RowEffectiveDate, rowenddate)
        VALUES    ('KO18', 502 , '1900-01-01 00:00:00.000', '2017-12-06 23:59:59.997'),
                ('KO18', 5431 , '2017-12-07 00:00:00.000', '2018-01-29 23:59:59.997'),
                ('KO18', 6852 , '2018-01-30 00:00:00.000', '2018-02-09 23:59:59.997'),
                ('KO18', 8861 , '2018-02-10 00:00:00.000', '2018-04-24 23:59:59.997'),
                ('KO18', 8862 , '2018-02-10 00:00:00.000', '2018-04-25 23:59:59.997'),
                ('KO18', 11764, '2018-04-25 00:00:00.000', '2018-04-25 23:59:59.997'),
                ('KO18', 11797, '2018-04-26 00:00:00.000', '9999-12-31 00:00:00.000');

    SELECT *, ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP;

    WITH CTE AS (

    SELECT pcode,pkey, RowEffectiveDate, rowenddate,
        ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP
    WHERE pcode='KO18'
    )
    SELECT rn, pcode, pkey, RowEffectiveDate, rowenddate
    FROM CTE
    WHERE rn = 1
    ORDER BY rn;

    DROP TABLE #dimP;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sanket.wagh7689 - Wednesday, June 27, 2018 4:35 PM

    Hi Allen, thanks for your solution. But the query does not provide me with the solution i need. I get the following output when i run your query. While i am expecting the above one, I do understand that your query might be more effective. 
    Could you please help?
    Also, I am looking forward to using this query to identify related records(to be deleted) in other fact tables too.

    pcode    pkey    RowEffectiveDate        RowEndDate
    KO18     502   1900-01-01 00:00:00.000      NULL

    My name is Drew.

    As Steve mentioned, the problem was with the OVER clause.  The RowEffectiveDate should have been used as the PARTITION rather than the first in the ORDER BY clause.  If you had provided readily consumable data, I would have been able to test this and fix it before posting it.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks Steve and Drew for all your help. I appreciate your help. You both are geniuses. Thanks again!

  • sgmunson - Thursday, June 28, 2018 6:49 AM

    drew.allen - Wednesday, June 27, 2018 2:44 PM

    It really depends on what kinds of overlaps you have.  It's rarely as simple as two or more records having the same start dates, but I have coded something that assumes that you only have that most simple case.

    WITH CTE AS
    (

    SELECT pcode,pkey, RowEffectiveDate , ROW_NUMBER() OVER(ORDER BY RowEffectiveDate, PKey) AS rn
        FROM dimP
        WHERE pcode='KO18'
    )
    SELECT pcode, pkey, RowEffectiveDate, DATEADD(MICROSECOND, -3, LEAD(RowEffectiveDate) OVER(ORDER BY RowEffectiveDate)) AS RowEndDate
    FROM CTE
    WHERE rn = 1

    You should also be using half-closed intervals instead of (fully) closed intervals.  Specifically, you should not be guesstimating that your end time will be 23:59:59.997.  If someone changes the data type to datetime2, then those estimates will no longer work.  You should be using midnight for both ends of the range, but have the beginning compared using GREATER THAN OR EQUAL TO and the end using LESS THAN.

    Drew

    I agree with Drew with regard to the nature of your intervals...   as relying on the specific data type might not be a good long-term idea.   However, as your desired output disagrees with Drew's code result, I could take the same approach and a relatively minor change would solve the problem for this specific case of data.   The question is whether the specific circumstance it will solve for is the only kind of overlap your data contains.

    EDIT: Code revised when I realized what was wrong with it:
    CREATE TABLE #dimP (
        pcode char(4),
        pkey int,
        RowEffectiveDate datetime,
        rowenddate datetime
    );
    INSERT INTO #dimP (pcode, pkey, RowEffectiveDate, rowenddate)
        VALUES    ('KO18', 502 , '1900-01-01 00:00:00.000', '2017-12-06 23:59:59.997'),
                ('KO18', 5431 , '2017-12-07 00:00:00.000', '2018-01-29 23:59:59.997'),
                ('KO18', 6852 , '2018-01-30 00:00:00.000', '2018-02-09 23:59:59.997'),
                ('KO18', 8861 , '2018-02-10 00:00:00.000', '2018-04-24 23:59:59.997'),
                ('KO18', 8862 , '2018-02-10 00:00:00.000', '2018-04-25 23:59:59.997'),
                ('KO18', 11764, '2018-04-25 00:00:00.000', '2018-04-25 23:59:59.997'),
                ('KO18', 11797, '2018-04-26 00:00:00.000', '9999-12-31 00:00:00.000');

    SELECT *, ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP;

    WITH CTE AS (

    SELECT pcode,pkey, RowEffectiveDate, rowenddate,
        ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP
    WHERE pcode='KO18'
    )
    SELECT rn, pcode, pkey, RowEffectiveDate, rowenddate
    FROM CTE
    WHERE rn = 1
    ORDER BY rn;

    DROP TABLE #dimP;

    Thank you so much Sir!

  • drew.allen - Thursday, June 28, 2018 7:25 AM

    sanket.wagh7689 - Wednesday, June 27, 2018 4:35 PM

    Hi Allen, thanks for your solution. But the query does not provide me with the solution i need. I get the following output when i run your query. While i am expecting the above one, I do understand that your query might be more effective. 
    Could you please help?
    Also, I am looking forward to using this query to identify related records(to be deleted) in other fact tables too.

    pcode    pkey    RowEffectiveDate        RowEndDate
    KO18     502   1900-01-01 00:00:00.000      NULL

    My name is Drew.

    As Steve mentioned, the problem was with the OVER clause.  The RowEffectiveDate should have been used as the PARTITION rather than the first in the ORDER BY clause.  If you had provided readily consumable data, I would have been able to test this and fix it before posting it.

    Drew

    Sorry for mentioning your last name sir. Thank you Drew. I appreciate your help on this one!

  • sanket.wagh7689 - Thursday, June 28, 2018 10:05 AM

    Thanks Steve and Drew for all your help. I appreciate your help. You both are geniuses. Thanks again!

    You;re very welcome!

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Hi Steve,

    Sorry to get back to a question related to this discussion.
    I observed that this query does not identify all the overlapping records, for instance in the above list of records it could identify the overlapping dates and as mentioned in the question the RowEffectiveDate should be the next day of RowEndDate, but it is kind off difficult for this particular record. It will just remove 2 records 8903 and 8904 from the table and will keep all other records. I have marked the overlapping and wrong records with bold and italic.
    Any help would be appreciated.
    The query that i am using is mentioned below:

    CREATE TABLE #dimP (
      pcode char(4),
      pkey int,
      RowEffectiveDate datetime,
      rowenddate datetime
    );
    INSERT INTO #dimP (pcode, pkey, RowEffectiveDate, rowenddate)
      VALUES 
                 ('PL56',8855,'1900-01-01 00:00:00.000','2018-02-13 23:59:59.997'),
                 ('PL56',8856,'2018-02-09 15:09:52.000','2018-02-14 23:59:59.997'),
                 ('PL56',8903,'2018-02-09 15:09:52.000','2018-02-15 23:59:59.997'),
                 ('PL56',8904,'2018-02-09 15:09:52.000','2018-02-16 23:59:59.997'),
                 ('PL56',8935,'2018-02-14 00:00:00.000','2018-02-17 23:59:59.997'),
                 ('PL56',8944,'2018-02-15 00:00:00.000','2018-02-18 23:59:59.997'),
                 ('PL56',8955,'2018-02-16 00:00:00.000','2018-03-02 23:59:59.997'),
                 ('PL56',8965,'2018-02-17 00:00:00.000','2018-03-03 23:59:59.997'),
                 ('PL56',8972,'2018-02-18 00:00:00.000','2018-03-04 23:59:59.997'),
                 ('PL56',8973,'2018-02-19 00:00:00.000','2018-03-05 23:59:59.997'),
                 ('PL56',9114,'2018-03-03 00:00:00.000','2018-04-24 23:59:59.997'),
                 ('PL56',9115,'2018-03-04 00:00:00.000','2018-04-25 23:59:59.997'),
                 ('PL56',9136,'2018-03-05 00:00:00.000','2018-04-26 23:59:59.997'),
                 ('PL56',9140,'2018-03-06 00:00:00.000','2018-04-29 23:59:59.997'),
                 ('PL56',11597,'2018-04-25 00:00:00.000','2018-04-25 23:59:59.997'),
                 ('PL56',11820,'2018-04-26 00:00:00.000','2018-04-26 23:59:59.997'),
                 ('PL56',11837,'2018-04-27 00:00:00.000','2018-04-29 23:59:59.997'),
                 ('PL56',11839,'2018-04-30 00:00:00.000','2018-05-22 23:59:59.997'),
                 ('PL56',12372,'2018-05-23 00:00:00.000','9999-12-31 00:00:00.000');

    SELECT *, ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP;

    WITH CTE AS (

    SELECT pcode,pkey, RowEffectiveDate, rowenddate,
      ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP
    WHERE pcode='KO18'
    )
    SELECT rn, pcode, pkey, RowEffectiveDate, rowenddate
    FROM CTE
    WHERE rn = 1
    ORDER BY rn;

    DROP TABLE #dimP;

    Current output:

    pcode                   pkey         RowEffectiveDate                  rowenddate
    PL5688551900-01-01 00:00:00.0002018-02-13 23:59:59.997
    PL5688562018-02-09 15:09:52.0002018-02-14 23:59:59.997
    PL5689032018-02-09 15:09:52.0002018-02-15 23:59:59.997
    PL5689042018-02-09 15:09:52.0002018-02-16 23:59:59.997
    PL5689352018-02-14 00:00:00.0002018-02-17 23:59:59.997
    PL5689442018-02-15 00:00:00.0002018-02-18 23:59:59.997
    PL5689552018-02-16 00:00:00.0002018-03-02 23:59:59.997
    PL5689652018-02-17 00:00:00.0002018-03-03 23:59:59.997
    PL5689722018-02-18 00:00:00.0002018-03-04 23:59:59.997
    PL5689732018-02-19 00:00:00.0002018-03-05 23:59:59.997
    PL5691142018-03-03 00:00:00.0002018-04-24 23:59:59.997
    PL5691152018-03-04 00:00:00.0002018-04-25 23:59:59.997
    PL5691362018-03-05 00:00:00.0002018-04-26 23:59:59.997
    PL5691402018-03-06 00:00:00.0002018-04-29 23:59:59.997
    PL56115972018-04-25 00:00:00.0002018-04-25 23:59:59.997
    PL56118202018-04-26 00:00:00.0002018-04-26 23:59:59.997
    PL56118372018-04-27 00:00:00.0002018-04-29 23:59:59.997
    PL56118392018-04-30 00:00:00.0002018-05-22 23:59:59.997
    PL56123722018-05-23 00:00:00.0009999-12-31 00:00:00.000

     
    Expected Output should be:

    pcodepkeyRowEffectiveDaterowenddate
    PL5688551900-01-01 00:00:00.0002018-02-13 23:59:59.997
    PL5689352018-02-14 00:00:00.0002018-02-17 23:59:59.997
    PL5689722018-02-18 00:00:00.0002018-03-04 23:59:59.997
    PL5691362018-03-05 00:00:00.0002018-04-26 23:59:59.997
    PL56118372018-04-27 00:00:00.0002018-04-29 23:59:59.997
    PL56118392018-04-30 00:00:00.0002018-05-22 23:59:59.997
    PL56123722018-05-23 00:00:00.0009999-12-31 00:00:00.000

  • sanket.wagh7689 - Monday, July 2, 2018 12:21 PM

    Hi Steve,

    Sorry to get back to a question related to this discussion.
    I observed that this query does not identify all the overlapping records, for instance in the above list of records it could identify the overlapping dates and as mentioned in the question the RowEffectiveDate should be the next day of RowEndDate, but it is kind off difficult for this particular record. It will just remove 2 records 8903 and 8904 from the table and will keep all other records. I have marked the overlapping and wrong records with bold and italic.
    Any help would be appreciated.

    pcode                   pkey         RowEffectiveDate                  rowenddate
    PL5688551900-01-01 00:00:00.0002018-02-13 23:59:59.997
    PL5688562018-02-09 15:09:52.0002018-02-14 23:59:59.997
    PL5689032018-02-09 15:09:52.0002018-02-15 23:59:59.997
    PL5689042018-02-09 15:09:52.0002018-02-16 23:59:59.997
    PL5689352018-02-14 00:00:00.0002018-02-17 23:59:59.997
    PL5689442018-02-15 00:00:00.0002018-02-18 23:59:59.997
    PL5689552018-02-16 00:00:00.0002018-03-02 23:59:59.997
    PL5689652018-02-17 00:00:00.0002018-03-03 23:59:59.997
    PL5689722018-02-18 00:00:00.0002018-03-04 23:59:59.997
    PL5689732018-02-19 00:00:00.0002018-03-05 23:59:59.997
    PL5691142018-03-03 00:00:00.0002018-04-24 23:59:59.997
    PL5691152018-03-04 00:00:00.0002018-04-25 23:59:59.997
    PL5691362018-03-05 00:00:00.0002018-04-26 23:59:59.997
    PL5691402018-03-06 00:00:00.0002018-04-29 23:59:59.997
    PL56115972018-04-25 00:00:00.0002018-04-25 23:59:59.997
    PL56118202018-04-26 00:00:00.0002018-04-26 23:59:59.997
    PL56118372018-04-27 00:00:00.0002018-04-29 23:59:59.997
    PL56118392018-04-30 00:00:00.0002018-05-22 23:59:59.997
    PL56123722018-05-23 00:00:00.0009999-12-31 00:00:00.000

     

    As I said earlier in this thread, it really depends on what kinds of overlaps you are seeing.  I specifically mentioned that my approach only worked where the start dates were the same.

    The more information that you can provide, the more likely people are to work on it, and the more likely it is to actually work the way that you want it to.  For example, you have given us your starting data, but not your expected results.

    Furthermore, you haven't made your data easily consumable.  Data should ALWAYS be present with a script to create a temporary table or declare a table variable and also an insert statement to insert the data into the temporary table/table variable.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Monday, July 2, 2018 12:59 PM

    sanket.wagh7689 - Monday, July 2, 2018 12:21 PM

    Hi Steve,

    Sorry to get back to a question related to this discussion.
    I observed that this query does not identify all the overlapping records, for instance in the above list of records it could identify the overlapping dates and as mentioned in the question the RowEffectiveDate should be the next day of RowEndDate, but it is kind off difficult for this particular record. It will just remove 2 records 8903 and 8904 from the table and will keep all other records. I have marked the overlapping and wrong records with bold and italic.
    Any help would be appreciated.

    pcode                   pkey         RowEffectiveDate                  rowenddate
    PL5688551900-01-01 00:00:00.0002018-02-13 23:59:59.997
    PL5688562018-02-09 15:09:52.0002018-02-14 23:59:59.997
    PL5689032018-02-09 15:09:52.0002018-02-15 23:59:59.997
    PL5689042018-02-09 15:09:52.0002018-02-16 23:59:59.997
    PL5689352018-02-14 00:00:00.0002018-02-17 23:59:59.997
    PL5689442018-02-15 00:00:00.0002018-02-18 23:59:59.997
    PL5689552018-02-16 00:00:00.0002018-03-02 23:59:59.997
    PL5689652018-02-17 00:00:00.0002018-03-03 23:59:59.997
    PL5689722018-02-18 00:00:00.0002018-03-04 23:59:59.997
    PL5689732018-02-19 00:00:00.0002018-03-05 23:59:59.997
    PL5691142018-03-03 00:00:00.0002018-04-24 23:59:59.997
    PL5691152018-03-04 00:00:00.0002018-04-25 23:59:59.997
    PL5691362018-03-05 00:00:00.0002018-04-26 23:59:59.997
    PL5691402018-03-06 00:00:00.0002018-04-29 23:59:59.997
    PL56115972018-04-25 00:00:00.0002018-04-25 23:59:59.997
    PL56118202018-04-26 00:00:00.0002018-04-26 23:59:59.997
    PL56118372018-04-27 00:00:00.0002018-04-29 23:59:59.997
    PL56118392018-04-30 00:00:00.0002018-05-22 23:59:59.997
    PL56123722018-05-23 00:00:00.0009999-12-31 00:00:00.000

     

    As I said earlier in this thread, it really depends on what kinds of overlaps you are seeing.  I specifically mentioned that my approach only worked where the start dates were the same.

    The more information that you can provide, the more likely people are to work on it, and the more likely it is to actually work the way that you want it to.  For example, you have given us your starting data, but not your expected results.

    Furthermore, you haven't made your data easily consumable.  Data should ALWAYS be present with a script to create a temporary table or declare a table variable and also an insert statement to insert the data into the temporary table/table variable.

    Drew

    Hi Drew,
    Sorry to have posted not according to the standards of this discussion. I am learning and will improve.
    I have edited my post to mention all the details required.

  • sgmunson - Thursday, June 28, 2018 10:42 AM

    sanket.wagh7689 - Thursday, June 28, 2018 10:05 AM

    Thanks Steve and Drew for all your help. I appreciate your help. You both are geniuses. Thanks again!

    You;re very welcome!

    Hi Steve,
    Could you please help me with this one?
    I had some questions with the query that you had helped me out with.

  • Maybe this does help  :

    Two rows A and B overlap when :
    (A.stop > B.start   AND   B.stop > A.start)

    But there is plenty of situations which maybe require different solutions.
    Four overlapping situations. (Events are ordered by time occuring).
    1 A starts B starts A stops B stops.
    2 A starts B starts B stops A stops.
    3 B starts A starts A stops B stops.
    4 B starts A starts B stops A stops.

    You have to decide what to do with each situation. When overlapping, take the row which starts as the first row en discard the other, or go for the lowest starttime and the highest stop time.

    Ben

  • Here's a recursive CTE that returns the values you want.  Seems you wants the contiguous keys and ignore all the noise in between


    Drop table if exists #dimP
    go

    CREATE TABLE #dimP (
      pcode char(4),
      pkey int,
      RowEffectiveDate datetime,
      rowenddate datetime
    );
    INSERT INTO #dimP (pcode, pkey, RowEffectiveDate, rowenddate)
    VALUES
         ('PL56',8855,'1900-01-01 00:00:00.000','2018-02-13 23:59:59.997'),
         ('PL56',8856,'2018-02-09 15:09:52.000','2018-02-14 23:59:59.997'),
         ('PL56',8903,'2018-02-09 15:09:52.000','2018-02-15 23:59:59.997'),
         ('PL56',8904,'2018-02-09 15:09:52.000','2018-02-16 23:59:59.997'),
         ('PL56',8935,'2018-02-14 00:00:00.000','2018-02-17 23:59:59.997'),
         ('PL56',8944,'2018-02-15 00:00:00.000','2018-02-18 23:59:59.997'),
         ('PL56',8955,'2018-02-16 00:00:00.000','2018-03-02 23:59:59.997'),
         ('PL56',8965,'2018-02-17 00:00:00.000','2018-03-03 23:59:59.997'),
         ('PL56',8972,'2018-02-18 00:00:00.000','2018-03-04 23:59:59.997'),
         ('PL56',8973,'2018-02-19 00:00:00.000','2018-03-05 23:59:59.997'),
         ('PL56',9114,'2018-03-03 00:00:00.000','2018-04-24 23:59:59.997'),
         ('PL56',9115,'2018-03-04 00:00:00.000','2018-04-25 23:59:59.997'),
         ('PL56',9136,'2018-03-05 00:00:00.000','2018-04-26 23:59:59.997'),
         ('PL56',9140,'2018-03-06 00:00:00.000','2018-04-29 23:59:59.997'),
         ('PL56',11597,'2018-04-25 00:00:00.000','2018-04-25 23:59:59.997'),
         ('PL56',11820,'2018-04-26 00:00:00.000','2018-04-26 23:59:59.997'),
         ('PL56',11837,'2018-04-27 00:00:00.000','2018-04-29 23:59:59.997'),
         ('PL56',11839,'2018-04-30 00:00:00.000','2018-05-22 23:59:59.997'),
         ('PL56',12372,'2018-05-23 00:00:00.000','9999-12-31 00:00:00.000');
    INSERT INTO #dimP (pcode, pkey, RowEffectiveDate, rowenddate)
      VALUES  ('KO18', 502 , '1900-01-01 00:00:00.000', '2017-12-06 23:59:59.997'),
        ('KO18', 5431 , '2017-12-07 00:00:00.000', '2018-01-29 23:59:59.997'),
        ('KO18', 6852 , '2018-01-30 00:00:00.000', '2018-02-09 23:59:59.997'),
        ('KO18', 8861 , '2018-02-10 00:00:00.000', '2018-04-24 23:59:59.997'),
        ('KO18', 8862 , '2018-02-10 00:00:00.000', '2018-04-25 23:59:59.997'),
        ('KO18', 11764, '2018-04-25 00:00:00.000', '2018-04-25 23:59:59.997'),
        ('KO18', 11797, '2018-04-26 00:00:00.000', '9999-12-31 00:00:00.000');

    /*
    SELECT *, ROW_NUMBER() OVER(PARTITION BY RowEffectiveDate ORDER BY rowenddate, PKey) AS rn
    FROM #dimP
    order by 1, 2
    */

    ;WITH CTE AS (
    select pcode,pkey, RowEffectiveDate, rowenddate,RowEffectiveDateCalc,RowEndDateCalc
    from (
             SELECT pcode,pkey, RowEffectiveDate, rowenddate
                        ,cast(RowEffectiveDate as date) RowEffectiveDateCalc
                        ,cast(rowenddate as date) RowEndDateCalc
                        , ROW_NUMBER() OVER(PARTITION BY pcode ORDER BY RowEffectiveDate, rowenddate, PKey) AS rn
             FROM #dimP
    --         where PCode = 'PL56'
            ) X
        where rn = 1
    union all
        SELECT NextRow.PCode, NextRow.Pkey, NextRow.RowEffectiveDate, NextRow.rowenddate
                ,cast(NextRow.RowEffectiveDate as date) RowEffectiveDateCalc
                        ,cast(NextRow.rowenddate as date) RowEndDateCalc
        FROM CTE CurRow
            join #dimP NextRow
            on CurRow.PCode = NextRow.PCode
            and DateDiff(day, CurRow.RowEndDateCalc, cast(NextRow.RowEffectiveDate as date)) = 1
    )

    select * from CTE
    order by 1, 2

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 15 posts - 1 through 15 (of 34 total)

You must be logged in to reply to this topic. Login to reply