Recursive Query

  • Hi All,

    Thanks in advance for those who will contribute.

    I have a recordset of the following

    PlanAccountKeyCPNY_CODE EXTL_ACCT_NMBR PLAN_NMBR CATE_NMBR STRT_DATE END_DATE

    519426 001 001065007745823101491 2 2003090420040903

    519427 001 001065007745823101491 2 2004090320030904

    519428 001 001065007745823101491 2 2004092220040903

    I want to write a query to return TRUE if the END_DATE of next record is the STRT_DATE of the previous record. So the above example will return TRUE. But the following will return FALSE

    PlanAccountKeyCPNY_CODE EXTL_ACCT_NMBR PLAN_NMBR CATE_NMBR STRT_DATE END_DATE

    519426 001 001065007745823101491 2 2003090420040903

    519427 001 001065007745823101491 2 2004090320040922

    519428 001 001065007745823101491 2 2004092220110309

    The table is name tblPlanAccount.

    Somehow, I am planning to use CTE but not sure how to go about of doing this.

    Thanks,

  • You can use a cte for this.  I'm assuming that the EXTL_ACCT_NMBR is the key.

     

    declare @t table (

    PlanAccountKey int,

    CPNY_CODE nvarchar(3),

    EXTL_ACCT_NMBR nvarchar(30),

    PLAN_NMBR int,

    CATE_NMBR tinyint,

    STRT_DATE datetime,

    END_DATE Datetime)

     

    insert into @t

    select 519426,'001','001065007745823',101491, 2, '2003-09-04','2004-09-03' union all

    select 519427,'001','001065007745823',101491, 2, '2004-09-03','2003-09-04' union all

    select 519428,'001','001065007745823',101491, 2, '2004-09-22','2004-09-03' union all

     

    select 519430,'001','001065007745825',101492, 2, '2003-09-04','2004-09-03' union all

    select 519431,'001','001065007745825',101492, 2, '2004-09-03','2003-09-22' union all

    select 519432,'001','001065007745825',101492, 2, '2004-09-22','2004-03-09'

     

    ;with cte as

    (select PlanAccountKey, CPNY_CODE, EXTL_ACCT_NMBR, PLAN_NMBR,

                                    CATE_NMBR, STRT_DATE, END_DATE,

                                    ROW_NUMBER() over (partition by EXTL_ACCT_NMBR order by PlanAccountKey) RowNum

       from @t)

      

    select CurRow.EXTL_ACCT_NMBR, CurRow.Strt_Date, CurRow.End_Date, NextRow.End_Date NextStartDate,

                       PrevRow.End_Date PrevEndDate

      from cte CurRow

                    inner join cte NextRow

                                    on CurRow.EXTL_ACCT_NMBR = NextRow.EXTL_ACCT_NMBR

                       and CurRow.RowNum = NextRow.RowNum - 1

                       and CurRow.STRT_DATE = NextRow.END_DATE

        inner join cte PrevRow

                                    on CurRow.EXTL_ACCT_NMBR = PrevRow.EXTL_ACCT_NMBR

                       and CurRow.RowNum = PrevRow.RowNum + 1

                       and CurRow.STRT_DATE = PrevRow.END_DATE

                   

     

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Try this...

    SELECT t1.PlanAccountKey, CASE t1.STRT_DATE WHEN t2.END_DATE THEN 'TRUE' ELSE 'FALSE' END AS op

    FROM @tblPlanAccount t1

    INNER JOIN @tblPlanAccount t2

    ON t1.PlanAccountKey = t2.PlanAccountKey-1

  • rupex2101 (4/17/2012)


    Try this...

    SELECT t1.PlanAccountKey, CASE t1.STRT_DATE WHEN t2.END_DATE THEN 'TRUE' ELSE 'FALSE' END AS op

    FROM @tblPlanAccount t1

    INNER JOIN @tblPlanAccount t2

    ON t1.PlanAccountKey = t2.PlanAccountKey-1

    But if PlanAccountKey has gaps then Mike's solution would be great!

  • I'll confess I'm a little confused by your setup data because you've got a start date that's greater than its corresponding end date. Surely that can't be right. So let's try some additional data that you can compare against both solutions to see which you prefer.

    declare @t table (

    PlanAccountKey int,

    CPNY_CODE nvarchar(3),

    EXTL_ACCT_NMBR nvarchar(30),

    PLAN_NMBR int,

    CATE_NMBR tinyint,

    STRT_DATE datetime,

    END_DATE Datetime)

    insert into @t

    select 519426,'001','001065007745823',101491, 2, '2003-09-04','2004-09-03' union all

    select 519427,'001','001065007745823',101491, 2, '2004-09-03','2004-09-04' union all

    select 519428,'001','001065007745823',101491, 2, '2004-09-22','2004-09-23' union all

    select 519430,'001','001065007745825',101492, 2, '2003-09-04','2004-09-03' union all

    select 519431,'001','001065007745825',101492, 2, '2004-09-03','2004-09-22' union all

    select 519432,'001','001065007745825',101492, 2, '2004-09-22','2004-09-23' union all

    select 519433,'001','001065007745827',101493, 2, '2003-09-04','2004-09-03' union all

    select 519434,'001','001065007745827',101493, 2, '2004-09-03','2003-09-04' union all

    select 519435,'001','001065007745827',101493, 2, '2004-09-22','2004-09-03' union all

    select 519436,'001','001065007745828',101494, 2, '2003-09-04','2004-09-03' union all

    select 519437,'001','001065007745828',101494, 2, '2004-09-03','2003-09-22' union all

    select 519437,'001','001065007745828',101494, 2, '2004-09-22','2004-03-09'

    Now assuming you're interesting in ignoring the earliest start date, which cannot of course have a matching end date on a prior record, I'd suggest you take a look at something along the lines of the following.

    ;WITH Dates AS (

    SELECT PlanAccountKey, CPNY_CODE, EXTL_ACCT_NMBR, PLAN_NMBR

    ,CATE_NMBR, STRT_DATE, END_DATE

    ,(SELECT TOP 1 END_DATE

    FROM @t t2

    WHERE t1.strt_date = t2.end_date and t1.CPNY_CODE = t2.CPNY_CODE and

    t1.EXTL_ACCT_NMBR = t2.EXTL_ACCT_NMBR

    ) As PriorEndDate

    ,ROW_NUMBER() OVER (PARTITION BY CPNY_CODE, EXTL_ACCT_NMBR ORDER BY CPNY_CODE, EXTL_ACCT_NMBR, STRT_DATE) AS rk

    FROM @t t1)

    SELECT PlanAccountKey, CPNY_CODE, EXTL_ACCT_NMBR, PLAN_NMBR

    ,CATE_NMBR, STRT_DATE, END_DATE

    ,CASE WHEN STRT_DATE = PriorEndDate THEN 'TRUE' ELSE 'FALSE' END As Flag

    FROM Dates

    WHERE rk <> 1

    I believe you'll find that there's a massive difference in query plan costs because my way avoids all the JOINS of the table on itself. That might actually drive a better performing query of course, or it might not. You'd need to try it on your actual data.

    It might even be a little easier on the eyes, depending on your point of view.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I realise that this is a 2005 T-SQL forum, but in T-SQL 2012 this can be accomplished by the new LAG() and LEAD() analytic functions http://msdn.microsoft.com/en-us/library/hh231256.aspx and http://msdn.microsoft.com/en-us/library/hh213125.aspx respectively.

    Take a look at the links provided by Hakan in this thread http://www.sqlservercentral.com/Forums/Topic630262-392-1.aspx#bm636044 - it contains a link to the Connect item raised by Itzik Ben-Gan and Sujata Mehta requesting the LAG and LEAD enhancements (amongst other things). It gives several alternative methods of simulating the LAG/LEAD functionality using T-SQL 2005. The method Dwain has shown is one of those included in the document.

    Dave

  • Gee whiz Dave, I thought I came up with that on my own! 😉

    Actually I'm itching to try out LAG and LEAD, if only I had a 2012 database to play in.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Dwain,

    Why re-invent the wheel, I know none of my code has been truly innovative, but has been an adaptation of someone else's ideas. (Oh, there was that time I was talking to Tim Berners Lee about an idea I had, but that's another story) 😉

    I had a problem recently that needed the LAG()/LEAD() functionality (I had used it in Oracle, so I was aware of the concept using SQL) but I can only use T-SQL 2005. I used the third method (using row_number()) from the document, similar to that used above by Mike. I have to admit I haven't tried the other two solutions to check which performs best in my situation. I will do, but I'm at a proof of concept stage at the moment and getting a solution that returned the correct results in an acceptable time frame is good enough for now.

    In my opinion, it's healthy to adopt/adapt code as long as you make the effort to understand what it does. I'm always looking for code snippets to adopt/adapt. I thought the document in presenting 3 alternatives for this functionality was useful in understanding the concepts and just like the many threads in these forums (or is that fora) show there are several ways to accomplish the same task, and depending on your environment, some are better than others.

    Dave

  • I'm with you on not reinventing the wheel. I reuse code all the time.

    I was just having a bit of fun when I said I'd come up with it on my own. While I did (in other words, I didn't Google up a solution), having a code snippet handy to solve a problem like this is an excellent means of speeding up your delivery.

    One problem I've found with Googled solutions is that you're never really sure if you're getting it from a good source. Too many people I've known use the first solution they find and that can be a very, very bad thing. There are some obvious good choices (like the article you quoted) based on knowing who the author is. Unfortunately, that isn't always a widespread known.

    My feeling is that, if you know the techniques whether you have an available snippet or not, you should still think through the data transformation and at least compare two alternatives. While the first you hit on may work good enough, you should always go with the more scalable solution when you have alternatives. Wait until your tables grow to be 20,000,000 rows or more and you'll appreciate this advice.

    Just as an example, we have one table in a system we support that has the 20M row issue. There's a couple of common questions people ask about data in this table that can be answered by joining the table on itself. That is simply a bad idea. I have about 3 alternatives to that approach that can be used in a similar manner. And generally when this is needed in something that is going to be run repeatedly (e.g., a report) I test each of the alternatives until I'm satisfied I've picked the one that performs best. If I don't, you'd be surprised how easily a bad choice can bring the system to its knees.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Dwain,

    I agree with everything you say. I realised you were jesting re the originality of the code. I've never met Tim Berners Lee either.

    Like you, I hardly ever trust the first hit returned by Google. If several hits return the same idea, then possibly I'll accept its validity, unless its from a name I have in my 'trusted list', then I may be a little less circumspect.

    If my post came across as overly serious or in anyway questioning your ability/integrity that wasn't my intention, as I appreciate your contributions to the threads.

    In my particular example the self join was on a small table (about 10K records), in an isolated system, well away from any production servers.

    Dave

  • Dave,

    No offense taken from your post.

    Google up split string and check how many times you see the classic looping solution. Baaaad choice!

    So the most frequent result is not necessarily the best choice either (Jeff Moden has posted one for this case).

    Using a trusted list is a good idea. So far my circle of trust is not large. No one should take offense at that (I just haven't been around in the SQL world long enough to build it up).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply