UDF not working

  • Hi everyone,

    I am trying to create a UDF to apply to a dataset. The code for it is below. I think the issue is in the while loop. I am not sure the while loop is required when I used a recursive CTE. When I take the while loop out my function breaks. I am currently testing it on a HL account to ensure it works then will apply to a full dataset. I have taken the code from the function outside of it and run it in steps over different lags and it works (that is what makes me think the issue is in the while loop).

    Use sand;

    Alter Function UDF_HL_clean;
    ( @H_account NVCHAR(255) )

    Returns @terms table (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Date,
    expdate Date,
    term INT,
    Condit VARCHAR(1)
    )

    As
    Begin
    Declare @udftemptable TABLE (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Date,
    expdate Date,
    term INT,
    Condit VARCHAR(1)
    )

    insert into @terms select * from sand.dbo.HL_function_test where H_account = @H_account
    declare @lag INT = 0, @loopmax INT

    SET @loopmax = (Select count(*) FROM (select H_account, expdate, lead(expdate, 1) Over (partition by H_account, expdate Order by H_account, seq desc) as lag_exp From dbo.homeloans) a
    where lag_exp = expdate
    AND H_account = @H_account
    Group by H_account, expdate
    )

    while @lag <= @loopmax
    Begin
    with laggedrates (H_account, seq, lag, lageffectivedate, lagexpdate, expdate, lagterm, condit) AS
    (
    select H_account,
    seq,
    @lag as lag,
    lag(cardedratedate, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    lag(expdate, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    expdate,
    lag(term, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    case when lead(expdate,1) over (partition by H_account, expdate order by H_account, seq desc) = expdate THEN 'F' Else Condition END
    from @terms

    Union all

    select a.H_account,
    a.seq,
    b.lag + 1 as lag,
    lag(a.cardedratedate, b.lag + 1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) as lageffectivedate,
    lag(a.expdate, b.lag + 1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) as lagexpdate,
    a.expdate,
    case when lag(a.expdate,b.lag+1) over (partition by a.H_account order by a.H_account, a.seq desc) = a.expdate
    THEN lag(a.term, b.lag +1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc)
    else a.term end
    as lagterm
    case when lead(a.expdate,1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) = a.expdate THEN 'F' Else a.Condit END AS Condit
    from @terms a
    inner join laggedrates b
    on a.expdate = b.lagexpdate
    where @lag <=loopmax and b.lagexpdate is NULL
    )

    insert into @udftemptable Select H_account, seq, lageffectivedate, expdate, lagterm, condit
    from laggedrates option (maxrecursion 0)

    delete from @terms
    Insert into @terms select * from @udftemptable
    deleted from udftemptable
    set @lag = @lag + 1
    end
    return
    end
    ;

    Note: I apologise if there are tpyos, I had to manually type that out and change the variable names due to data security.

    There is an issue in my data where a Home loan term can be split across seq when they are part of the same term. This causes the effective date to change to the date of the split and term (date difference between effective date and expiry date) to become no standard term (8 months instead of 12 months).

    Can anyone help me get this function working (probably an impossible task with out the data to trial and error on) or help me get rid of the while loop so I can test the function that way.

    Thank you everyone

  • could you supply some example representative data and what your expected outcome is?

    ***The first step is always the hardest *******

  • Not only some representative data along with the expected output but how about a description of what the function is supposed to do, please.

    I would also get into the habit of documenting your code because even you won't remember what you did or why six months from now and the idea of of "well, just read the code to find out" is just an excuse fore being lazy. 😉

    We'd also need some data from and embedded objects such as sand.dbo.HL_function_test and it would really be nice if that type of information were in a readily consumable format.  See the article at the first link in my signature line below for one of the multiple ways to do that.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, one of the best things I ever heard about comments is this:

    Comments don't exist to describe what the code is doing.    They exist to describe what the code is trying to do.

    Obviously if the code is working correctly and efficiently, it will rarely be looked at.     But when there is a problem, it is much faster to have a clear statement of the purpose of the code as opposed to having to intuit its purpose.   This benefits not only other developers but also your future self.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline wrote:

    Jeff, one of the best things I ever heard about comments is this:

    Comments don't exist to describe what the code is doing.    They exist to describe what the code is trying to do.

    Obviously if the code is working correctly and efficiently, it will rarely be looked at.     But when there is a problem, it is much faster to have a clear statement of the purpose of the code as opposed to having to intuit its purpose.   This benefits not only other developers but also your future self.

    BWAAA-HAAA!!!!!  Yep.  I certainly agree with that but that's definitely not the only use.  People frequently crab about things like code reuse.  With a name like "UDF_HL_clean", it think that even those with the proper esoteric knowledge of what an "HL" might be will have trouble determining what the function does.  A simple flower box with a purpose and simple usage example would go a long way in answering the question of "can I use this function to do what I need to do" would go a long way.

    Of course, then there's code like this function.  It might work perfectly functionally speaking but it has so many poor performance flares in it that it's begging to be helped.  What's really bad is the OP is in the process of writing it and just assumes that everyone knows what it does or has the time to figure it out.   Never mind a "future benefit", the benefit would be realized now.  😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    The Dixie Flatline wrote:

    Jeff, one of the best things I ever heard about comments is this:

    Comments don't exist to describe what the code is doing.    They exist to describe what the code is trying to do.

    Obviously if the code is working correctly and efficiently, it will rarely be looked at.     But when there is a problem, it is much faster to have a clear statement of the purpose of the code as opposed to having to intuit its purpose.   This benefits not only other developers but also your future self.

    BWAAA-HAAA!!!!!  Yep.  I certainly agree with that but that's definitely not the only use.  People frequently crab about things like code reuse.  With a name like "UDF_HL_clean", it think that even those with the proper esoteric knowledge of what an "HL" might be will have trouble determining what the function does.  A simple flower box with a purpose and simple usage example would go a long way in answering the question of "can I use this function to do what I need to do" would go a long way.

    Of course, then there's code like this function.  It might work perfectly functionally speaking but it has so many poor performance flares in it that it's begging to be helped.  What's really bad is the OP is in the process of writing it and just assumes that everyone knows what it does or has the time to figure it out.   Never mind a "future benefit", the benefit would be realized now.  😀

    Let's not forget that a little bit of formatting also helps - a lot.  It is also much easier to document with a simple snippet in those more complex areas of the code to help identify the purpose of that calculation or expression.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Ok apologies I am new to this. On another note how do you find a tables details, like column names and data type (for example in SAS you have proc contents)?

    Drop table if exists #data;
    create table #data
    (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Datetime,
    expdate Datetime,
    term INT,
    Condit VARCHAR(1)
    )

    --===== All Inserts into the IDENTITY column
    SET IDENTITY_INSERT #data ON
    --===== Insert the test data into the test table
    INSERT INTO #data
    (H_account, seq, Cardedratedate, expdate, term, Condit )
    SELECT 'eee555ww325','1','2019-04-02 00:00:00.000','2019-10-02 00:00:00.000','6','F' UNION ALL
    SELECT 'eee555ww325','2','2017-04-01 00:00:00.000','2019-04-01 00:00:00.000','24','F' UNION ALL
    SELECT 'eee555ww325','3','2015-03-28 00:00:00.000','2017-03-31 00:00:00.000','24','F' UNION ALL
    SELECT 'eee555ww325','7','2014-05-12 00:00:00.000','2015-03-13 00:00:00.000','10','D' UNION ALL
    SELECT 'eee555ww325','8','2014-04-11 00:00:00.000','2015-03-13 00:00:00.000','11','D' UNION ALL
    SELECT 'eee555ww325','5','2014-05-23 00:00:00.000','2015-03-13 00:00:00.000','10','F' UNION ALL
    SELECT 'eee555ww325','6','2014-05-13 00:00:00.000','2015-03-13 00:00:00.000','10','D' UNION ALL
    SELECT 'eee555ww325','9','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','D' UNION ALL
    SELECT 'eee555ww325','10','2012-10-01 00:00:00.000','2017-10-01 00:00:00.000','60','D'
    --===== Set the identity insert back to normal
    SET IDENTITY_INSERT #data ON

    If you order by seq the data will make more sense (I purposefully unordered it because it's not ordered in my data).

    Seq  5-9 are all part of the same term. But the customer has done something to their loan that has cause it to split across seq and there for the terms are non standard. A standard HL term is 6,12,24,36,48,60 months. What I want the data to look like is:

    Drop table if exists #data_I_want;
    create table #data_I_want
    (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Datetime,
    expdate Datetime,
    term INT,
    Condit VARCHAR(1)
    )

    --===== All Inserts into the IDENTITY column
    SET IDENTITY_INSERT #data_I_want ON
    --===== Insert the test data into the test table
    INSERT INTO #data
    (H_account, seq, Cardedratedate, expdate, term, Condit )
    SELECT 'eee555ww325','1','2019-04-02 00:00:00.000','2019-10-02 00:00:00.000','6','F' UNION ALL
    SELECT 'eee555ww325','2','2017-04-01 00:00:00.000','2019-04-01 00:00:00.000','24','F' UNION ALL
    SELECT 'eee555ww325','3','2015-03-28 00:00:00.000','2017-03-31 00:00:00.000','24','F' UNION ALL
    SELECT 'eee555ww325','5','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','F' UNION ALL
    SELECT 'eee555ww325','6','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','F' UNION ALL
    SELECT 'eee555ww325','7','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','F' UNION ALL
    SELECT 'eee555ww325','8','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','F' UNION ALL
    SELECT 'eee555ww325','9','2014-03-13 00:00:00.000','2015-03-13 00:00:00.000','12','F' UNION ALL
    SELECT 'eee555ww325','10','2012-10-01 00:00:00.000','2017-10-01 00:00:00.000','60','D'
    --===== Set the identity insert back to normal
    SET IDENTITY_INSERT #data_I_want ON

    Currently the condit part of the function is working. But for the Cardedratedate and expdate I am getting NULLs.

    The function with comments is:

    Use sand;

    Alter Function UDF_HL_clean;
    -- Input variable
    ( @H_account NVCHAR(255) )
    --output table
    Returns @terms table (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Date,
    expdate Date,
    term INT,
    Condit VARCHAR(1)
    )

    As
    Begin
    --temp table to load data into and then clear at each loop
    Declare @udftemptable TABLE (
    H_account NVCHAR(255),
    seq INT,
    Cardedratedate Date,
    expdate Date,
    term INT,
    Condit VARCHAR(1)
    )
    --sets up the first table
    insert into @terms select * from sand.dbo.HL_function_test where H_account = @H_account
    declare @lag INT = 0, @loopmax INT
    --Sets the loop max. the dummy table is one record but there are many HL with this issue. Sometimes the
    --issue is across 4 seq and sometimes only 2
    SET @loopmax = (Select count(*) FROM (select H_account, expdate, lead(expdate, 1) Over (partition by H_account, expdate Order by H_account, seq desc) as lag_exp From dbo.homeloans) a
    where lag_exp = expdate
    AND H_account = @H_account
    Group by H_account, expdate
    )
    --loops over each seq where there is an issue while the number is equal or less than the maximum amount of time there is an issue for each HL
    while @lag <= @loopmax
    Begin
    --Names the columns
    with laggedrates (H_account, seq, lag, lageffectivedate, lagexpdate, expdate, lagterm, condit) AS
    (
    select H_account,
    seq,
    @lag as lag,
    --gets the carded rate date from step before because that is the actual date the rate is from
    lag(cardedratedate, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    --Not sure this step is required since the expiry date is correct in the first data.
    lag(expdate, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    expdate,
    -- gets the standard term from step before. need to get the standard term.
    lag(term, @lag) over (partition by H_account, expdate order by H_account, seq desc),
    --this step is actually working. gets condition from in front. These are fixed loans so condit = 'F'
    --THis issue is causing the condit to be 'D'
    case when lead(expdate,1) over (partition by H_account, expdate order by H_account, seq desc) = expdate THEN 'F' Else Condition END
    from @terms

    Union all

    select a.H_account,
    a.seq,
    --adding to the lag
    b.lag + 1 as lag,
    --taking the next lag to look at the next step
    lag(a.cardedratedate, b.lag + 1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) as lageffectivedate,
    --taking the next lag to look at the next step
    lag(a.expdate, b.lag + 1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) as lagexpdate,
    a.expdate,
    --taking the next lag only while the expiry dates are still matching
    case when lag(a.expdate,b.lag+1) over (partition by a.H_account order by a.H_account, a.seq desc) = a.expdate
    THEN lag(a.term, b.lag +1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc)
    else a.term end
    as lagterm
    case when lead(a.expdate,1) over (partition by a.H_account, a.expdate order by a.H_account, a.seq desc) = a.expdate THEN 'F' Else a.Condit END AS Condit
    from @terms a
    --recurcisve CTE to join steps together
    inner join laggedrates b
    on a.expdate = b.lagexpdate
    --ending CTE when loopmax is reached
    where @lag <=loopmax and b.lagexpdate is NULL
    )

    insert into @udftemptable Select H_account, seq, lageffectivedate, expdate, lagterm, condit
    from laggedrates option (maxrecursion 0)

    delete from @terms
    Insert into @terms select * from @udftemptable
    deleted from udftemptable
    set @lag = @lag + 1
    end
    return
    end
    ;

    I hope that is everything you guys need. I cannot directly copy and past data or code for security reasons so I apologise if there is tpyos. Let me know if I have missed anything.

    Thanks again

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply