Scalar Function vs Calculated field

  • This is probably a silly question so don't put me in the worst questions posted thread (plz), however I have tried to google it with no luck. What is the performance difference between using Scalar Functions vs a Calculated Field and how are they processed?

    The reason I ask is, I am trying to figure out if my method of using a Calculated field for determining someone's age is the same, better or worse than using a scalar function. I haven't had a chance to test with different row counts as I have started my new job and they are burring me with backlogged work all the way to January. A generalization is fine as I know "depends" is a real thing.

    Thanks guys!

    ***SQL born on date Spring 2013:-)

  • I am assuming that you are not talking about a persisted calculated column. That calculation, iirc, is done once and then stored. It isn't updated unless the column it is based on is updated. If I am wrong here, please, some one correct me.

    An ordinary calculated column is calculated each time the row is accessed including the column itself. If the column is not accessed I don't believe the calculation is done, just seems to make logical sense.

    This would be done on a row by row basis. Is this faster than a scalar function on the same column, don't know as I haven't tested it. Sounds like a something you might want to try and report back on, either in this thread or perhaps by writing an article about it.

    If you are going to use a function in a query, I would steer away from the scalar function and look at using an inline table valued function written to return a one row table and use the cross apply in the from clause.

  • Lynn Pettis (7/11/2014)


    I am assuming that you are not talking about a persisted calculated column. That calculation, iirc, is done once and then stored. It isn't updated unless the column it is based on is updated. If I am wrong here, please, some one correct me.

    An ordinary calculated column is calculated each time the row is accessed including the column itself. If the column is not accessed I don't believe the calculation is done, just seems to make logical sense.

    This would be done on a row by row basis. Is this faster than a scalar function on the same column, don't know as I haven't tested it. Sounds like a something you might want to try and report back on, either in this thread or perhaps by writing an article about it.

    If you are going to use a function in a query, I would steer away from the scalar function and look at using an inline table valued function written to return a one row table and use the cross apply in the from clause.

    Thanks Lynn, and you are correct in your assumption I am not referring to a persisted computed column. I don't know very much about functions yet. So I guess maybe I should start researching it much more instead of bugging on here first. I'll look up the Table Valued Functions and see how to create and use those. As far as the article goes I don't have the smarts for something like that yet.

    This was the example that sparked my curiosity on which one would be better used.

    AGE = convert(int,DATEDIFF(d, 'DateofBirth', getdate())/365.25)

    VS. this Scalar Function.

    BEGIN

    IF @BirthDate > @CurrentDate

    RETURN -1

    DECLARE @Age INT

    SELECT @Age = DATEDIFF(YY, @BirthDate, @CurrentDate) -

    CASE WHEN(

    (MONTH(@BirthDate)*100 + DAY(@BirthDate)) >

    (MONTH(@CurrentDate)*100 + DAY(@CurrentDate))

    ) THEN 1 ELSE 0 END

    RETURN @Age

    END

    GO

    ***SQL born on date Spring 2013:-)

  • Thought you'd like to see some code that will help:

    declare @BirthDate date;

    set @BirthDate = '1959-07-29';

    select datediff(year,@BirthDate,getdate()) - case when dateadd(year,datediff(year,@BirthDate,getdate()),@BirthDate) > getdate()

    then 1

    else 0

    end;

    set @BirthDate = '1959-07-06';

    select datediff(year,@BirthDate,getdate()) - case when dateadd(year,datediff(year,@BirthDate,getdate()),@BirthDate) > getdate()

    then 1

    else 0

    end;

    go

    -- The following is itvf function and the code that follows shows how it is used.

    create function dbo.BirthAge (@BirthDate date)

    returns table

    with schemabinding

    as return(

    select datediff(year,@BirthDate,getdate()) - case when @BirthDate > dateadd(year,-datediff(year,@BirthDate,getdate()),getdate())

    then 1

    else 0

    end Age);

    go

    create table #TestFunc(

    Birthdate date

    );

    insert into #TestFunc

    values ('1960-09-18'),('1966-05-23'),('1990-11-07'),('1994-05-13'),('1998-07-12');

    select

    tf.Birthdate,

    ba.Age

    from

    #TestFunc tf

    cross apply (select Age from dbo.BirthAge(tf.Birthdate))ba

    drop table #TestFunc;

    go

  • Lynn, if you haven't been told before "Your Amazin" 😀

    ***SQL born on date Spring 2013:-)

  • Thank you.

    I did make a slight change to itvf function in my post above.

  • I made that into a permanent table and populated dates of birth all the way back to 01/01/1900 and then ran the Itvf against the computed column. The execution plans where identical below is th stats. This was against approx. 40,000 rows. I'm not sure exactly what it means though as I'm still learning this stuff.

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    SELECT *

    ,AGE = convert(int,DATEDIFF(d, t.Birthdate, getdate())/365.25)

    FROM dbo.testfunc1 t

    ------------------Above is Calculated Field----------

    ------------Below IVT Function-----------------------

    SELECT

    tf.Birthdate,

    ba.Age

    FROM

    TestFunc1 tf

    cross apply (SELECT Age FROM dbo.BirthAge(tf.Birthdate))ba

    I got this SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server parse and compile time:

    CPU time = 4 ms, elapsed time = 4 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    (40025 row(s) affected)

    Table 'TestFunc1'. Scan count 1, logical reads 68, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 15 ms, elapsed time = 284 ms.

    (40025 row(s) affected)

    Table 'TestFunc1'. Scan count 1, logical reads 68, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

    SQL Server Execution Times:

    CPU time = 31 ms, elapsed time = 282 ms.

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 0 ms.

    SQL Server Execution Times:

    CPU time = 0 ms, elapsed time = 0 ms.

    ***SQL born on date Spring 2013:-)

  • A computed age column could not be persisted in any event because it uses a non-deterministic value (todays_date).

    Whether you use the MONTH+DAY version or the -YEAR version could be based on the results you want for Feb 29th birthdays. The former will treat Mar 1 as the birthday in non-leap years, the latter will treat Feb28th as the birthday. (In some jurisdictions, for certain anniversary calculations, there are actually laws/ordinances that specify which date must be used.)

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply