Computed Columns

  • Comments posted to this topic are about the item Computed Columns

  • I like your articles because are very didactic and thinking in a lot of ways how to do it.

    This one with so many "steps" (approximations), has been useful for me to learn that exists the 'PERSISTED' keyword. 🙂

  • It blows me out...I have held off computed columns for the longest time as I thought they would be a performance hit.

    Theoretical questions:

    1. Would it be faster when using numbers than characters?

    2. Will it accept db standard functions e.g. cast/convert, ltrim, rtrim?

    2.1 corollary, will it accept user-defined functions?

  • We used computed columns for an interesting purpose back in SQL 2000. Then it stopped working in 2005. We posted about it on a couple of forums (including MS Technet) - never got a resolution.

    I've cobbled this post together from those places.

    Here is a script of the CREATE

    CREATE TABLE [SchoolPeriod_T] (

    [Yr] [int] NOT NULL ,

    [Bldg] [tinyint] NOT NULL ,

    [PerDay] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [PerSlot] [tinyint] NOT NULL ,

    [PerStart] [varchar] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [PerEnd] [varchar] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [Period] [varchar] (3) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,

    [PerType] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,

    [DayOrder] AS (case (left([PerDay],1)) when 'M' then '1' when 'T' then '2' when 'W' then '3' when 'R' then '4' when 'F' then '5' else '0' end) ,

    CONSTRAINT [PKSchoolPeriod] PRIMARY KEY CLUSTERED

    (

    [Yr],

    [Bldg],

    [DayOrder],

    [PerDay],

    [PerSlot]

    ) ON [PRIMARY] ,

    CONSTRAINT [FKSchoolPeriodSchoolYr] FOREIGN KEY

    (

    [Bldg],

    [Yr]

    ) REFERENCES [SchoolYr_T] (

    [Bldg],

    [Yr]

    )

    ) ON [PRIMARY]

    GO

    The DAYORDER was made part of the CLUSTERED PK.

    This worked until SQL 2005

    Now in a SPROC we had this did not work anymore

    .

    .

    .

    Set ArithAbort On

    Set CONCAT_NULL_YIELDS_NULL On

    Set QUOTED_IDENTIFIER On

    Set ANSI_NULLS On

    Set ANSI_PADDING On

    Set ANSI_WARNINGS On

    Set NUMERIC_ROUNDABORT Off

    .

    .

    .

    If @Copy_School_Period$N_Y='Y'

    Begin

    insert into schoolperiod_t (Yr, Bldg, PerDay, PerSlot, PerStart, PerEnd, Period, PerType)

    SELECT @Current_Yr+1, Bldg, PerDay, PerSlot, PerStart, PerEnd, Period, PerType

    FROM Stufiles.dbo.SchoolPeriod_T Where Yr=@Current_Yr and Bldg=@Bldg$$

    Getting this error:

    INSERT failed because the following SET options have incorrect settings: 'ARITHABORT'.

    err # -2147217900

    But this same thing works in a query window

    Begin Tran

    insert into schoolperiod_t (Yr, Bldg, PerDay, PerSlot, PerStart, PerEnd, Period, PerType)

    SELECT 2008, Bldg, PerDay, PerSlot, PerStart, PerEnd, Period, PerType

    FROM Stufiles.dbo.SchoolPeriod_T Where Yr=2007 and Bldg=10

    RollbackAnd this link talks about the problem...

    http://msdn2.microsoft.com/en-gb/library/aa933126(SQL.80).aspx

    - but apparently I'm making all those settings - any ideas??

  • One thing that limits the use of persited calculated columns is the many requirements it has, which can be a headache for legacy systems.

    The main issue for legacy code is that all procedures etc that insert / update / delete from the table / index must have certain set options set to on or off or the insert / update / delete fails. Any select against the column will also not use the index if the set options are incorrect.

    There are a few other requirements such as functions used in the calculation having the same owner etc. which you can find in BOL.

    They are a very good option, but you may not be able to easily take full advantage of them.

  • I believe I had to actually create the procedures with the appropriate options before the problem was resolved, and not just set them within the procedure.

    SET NUMERIC_ROUNDABORT OFF

    SET ANSI_NULLS ON

    SET ANSI_PADDING ON

    SET ANSI_WARNINGS ON

    SET ARITHABORT ON

    SET CONCAT_NULL_YIELDS_NULL ON

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE PROCEDURE

    .....

    GO

  • Better didactic than diabolical! I appreciate the compliment!

    As far as numbers vs chars, it falls into 'it depends'. In general the smaller the column the more effective the index. It's possible to carry that philosphy to extremes though. As far as the restrictions you're right, I'm spoiled in that I've been doing mostly new work and not had to struggle with that much.

  • On more complex calc's or those calling functions, computed columns provide a significant hit on performance..I used to use them heavily...I would suggest using them sparingly.

    Also, you cannot use alter to modify the calc which is another pain. Modifications to the computed col require a drop and recreate.

    My blog: jetlounge.net

  • David Leibowitz (2/12/2008)


    On more complex calc's or those calling functions, computed columns provide a significant hit on performance..I used to use them heavily...I would suggest using them sparingly.

    Also, you cannot use alter to modify the calc which is another pain. Modifications to the computed col require a drop and recreate.

    do you use IsPersisted in those cases? Because I've seen very LITTLE performance hits once you mark them as persisted. Of course - if the calculation drags down your insert and update performance, that wouldn't be good either.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • not all calc's can be persisted...

    especially those that call functions.

    in reality, i see no difference between a persisted computed column and an update trigger that sets a value on a column to a static value. the real value of computed columns is the runtime evaluation of an algorithm..unfortunately, there is a big performance hit...as expected i suppose. this isn't magic.

    My blog: jetlounge.net

  • I believe it's all about where you like to put your business logic.

    We put that computed column in that table 6 years ago so that we could create a Clustered PK and get what we thought would be better performance loading that table with a SELECT.

    When we realized the pain we encountered with the SET option rules we realized that we only had one STORED PROCEDURE that UPDATED/INSERTED into that table - so making it a real field with the business logic in the SPROC was more our style anyway.

    So we dropped the COMPUTED column.

    I personally don't like business logic in the DDL - we don't do defaults - very few triggers. But that's because we are a 100% SPROC shop so we like business logic in SPROCS.

    My 2 cents...

  • I appreciate this article. I'm new to the 'official' world of DBAs...in that it is now actually documented in someone's hr file somewhere... but i've been administering sql for several years now. Thanks to the wonderful world of office politics I have metamorphisized (is that a word?) from helpdesk support....to network admin....to developer.... and now....drumroll please.... DBA. I have often had the pleasure of taking the brunt for all the holes in our infrastructure. I try my best to tune and optimize our environment but still.... because the customer is always right...right?..... I run into failed connections....poor response times....down servers.... failed backups...etc....because I am trying to give the customer what they want.

    My most recent projects have been report writing....and unfortunately some application development. I've relied heavily on including computed values into my queries....which I have found are very costly!

    I have a few reports to deliver this week and I plan on implementing this....

    thanks,

    ~emc!

  • In my usual, "Gotta test everything", obsession, I tried some stuff with computed columns.

    I created three tables, 1 with just the date, 1 with the date and some computed columns, 1 with the date and some persisted, computed columns. (All code and results attached.)

    If you look at the results, I was inserting 31,622,400 into each table. Just the select alone, without any inserts, took 26,421 ms of CPU time.

    First table: 303,032 ms of CPU time.

    Second table: 289,718

    Third table: 385,547

    I unloaded the proc cache between each insert, to eliminate that variable. The second insert took slightly less time (13 seconds out of 303, approx 3% faster), but that's within the usual margin of error. The last insert, with the persisted columns, took 82 extra seconds, about 26% slower.

    First table: 2.445 Meg

    Second table: 2.445 Meg

    Third table: 5.344 Meg

    Again, as expected.

    I then tested some selects on the second and third tables without indexes.

    Five tests on each, each test involved 10 selects of "select 1 where dateday = X, datemonth = Y, dateyear = 2008", with X and Y being varied per-select, but with the same values applied to each table.

    Table 2:

    4640 ms

    4560

    4700

    4593

    4546

    Table 3:

    4560

    4546

    4580

    4516

    4516

    Unindexed, the persisted columns gave about a 10% speed increase on selects.

    Then I indexed the day, month and year columns (1 covering idex) on each table.

    1000 selects in each test (10 was too small to get measureable results):

    Table2:

    2203 ms

    30

    30

    30

    30

    Table3:

    2156

    30

    30

    30

    33

    Which means once the columns are handled by a covering index, the persisted columns have no significant benefit.

    I also tried an update (a large one) on each table. 1,036,800 rows.

    Table 2: 319,297 ms CPU time

    Table 3: 271,265

    I'm not sure why the table with the persisted columns took less time to update. Possibly because of the way the indexes work with the persisted columns, but that's just guessing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Realized I'd left a permutation out of my tests. I ran a set of ten selects against the table with no calculated columns, based on "where datepart(day, date) = @day...", with day, month and year. Took 144,593 ms. (Approximately 25 times as long as with the calculated columns, even without them persisted or indexed; approximately 4,800 times as long as with non-persisted, indexed calculated columns.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • My conclusion is that having calculated, indexed columns (not persisted) can be quite useful in the right circumstances. Dates are especially subject to this, because selecting against them is quite often a painful process that involves index scans on large tables.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply