SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Using Computed Columns


Using Computed Columns

Author
Message
Luis Martin Caballero
Luis Martin Caballero
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 156
Comments posted to this topic are about the item Using Computed Columns
Mike D
Mike D
Say Hey Kid
Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)Say Hey Kid (675 reputation)

Group: General Forum Members
Points: 675 Visits: 32
Why not use a view for computed columns? Is this not clearer?
Alan Burstein
Alan Burstein
SSC Guru
SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)

Group: General Forum Members
Points: 57589 Visits: 9174
Good article and great topic. I'm a big fan of computed columns so it's nice to see a detailed article on the topic. I would warn, however, against using a T-SQL scalar UDF for a computed column if performance is important. Once a scalar UDF's (pre-SQL 2019) becomes part of the table queries that involve that table will become slower. For starters, queries involving that table can no longer get a parallel execution plan even when that column is not referenced! Even index rebuilds must be performed with a serial plan. Based on my testing, non-inline Scalar UDFs bring these kind of problems when used for computed columns or as constraints (Check & Default). Note these two articles by Eric Darling:

Another Hidden Parallelism Killer: Scalar UDFs In Check Constraints
Another reason why scalar functions in computed columns is a bad idea

Again - I enjoyed the article and am looking forward to the next one. I'm not being critical of it in any way, I just wanted to make people aware of the performance implications of using scalar UDFs for this. A correctly developed CLR, on the other hand, does not have this problem. Fortunately, too, SQL 2019 adds inline scalar UDFs which solves this problem on the T-SQL side.

-- Alan Burstein


Helpful links:

Best practices for getting help on SQLServerCentral -- Jeff Moden
How to Post Performance Problems -- Gail Shaw

Nasty fast set-based string manipulation functions:
For splitting strings try DelimitedSplit8K or DelimitedSplit8K_LEAD (SQL Server 2012+)
To split strings based on patterns try PatternSplitCM
Need to clean or transform a string? try NGrams, PatExclude8K, PatReplace8K, DigitsOnlyEE, or Translate8K

I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code. -- Itzik Ben-Gan 2001

Luis Martin Caballero
Luis Martin Caballero
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 156
Hi Mike,
There is no universal solution for every situation. Computed columns are another posibility like views.
When you use computed columns you can persist the information so you will have it stored whith the rest of the table. This way you get better performance when accesing the column and also you reduce the number of objects to administer. Computed columns can be used in some constraints what can also be an advantage.
As usual, neither of the two options can always be used, so it will be necessary to see each particular case to find which of the two options is the one that best fits.
Luis Martin Caballero
Luis Martin Caballero
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 156
Hi Alan,
Thank you very much for the clarification. The idea was mainly to show the possibilities of the use of calculated columns, but you are right that it's necessary to be careful with the performance according to the solution that we are going to adopt.
thisisfutile
thisisfutile
Hall of Fame
Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)

Group: General Forum Members
Points: 3038 Visits: 1146
Mike D - Tuesday, February 12, 2019 7:55 AM
Why not use a view for computed columns? Is this not clearer?

Our software vendor uses a lot of integer columns for status levels. For example, in the Customer table, the CustomerStatus can be 0,1,2 or 3. They then have a CustomerStatusComputed column for the convenience of anyone that's writing queries or reports. This column shows 'Open','Hold','Closed' or 'Prospect' respectively. This column acts as a sort of anchor or index location for human eyes. Yes, a view can be used and many views already exist with this exact information but, to me, it's much more convenient to look at the table for this information.

drew.allen
drew.allen
SSC Guru
SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)

Group: General Forum Members
Points: 71144 Visits: 18742
Mike D - Tuesday, February 12, 2019 7:55 AM
Why not use a view for computed columns? Is this not clearer?

I think that they're both equally clear. Views are good for complex computations, e.g., ones involving sub-queries, but I think they're overkill if you are just doing simple computed columns.

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
thisisfutile
thisisfutile
Hall of Fame
Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)

Group: General Forum Members
Points: 3038 Visits: 1146
We have a database that is filled with tables utilizing an integer column for things like 'Status'. For example, Customer table has a Status column and 0-3 are the integers used in this column. It also has a StatusComputed column that's computed as (case [Status] when 0 then 'Open' when 1 then 'Hold' when 2 then 'Closed' when 3 then 'Prospect' else 'Unknown' end). Without doing any testing I'd have to guess that the query optimizer would perform better with a query that has WHERE Status = 0 rather than WHERE StatusComputed = 'Open'. Does anyone have any opinions?
drew.allen
drew.allen
SSC Guru
SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)

Group: General Forum Members
Points: 71144 Visits: 18742
thisisfutile - Tuesday, February 12, 2019 11:57 AM
We have a database that is filled with tables utilizing an integer column for things like 'Status'. For example, Customer table has a Status column and 0-3 are the integers used in this column. It also has a StatusComputed column that's computed as (case [Status] when 0 then 'Open' when 1 then 'Hold' when 2 then 'Closed' when 3 then 'Prospect' else 'Unknown' end). Without doing any testing I'd have to guess that the query optimizer would perform better with a query that has WHERE Status = 0 rather than WHERE StatusComputed = 'Open'. Does anyone have any opinions?

I prefer using lookup tables for this. It's much easier to add new rows to a lookup table than it is to modify an existing table if you need to add another status, especially if all code changes need to go through a change control process. (Adding a row isn't usually considered a code change, so it doesn't need to go through the change control process.)

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
thisisfutile
thisisfutile
Hall of Fame
Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)

Group: General Forum Members
Points: 3038 Visits: 1146
drew.allen - Tuesday, February 12, 2019 12:26 PM
thisisfutile - Tuesday, February 12, 2019 11:57 AM
We have a database that is filled with tables utilizing an integer column for things like 'Status'. For example, Customer t has a Status column and 0-3 are the integers used in this column. It also has a StatusComputed column that's computed as (case [Status] when 0 then 'Open' when 1 then 'Hold' when 2 then 'Closed' when 3 then 'Prospect' else 'Unknown' end). Without doing any testing I'd have to guess that the query optimizer would perform better with a query that has WHERE Status = 0 rather than WHERE StatusComputed = 'Open'. Does anyone have any opinions?

I prefer using lookup tables for this. It's much easier to add new rows to a lookup table than it is to modify an existing table if you need to add another status, especially if all code changes need to go through a change control process. (Adding a row isn't usually considered a code change, so it doesn't need to go through the change control process.)

Drew

I'm actually not aware of what a lookup table is. Are you referring to a normalized table? Like status is the key field in the Customer table and they reference a lookup table called "CustomerStatus"?

For reference, the DB I'm referring to is maintained by our software developer. It's their design and the ERP software references the Status column (0-3) and displays Open, Hold, Closed, or Prospect based on its C# code. They decided to put the computed columns in for us customers simply for reference. No more need to write a case statement in the select clause, instead just select the StatusComputed column.

I'm interested to know more about lookup tables though. I searched it and my first hit was a Joe Celko article. Since he's not for the faint of heart (and since I fainted) I figured I'd let you respond to point me in the right direction. BigGrin

drew.allen
drew.allen
SSC Guru
SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)

Group: General Forum Members
Points: 71144 Visits: 18742
thisisfutile - Tuesday, February 12, 2019 12:54 PM
drew.allen - Tuesday, February 12, 2019 12:26 PM
thisisfutile - Tuesday, February 12, 2019 11:57 AM
We have a database that is filled with tables utilizing an integer column for things like 'Status'. For example, Customer t has a Status column and 0-3 are the integers used in this column. It also has a StatusComputed column that's computed as (case [Status] when 0 then 'Open' when 1 then 'Hold' when 2 then 'Closed' when 3 then 'Prospect' else 'Unknown' end). Without doing any testing I'd have to guess that the query optimizer would perform better with a query that has WHERE Status = 0 rather than WHERE StatusComputed = 'Open'. Does anyone have any opinions?

I prefer using lookup tables for this. It's much easier to add new rows to a lookup table than it is to modify an existing table if you need to add another status, especially if all code changes need to go through a change control process. (Adding a row isn't usually considered a code change, so it doesn't need to go through the change control process.)

Drew

I'm actually not aware of what a lookup table is. Are you referring to a normalized table? Like status is the key field in the Customer table and they reference a lookup table called "CustomerStatus"?

For reference, the DB I'm referring to is maintained by our software developer. It's their design and the ERP software references the Status column (0-3) and displays Open, Hold, Closed, or Prospect based on its C# code. They decided to put the computed columns in for us customers simply for reference. No more need to write a case statement in the select clause, instead just select the StatusComputed column.

I'm interested to know more about lookup tables though. I searched it and my first hit was a Joe Celko article. Since he's not for the faint of heart (and since I fainted) I figured I'd let you respond to point me in the right direction. BigGrin

It's just what it sounds like. You use one value to look up another. Normalization is a separate, but related concept. Lookup tables are usually normalized, but a normalized table isn't necessarily considered a lookup table.

Generally, you have a single code that you look up in a table to get the description. Joe mentions the "One True Lookup Table" in which case, you will need two (or more) codes which will give you a unique combination, but otherwise the idea is the same--you look up the unique combination to get a description.

Also, you have your terminology a little mixed up. The key field of a table MUST be unique. So, status cannot be the "key field in the Customer table," because it is not unique in the customer table.

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
thisisfutile
thisisfutile
Hall of Fame
Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)Hall of Fame (3K reputation)

Group: General Forum Members
Points: 3038 Visits: 1146
drew.allen - Tuesday, February 12, 2019 2:05 PM

Also, you have your terminology a little mixed up. The key field of a table MUST be unique. So, status cannot be the "key field in the Customer table," because it is not unique in the customer table.

Drew

I'm sorry, I thought 'foreign' was understood, just as 'primary' is in your statement.

drew.allen
drew.allen
SSC Guru
SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)SSC Guru (71K reputation)

Group: General Forum Members
Points: 71144 Visits: 18742
thisisfutile - Wednesday, February 13, 2019 1:34 PM
drew.allen - Tuesday, February 12, 2019 2:05 PM

Also, you have your terminology a little mixed up. The key field of a table MUST be unique. So, status cannot be the "key field in the Customer table," because it is not unique in the customer table.

Drew

I'm sorry, I thought 'foreign' was understood, just as 'primary' is in your statement.

The word primary is optional, because it's the default kind of key. The word foreign is required, because it's not the default kind of key. At least, that is my understanding of how the language works, and I haven't seen anything to indicate otherwise (although that could just be observational bias).

Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum









































































































































































SQLServerCentral


Search