The Mighty, Mighty Computed Column
I’ve been working with SQL Server for several years now, and I am always searching for new ways to do things. A few days ago, I was experimenting with computed columns in SQL Server 2000, and came across something really, well…neat! It is actually a very simple concept, but one I had never thought of before, and one that I think could be very powerful if used carefully.
We’ve probably all used a computed column before at some point, whether just experimenting or in production use and I think we can all agree that they can be pretty useful, but also somewhat limited. Most likely, most of us have used a computed column similar to the following context:
CREATE TABLE TestComputedColumn
TestColID TINYINT IDENTITY(1,1) PRIMARY KEY,
ComputedColumn AS (Value1+Value2)*(Multiplier)
Nothing new here, just create a simple table and assign a simple statement to the computed column. Now, lets add some values.
INSERT INTO TestComputedColumn(Value1, Value2, Multiplier) VALUES(2,2,2)
INSERT INTO TestComputedColumn(Value1, Value2, Multiplier) VALUES(3,3,2)
INSERT INTO TestComputedColumn(Value1, Value2, Multiplier) VALUES(4,5,3)
INSERT INTO TestComputedColumn(Value1, Value2, Multiplier) VALUES(2,7,3)
INSERT INTO TestComputedColumn(Value1, Value2, Multiplier) VALUES(2,2,7)
If we run a select statement on the TestComputedColumn table, it looks something like this:
Looks like our computed columns is working as intended. Now, we are faced with somewhat of an issue. The issue is that if we want to use some complex logic, not only do we have to hard-code the field names into our expressions in the table, but we also have to duplicate the code in different tables if we want to use the same logic on different tables. This can become very cumbersome. Also, in our expression we cannot reference any other tables. It would be very, very nice if we could encapsulate our code into some sort of module, and reference other tables if needed. The solution to our problem is the use of a user-defined function.
In SQL 2000, we can use a user-defined function as the value for a computed column (and as default values for columns in a table if we want to). This is a very powerful tool. This allows us not only to encapsulate our code, but also to reference other tables.
In our above example, we used the expression:
ComputedColumn AS (Value1+Value2)*(Multiplier)
to define our computed column. But, we can create a user defined function to do the same thing and tie that function to the table.
CREATE FUNCTION dbo.udf_FunctionForComputedColumn ( @ValueField1 MONEY, @ValueField2 MONEY, @MultiplierField INT ) RETURNS MONEY AS BEGIN RETURN((@ValueField1+@ValueField2)*(@MultiplierField)) END
Now, we can run the following statements to alter the table to tie the new function to the computed column.
ALTER TABLE TestComputedColumn DROP COLUMN ComputedColumn ALTER TABLE TestComputedColumn ADD ComputedColumn AS dbo.udf_FunctionForComputedColumn(Value1, Value2, Multiplier)
Sure enough, if we do a select on our table, we get the same result as before.
So, we’ve solved our problem for code reuse. We can now tie this same function to any table we need to. It must be noted that if we want to change our function after we have tied it to a computed column (or a column default), we cannot alter or delete our function because the field is now dependent on it. We can view this in the database by using the simple query:
SELECT OBJECT_NAME(depid) AS Name FROM sysdepends WHERE id = OBJECT_ID('TestComputedColumn') AND id depid AND depid >0
Which returns the resultset:
Lets take the easy route, and just delete our computed column again.
ALTER TABLE TestComputedColumn DROP COLUMN ComputedColumn
Now, what if we want to change the way in which we use our Multiplier in our computed column expression. What if we want to pull this value from a constants table in our database, one that can be changed at any point in time, and that will allow our computed columns to reflect this value. We can tie a user-defined function to a table that references another table, and anytime we do a select on this table, it can reference another table (or N number of tables if needed). But, as Spiderman says, “with great power comes great responsibility”. The thing is that for every row returned in the result set for our select statement, that user-defined function will get called. Depending our function, this could seriously degrade performance for functions that require a lot of I/O. We can see that this function gets returned for every row in our resultset if we run SQL Profiler. The output is below:
We know that have 5 records in our table, and we see that Profiler returns 5 records referencing the user-defined function after our SELECT * FROM TestComputedColumn. The main idea here is to be very careful with these computed columns, especially if you plan to add complex logic in them or have them query other tables, which may take precious I/O in a production environment.
So, back the example. Lets create a SystemConstants table that we can use to store our Multiplier value.
CREATE TABLE SystemConstants ( SystemConstantID TINYINT IDENTITY(1,1), LookupValue VARCHAR(15), ConstantValue VARCHAR(10), ValueDescription VARCHAR(50) ) GO INSERT INTO SystemConstants(ConstantValue, LookupValue, ValueDescription) VALUES('9', 'Multiplier', 'Muliplier value we plan to use.') GO
Now, we have a value in our SystemConstants table that we can use in our user-defined function to return our computed column value. We can alter the function we defined before to look something like this:
ALTER FUNCTION udf_FunctionForComputedColumn ( @ValueField1 MONEY, @ValueField2 MONEY ) RETURNS MONEY AS BEGIN DECLARE @Multiplier INT SELECT @Multiplier = CAST(ConstantValue AS INT) FROM SystemConstants WHERE LookupValue = 'Multiplier' RETURN((@ValueField1+@ValueField2)*(@Multiplier)) END
Now, if we select the rows from the TestComputedColumn table, the records returned will look like this:
We can see that the Value1 and Value2 fields are added together and multiplied by the value in our SystemConstants table, which happens to be the value 9.
It is also definitely worth noting that because the use of our user-defined function is labeled as “non-deterministic” because it references another table, we cannot create an index on our computed column. This is just another trade-off in the use of the UDF.
I hope I have outlined something in SQL Server 2000 (haven’t tested it yet in 2005) that you can use to be more productive and efficient. There are limitless ways to use computed columns with the use of user-defined functions, and they are especially handy if you look to avoid computing the same data continuously.
However, the rule is to be careful when you do use them because misuse can lead to performance degradation. If you use a User-Defined Function for your computed column then every record returned in any select query on the table which includes that computed column, the UDF will be invoked. Just something to think about if you decide to try it out, because calling a UDF which queries other tables for each row of the parent table in a production environment could potentially cripple the system.
I'd love to hear comments about this article. I can be contacted directly at email@example.com.