One of the most confusing things about learning SQL is that there are so many ways to do the same thing! Should you be using temporary tables? Or table variables? Or common table expressions? This article gives 6 different ways to do the same simple thing in SQL, and considers the pros and cons of each.
The example we'll use
You can download and run the simple script attached to this article to generate a simple database. What you'll get are two tables: one of actors, and one of movie directors:
What I want to show is how to write a simple script to show the names of all of the actors and directors who are female. The results of running the query in every case would be as follows:
I'm well aware, by the way, that the easiest way to do this would be to use a UNION query, but the point of this article is to show the various techniques available for working with sets of data in Transact-SQL.
The Six Methods
The 6 different techniques this article will show are:
- Temporary tables
- Table variables
- Inline table-valued functions
- Multi-statement table-valued functions
- Derived tables
- Common table expressions (CTEs)
What I'll try to do is to show the pros and cons of each technique, so that you can decide for yourself which one is for you.
Solution 1 - Temporary Tables
I like temporary tables because they're simple to understand: they work in virtually every respect exactly like normal tables. Here's a query which would solve the above problem:
-- first delete any left over temporary table BEGIN TRY DROP TABLE #females END TRY BEGIN CATCH -- if table didn't exist, nothing to do here END CATCH -- put the female directors into a table SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob INTO #females FROM tblDirector WHERE DirectorGender = 'Female' -- now add in the female actors INSERT INTO #females SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = 'Female' -- finally, show the results SELECT * FROM #females
This would create a temporary table (they always begin with a prefix of #) in the system TEMPDB folder, populate it and show the results. Here are the advantages:
|Simplicity||Because temporary tables work just like normal ones, you can sort, filter and join them just like normal tables.|
|Speed||SQL Server doesn't have as many logging and locking overheads for temporary tables, so they run more quickly.|
|Security||You can do what you like to a temporary table, without worrying about making changes to permanent tables.|
Against this are the disadvantages:
|Not that fast||While temporary tables are faster than permanent ones, they're still possibly not as fast as table variables (see below).|
|Limitations||You can't use INSERT, UPDATE or DELETE statements against temporary tables in user-defined functions.|
Solution 2 - Table Variables
An alternative way to solve the above problem is to create a special sort of variable: one to hold a table, rather than a scalar value. Here's some code which would do this:
-- declare the table variable DECLARE @people TABLE ( PersonId int IDENTITY(1,1) PRIMARY KEY, PersonName varchar(20), Job varchar(10), Dob datetime ) -- add in the female directors INSERT INTO @people SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob FROM tblDirector WHERE DirectorGender = 'Female' -- now add in the female actors INSERT INTO @people SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = 'Female' -- finally, show the results SELECT * FROM @people
This is nice and easy to understand: create a table in memory and then add records into it. Table variables have good points:
|Speed||Because table variables exist only in memory, there should in theory be no locking or logging overhead for SQL Server - they therefore run quickly. However, table variables still write information to and read from the TEMPDB database, so the performance gain is actually illusory.|
|Security||You can do what you like to the rows of a table variable, without having to worry about whether you're corrupting your company's data.|
... and bad points:
|Scope||A table variable is limited in scope (so, for example, you can't refer to a table variable in one stored procedure called from another which has declared it).|
|Finicky joining||If you want to join a table variable to another table, yiou must first give it an alias (it's easy to forget this if you're not in the habit of giving all of your tables aliases).|
|No SELECT INTO||You can't select records into a table variable - you must define it first and then use INSERT INTO.|
For me, temporary tables are easier to understand and to use, but I've no doubt a chorus of comments will tell me why I'm wrong!
Solution 3 - Inline Table-Valued Function
An inline table-valued function is just that: a function which instead of returning a single value returns a set of records. You could use the following code to create 2 table-valued functions and then call them:
-- create function to show a list of directors CREATE FUNCTION fnDirectors( -- the gender we're interested in @Gender varchar(20) ) RETURNS TABLE AS RETURN -- the only thing which can go in an inline TVF is a single -- SELECT statement SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob FROM tblDirector WHERE DirectorGender = @Gender GO -- create another function to get a list of actors CREATE FUNCTION fnActors( -- the gender we're interested in @Gender varchar(20) ) RETURNS TABLE AS RETURN -- again, the only thing which can go in an inline TVF is a single -- SELECT statement SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = @Gender GO -- now show results using a UNION query (we didn't really need the function) SELECT * FROM dbo.fnDirectors('Female') UNION ALL SELECT * FROM dbo.fnActors('Female')
Although this is a bit of a silly example (it's hard to see what we've gained by using of using table-valued functions in this case), they are a good alternative to views and stored procedures. We have standard TVFs (that's what you should call them to impress people) to show, for example:
- A list of all the courses we run between two given dates; and
- A list of all the delegates for a given course id number.
We'll consider the pros and cons of inline table-valued functions after we've looked at their cousins: multi-statement table-valued functions, or MSTVFs (if ever something needed a different name, this is it).
Solution 4 - Multi-Statement Table-Valued Functions
A multi-statement table-valued function - aside from being a bit of a mouthful - returns a set of records given some initial input parameters. The difference is that the function can perform a whole range of processing within it, and in this respect is like a cross between a table variable and a stored procedure.
Here's an example of an MSTVF to solve our problem:
CREATE FUNCTION fnPeople( -- pass in the gender: M or F @gender varchar(10) ) -- spit out a table containing the people RETURNS @people TABLE ( PersonId int IDENTITY(1,1) PRIMARY KEY, PersonName varchar(20), Job varchar(10), Dob datetime ) AS BEGIN -- add in the female directors INSERT INTO @people SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob FROM tblDirector WHERE DirectorGender = @gender -- now add in the female actors INSERT INTO @people SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = @gender RETURN END GO SELECT * FROM dbo.fnPeople('Female')
- Creates what is essentially a table variable in memory
- Populates this with the female (or male) directors and actors; then
- Spits this set of records back out
Table functions are great when you frequently want to return the same set of records, but with varying input parameters. For example:
- All of the accounts for a particular region of a country; or
- All of the products for a particular sales manager
They do, however, have a number of limitations that you should be aware of:
|No side effects||A function can't insert, update or delete rows in permanent tables.|
|No stored procedures||A function can't call a standard stored procedure (although it can call extended stored procedures and other functions).|
|Deterministic functions only||A function can't call certain non-deterministic functions (ie ones whose return value isn't predictable) such as the rand function to generate a random number.|
|No temporary tables||You can't use temporary tables within a user-defined function.|
|Limited error trapping||You can't use TRY / CATCH clausee in functions that you create.|
Although the limitations of functions aren't particularly onerous, it's sometimes easier to do everything in a stored procedure. You can see a detailed comparison of the speeds of ITVFs and MSTVFs here.
Solutions 5 and 6 - Derived Tables and CTEs
These two solutions are essentially the same thing: Common Table Expressions (CTEs) were introduced in SQL Server 2005, and make derived tables more readable.
Here's a solution to our problem using a common table expression (and a UNION query, rendering this solution a bit more long-winded than it needs to be!):
-- get list of female directors WITH FemaleDirectors AS ( SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob FROM tblDirector WHERE DirectorGender = 'Female' ) SELECT * FROM FemaleDirectors UNION ALL SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = 'Female'
Here's the same solution, but using a derived table:
SELECT * FROM ( SELECT DirectorName AS PersonName, 'Director' AS Job, DirectorDob AS Dob FROM tblDirector WHERE DirectorGender = 'Female' -- you have to give the derived table an alias ) AS FemaleDirectors UNION ALL SELECT ActorName AS PersonName, 'Actor' AS Job, ActorDob AS Dob FROM tblActor WHERE ActorGender = 'Female'
In both cases we've created a temporary set of records on-the-fly, given it an alias of FemaleDirectors, and then referred to this in a subsequent SELECT statement.
I would use CTEs in preference to derived tables any time, since they read far more like English. However, there are a few things to watch out for - more quirks than disadvantages:
|Semi-colons||If the CTE isn't the first statement in a batch, the previous statement must end with ; (ie a semi-colon).|
|Brackets||You must enclose both a CTE and a derived table in parentheses.|
|Short life span||A CTE must be followed immediately by a SELECT statement using it; CTEs don't have a long life!|
The main conclusion to this article is that there aren't any conclusions! What I've tried to do is to show in a single article the range of table-based tools available in SQL, so programmers can choose their own. I've used every one of these in anger apart from derived tables (which I always substitute with CTEs), so they all have their niches.