Stairway to Advanced T-SQL

Stairway to Advanced T-SQL Level 6: Creating Rows Of Data Using The UNPIVOT Operator

,

Explanation of UNPIVOT Operation

From a simplistic perspective the UNPIVOT operator takes a set of rows, and for every row returns multiple rows. To better understand this, review Figure 1.

UNPIVOT Operation

Figure 1: UNPIVOT operation

From reviewing Figure 1 you can see the UNPIVOT operator took 2 rows that had multiple price values and turned that in to 6 rows, where each product price is on a different row.

The UNPIVOT command specifies two different types of columns. The first type are those columns that are not being unpivoted. In my example above the ID, and ProductName columns are this type of column. The second type of column are those which are being unpivoted. In my example that would be the ProductCode, Wholesale and Retail columns. From my example above you can see that the column values for the columns not being unpivoted are repeated for every unique set of column values for those columns being unpivoted.

Syntax of UNPIVOT operator

The UNPIVOT operator transforms data in a record set that has multiple data values on a single row to a series of rows where there is a single data value output per row. An individual UNPIVOT operation is only able to unpivot a single set of columns.

In order to unpivot data you need to identify the column or columns that the UNPIVOT command will rotate the data on,which you can think of as the key columns, as well as the column names for the data being unpivoted. Below is the syntax for the UNPIVOT command:

SELECT [columns not unpivoted],
 [unpivot_column],
       [value_column],
FROM
(<source query>)
AS <alias for the source data>
UNPIVOT ( [value_column] FOR [unpivot_column] IN ( <column_list> ) ) 
   AS <alias for unpivot>

Where:

  • [columns not unpivoted]: Identifies the list of column names that are not being unpivoted. Also known as key columns.
  • [unpivot_column]: Identifies a column name that represent the column names that are being unpivoted.
  • [value_column]: Identifies a column name that represents the data values that are being unpivoted.
  • <source query>: Identifies the query that provides the data for the UNPIVOT operator.
  • <alias for the source data>: Identifies a table alias for the source data to the UNPIVOT operator.
  • <column_list>: represent the column names for those columns that are being unpivoted.
  • <alias for unpivot>: Identifies an alias name for the record set produced by the unpivot operation.

To better understand this syntax let me show you are few UNPIVOT examples.

Example of Simple UNPIVOT Operation

To demonstrate a simple UNPIVOT command let's assume we have a table named PhoneNumbers. Where each row contains multiple phone numbers for a given a PersonID. I've created and populated my PhoneNumbers table using the code in Listing 1.

USE tempdb;
GO
IF object_id('PhoneNumbers') IS NOT NULL DROP TABLE PhoneNumbers;
GO
CREATE TABLE PhoneNumbers (
PersonID int, 
HomePhone varchar(12),
CellPhone varchar(12), 
Workphone varchar(12), 
FaxNumber varchar(12));
INSERT INTO PhoneNumbers VALUES 
(1,Null,'444-555-2931',Null,Null),
(2,'444-555-1950','444-555-2931',Null, Null),
(3,'444-555-1950', Null,'444-555-1324','444-555-2310'),
(4,'444-555-1950','444-555-2931','444-555-1324',
        '444-555-1987');

Listing 1: Create and populated PhoneNumbers table.

From this PhoneNumbers table I want to produce a normalized result set that contains the PersonID along with the type of phone and the phone number. In this result set most PersonID's will have multiple rows, one for each of their phone numbers. I can meet these requirements by using the code in Listing 2.

SELECT PersonID, PhoneType, PhoneNumber
FROM (
SELECT PersonID, HomePhone, CellPhone, Workphone, FaxNumber
FROM PhoneNumbers ) AS Src
UNPIVOT (
PhoneNumber FOR PhoneType IN 
(HomePhone, CellPhone, WorkPhone, FaxNumber)) AS UNPVT;

Listing 2: Example of using UNPIVOT operator

When I run the code in Listing 2 I get the output in Result 1.

PersonID    PhoneType    PhoneNumber
----------- ------------ ------------
1           CellPhone    444-555-2931
2           HomePhone    444-555-1950
2           CellPhone    444-555-2931
3           HomePhone    444-555-1950
3           Workphone    444-555-1324
3           FaxNumber    444-555-2310
4           HomePhone    444-555-1950
4           CellPhone    444-555-2931
4           Workphone    444-555-1324
4           FaxNumber    444-555-1987

Result 1: Results when running Listing 2

By reviewing the output of the UNPIVOT operation in Result 1 you can see that each row contains only a single phone number. This output represents all the phone numbers contained in my denormalized PhoneNumbers table. Also note that all PersonID values greater than 1 have multiple rows, whereas PersonID = 1 only has a single row of output.

The UNPIVOT command allowed me an easy method for creating a normalized list of PhoneNumbers from my original data. To understand the UNPIVOT command a little better let me review the code a little closer so we can understand how to construct the UNPIVOT query.

Let's review the different columns returned to help you understand the different parts of the UNPIVOT command. The first column in Result 1 is the “PersonID” column. This is known as the “Key” column. The Key columns are the columns found in the source query that are not referenced in the UNPIVOT clause. In my example there is only one Key column, which is “PersonID” in Listing 2. Note that my example only had a single key column, but like in Figure 1 there may be more than one key column. The second column returned is the “PhoneType”. PhoneType is the new column name for the unpivoted column. The unpivoted column contains the column heading names for the columns that are unpivoted. The unpivoted columns are found in the IN clause of the UNPIVOT operator. In Listing 2 that would be Homephone, CellPhone, Workphone and FaxNumber columns. The last column returned is the PhoneNumber column. This column is the Value column. The Value column identifies the column heading that will be used for the data values returned from the UNPIVOT operation. In Listing 2 the Value column is the PhoneNumber column. As you can see in the results the PhoneNumber column contains all the different phone number data values.

Now that you have a general understanding of a simple UNPIVOT operation let's move on to a slightly more complicated example.

Using Two UNPIVOT Operations

In the first example my source table contained a single set of columns that I wanted to unpivot. But there are times where you might have a series of name/value pairs that you want to unpivot together. For my second example I'm going to use two UNPIVOT operations to unpivot a set of name/value pair columns.

Suppose you have a denormalized table that contains a series of columns that represent name value pairs that you want to unpivot, like in the CustPerf table I'm creating in Listing 3.

IF object_id('CustPref') IS NOT NULL DROP TABLE CustPref;
GO
CREATE TABLE CustPref(CustID int identity, CustName varchar(20), 
             Pref1Type varchar(20),  Pref1Data varchar(100),
             Pref2Type varchar(20),  Pref2Data varchar(100),
Pref3Type varchar(20),  Pref3Data varchar(100),
 Pref4Type varchar(20),  Pref4Data varchar(100),
 );
GO
INSERT INTO CustPref (CustName, Pref1Type, Pref1Data,
                                Pref2Type, Pref2Data, 
                                Pref3Type, Pref3Data,
                                Pref4Type, Pref4Data)
VALUES 
('David Smith','Pool', 'Yes',
              'Children', 'Yes',
  'Bed', 'King',
  'Pets', 'No'),
('Randy Johnson','Vehicle', 'Convertible',
              'PriceRange', '$$$',
  null, null,
  null, null),
('Dr. John Fluke','Email', 'DrJ@Pain.com',
              'Office Phone', '555-444-9845',
  'Emergency Phone', '555-444-9846',
  null,null);
SELECT * FROM CustPref;
GO

Listing 3: Sample data for using two UNPIVOT operations

In table CustPref. I have 4 sets of name value pairs. The name value pairs are shown in Result2.

Pref1Type Pref1Data    Pref2Type    Pref2Data     Pref3Type      Pref3Data    Pref4Type Pref4Data
--------- ------------ ------------ ------------- -------------- ------------ --------- ---------
Pool      Yes          Children     Yes          Bed             King         Pets      No
Vehicle   Convertible  PriceRange   $$$          NULL            NULL         NULL      NULL
Email     DrJ@Pain.com Office Phone 555-444-9845 Emergency Phone 555-444-9846 NULL      NULL

Result 2: Data in Sample Table

My example will use two different UNPIVOT operations to create a record set where each row will contain the preference type (PrefType) and the preference value (PrefValue) for each Customer (CustID and CustName). The reason multiple UNPIVOT operation are needed is because each UNPIVOT operation is only able to unpivot a single set of a columns. The first UNPIVOT operation will unpivot the PrefData columns (PrefData1, PrefData2, PerfData3, and PrefData4). The second UNPIVOT operation will unpivot the PrefType columns (PrefType1, PrefType2, PrefType3 and PrefType4) . Basically each UNPIVOT operation creates a single column of output based on the columns unpivoted. Since I have two different sets of columns I want to unpivot I need two PIVOT operations. To meet my requirements of displaying customer preferences, one preference name/value pair per row, I will run the code in Listing 4.

SELECT CustId, CustName, PrefType, PrefValue   
FROM    (SELECT CustID, CustName
         , Pref1Type, Pref1Data, Pref2Type, Pref2Data 
 , Pref3Type, Pref3Data, Pref4Type, Pref4Data 
   FROM CustPref) Pref
UNPIVOT
   (PrefValue FOR PrefValues IN 
      (Pref1Data, Pref2Data, Pref3Data, Pref4Data) )AS UP1
UNPIVOT
   (PrefType FOR PrefTypes IN 
      (Pref1Type, Pref2Type, Pref3Type, Pref4Type)
 
) AS UP2 
WHERE 
substring(PrefValues,5,1) = substring(PrefTypes,5,1);
GO

Listing 4. Unpivoting multiple sets of columns.

When I run the UNPIVOT statement in Listing 4 I get the output in Result 3.

CustId      CustName             PrefType             PrefValue
----------- -------------------- -------------------- --------------
1           David Smith          Pool                 Yes
1           David Smith          Children             Yes
1           David Smith          Bed                  King
1           David Smith          Pets                 No
2           Randy Johnson        Vehicle              Convertible
2           Randy Johnson        PriceRange           $$$
3           Dr. John Fluke       Email                DrJ@Pain.com
3           Dr. John Fluke       Office Phone         555-444-9845
3           Dr. John Fluke       Emergency Phone      555-444-9846

Result 3: Output from running Listing 4

By reviewing the output in Result 3 you can see that a different set of PrefType and PrefValue column pairings are displayed for each key value of CustId and CustName code. The query in Listing 4 accomplished this by having two different UNPIVOT operations and then using the WHERE clause to combine the output of the two UNPIVOT operations based on the type and value column names. The first UNPIVOT operation unpivots the preference value columns (Pref1Data, Pref2Data, Pref3Data and Pref4Data) to create the PrefValue column value. Whereas the second UNPIVOT operation unpivots the preference type columns (Pref1Type, Pref2Type, Pref3Type and Pref4Type) values to create the PrefType column value. To combine the output from these two UNPIVOT operations I used a WHERE clause. The WHERE constraint compared the fifth character of PrefTypes column to the fifth character of the PrefValues column. Remember now the PrefTypes column contains the values Pref1Type, Pref2Type, Pref3Type and Pref4type, whereas the PrefValues column contains the values Perf1Data, Perf2Data, Perf3Data and Perf4Data. Therefore by using a WHERE constraint on the 5th character of the PrefTypes and PrefValues columns I was able to get the name and value pairs for each preference.

Dynamically Building an UNPIVOT Query

Suppose you have a table with hundreds of name/value pairings in a series of named column. It would be very tedious to write a TSQL SELECT statement with two UNPIVOT statements for this many name/value pairs. In this example I will show you how to dynamically build an UNPIVOT query based on the table definition I created in Listing 3. The code in Listing 5 contains my code to dynamically produce the same code I typed in manually in Listing 4.

USE tempdb;
GO
DECLARE @ColNames varchar(1000);
SET @ColNames = '';
-- Get PrefValue Columns
SELECT @ColNames=stuff((
    SELECT DISTINCT ',' + QUOTENAME(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS p2
    WHERE TABLE_NAME = 'CustPref'
  AND COLUMN_NAME like 'Pref_Type'
    FOR XML PATH(''), TYPE).value('.', 'varchar(max)')
            ,1,1,'')
-- Get PrefType Columns
DECLARE @ColValues varchar(1000);
SET @ColValues = '';
SELECT @ColValues=stuff((
    SELECT DISTINCT ',' + QUOTENAME(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS p2
    WHERE TABLE_NAME = 'CustPref'
  AND COLUMN_NAME like 'Pref_Data'
    FOR XML PATH(''), TYPE).value('.', 'varchar(max)')
            ,1,1,'')
-- Generate UNPIVOT Statement
DECLARE @CMD nvarchar(2000);
SET @CMD = 'SELECT CustId, CustName, PrefType, PrefValue FROM ' + 
           '(SELECT CustID, CustName, ' + @ColNames + ',' + @ColValues + 
   ' FROM CustPref) AS Perf UNPIVOT (PrefValue FOR PrefValues IN (' +  
   @ColValues + ')) AS UP1 UNPIVOT (PrefType FOR PrefTypes IN (' + 
   @ColNames + ')) AS UP2 WHERE ' + 
   'substring(PrefValues,5,1) = substring(PrefTypes,5,1);'
-- Print UNPIVOT Command
PRINT @CMD
-- Execute UNPIVOT Command
execute sp_executesql @CMD

Listing 5: Dynamic SQL to generate UNPIVOT query

When I run the code in Listing 5 it will first PRINT the dynamic SQL code it generates, and then will execute the code. When the dynamic SQL code is executed it will produce the same results as shown in Result 2.

To accomplish building this dynamic SQL, I used the INFORMATION_SCHEMA.COLUMNS view. This view allowed me to set two variables @ColNames, and @ColValues that contain a string of comma-separated names for the columns being unpivoted. These two variables are then used to build my dynamic UNPIVOT query that is stored in the @CMD variable. Once I've built the dynamic SQL it is first displayed using the PRINT statement, and then executed using sp_executesql.

This was a trivial example that worked for my table that contained just 4 name/value columns. But this same code would also have worked if my table had 100's of different name/value columns.

Summary

The UNPIVOT operation was first introduced in SQL Server 2005. This operation allows you to take multiple name/value columns from a denormalized table to create a normalized record set where each row contains the values for a single set of name/value pair columns. By using multiple UNPIVOT operations you are able to unpivot multiple different sets of name/value pair columns.

Question and Answer

In this section you can review how well you have understood using the UNPIVOT operator by answering the following questions.

Question 1:

The “IN” clause in the UNPIVOT command is used to identify what:

  1. The column headings that need to be unpivoted
  2. The column values that need to be unpivoted
  3. The name of unpivoted column heading
  4. The name of the unpivoted column value.

Question 2:

A UNPIVOT query has the following syntax:

SELECT [columns not unpivoted],
 [unpivot_column],
       [value_column],
FROM
(<source query>)
AS <alias for the source data>
UNPIVOT ( [value_column] FOR [unpivot_column] IN ( <column_list> ) ) 
   AS <alias for unpivot>

When an UNPIVOT query is executed what will be value will be returned for the “value_column”:

  1. The value of columns within the column_list
  2. The value of the unpivot_column
  3. The value of the columns not being unpivoted
  4. None of the above

Question 3:

You have the following denormalized table definition:

CREATE TABLE Question3 (
KeyId INT, 
X1Type INT, X1Value INT, X2Type INT, X2Value INT, 
Y1Type INT, Y1Value INT, Y2Type INT, Y2Value INT, 
Z1Type INT, Z1Value INT, Z2Type INT, Z2Value INT);

You need to produce a record set that normalizes this table into multiple rows of output, where the output has the following column headings: KeyID, XType, YType and ZType. The XType column needs to display the X1Value, or X2Value column values. The YType column needs to display the Y1Value, or Y2Value column values. The ZType column needs to display the Z1Value, or Z2Value column values.

To meet this output criteria how many UNPIVOT operation will you need.

  1. One
  2. Two
  3. Three
  4. Four

Answers:

Question 1:

The correct answer is a. The IN clause in an UNPIVOT operation is used to identify the column heading names for the columns you want to UNPIVOT

Question 2:

The correct answer is a. The “value_column” contains the values of the columns being unpivoted, which are the values of the columns identified in the IN clause.

Question 3:

The correct answer is c. A single UNPIVOT command can only unpivot one set of column. Since the results need to unpivot three different sets of columns you will need three different UNPIVOT operations to meet the requirements of this question.

This article is part of the parent stairway Stairway to Advanced T-SQL

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating