SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Dynamically Create Warehouse Dimensions

By Craig Benson,

When looking to shorten the data warehouse development lifecycle, one should always try to employ proven design patterns to save time. This article illustrates a design pattern that I use to create multi-value dimensions from code set (lookup) tables. This design pattern uses an algorithm that takes a record set of lookup codes, i.e. a code set, pivots them into bit columns and then provides all possible bit permutations for them using dynamic SQL. The number of permutations = 2x (where x = the number of codes in the given code set).

For the purposes of this article, only temporary objects will be created, but this method can be adjusted to connect to an actual code value table and create / propagate an actual dimension table (see the attached SQL script for the complete code).

The first step is to create a temporary “flattened” code value table and insert some code values. Normally, there would be a parent code set table with a description of the code set, but we’ll just focus on the code value table. Here, we’ll use an “Item” code set with 10 code values. The 10 codes will yield 1024 bit permutations (210).

DECLARE @CodeValue AS TABLE (CodesetName VARCHAR(20), Code VARCHAR(3), CodeDescription VARCHAR(100));
     
INSERT @CodeValue (CodesetName, Code, CodeDescription)
  SELECT 'Item', 'TST', 'T-shirt'
  UNION ALL                  
  SELECT 'Item', 'SST', 'Sweatshirt'
  UNION ALL                  
  SELECT 'Item', 'BCP', 'Ball cap'
  UNION ALL                  
  SELECT 'Item', 'SCF', 'Scarf'
  UNION ALL                  
  SELECT 'Item', 'CFM', 'Coffee mug'
  UNION ALL                  
  SELECT 'Item', 'MTN', 'Mittens'
  UNION ALL                  
  SELECT 'Item', 'JCK', 'Jacket'
  UNION ALL                  
  SELECT 'Item', 'SKS', 'Socks'
  UNION ALL                  
  SELECT 'Item', 'HWM', 'Hand warmer'
  UNION ALL                  
  SELECT 'Item', 'PPS', 'Pen and pencil set';

Next, we’ll create some formatting variables that will be used within a dynamic SQL statement. We’ll leverage the COALESCE function to build comma-separated lists.

DECLARE @ColumnList   NVARCHAR(MAX),
        @SelectList   NVARCHAR(MAX),
        @JoinList     NVARCHAR(MAX),
        @SQL          NVARCHAR(MAX);                    

SELECT  @ColumnList = COALESCE(@ColumnList+',','') + '[' + Code + ']' FROM @CodeValue;

SELECT  @SelectList = COALESCE(@SelectList + ', ','') + '[' + Code + '] = x' + CASE RowID WHEN 1 THEN '' ELSE CONVERT(VARCHAR(1000), RowID) END + '.x',
        @JoinList   = COALESCE(@JoinList + CHAR(10),'') + CASE RowID WHEN 1 THEN '' ELSE 'CROSS JOIN x AS x' + CONVERT(VARCHAR(1000), RowID) END
  FROM Codes c;                    

Then, we’ll format a tokenized SQL statement and replace the tokens with their corresponding formatted values.

SELECT @SQL = REPLACE(REPLACE(REPLACE(
         'WITH x AS (
                       SELECT x FROM (VALUES(0),(1)) AS y(x)
                ),
                z AS (
                       SELECT @SelectList
                       FROM x
                            @JoinList
                )                                    
         SELECT  ROW_NUMBER() OVER(ORDER BY @ColumnList) - 1 AS RowID,
                 @ColumnList
         FROM z 
         ORDER BY RowID;', '@SelectList', @SelectList), '@JoinList', @JoinList), '@ColumnList', @ColumnList); 

Lastly, we execute the SQL statement to product the record set.

EXEC sp_executesql @SQL;

The first 10 rows of the results set will look like the following. Note that the record set is zero-indexed, thereby adhering to dimension design best-practices. The columns can be subsequently renamed to meet your requirements.

  RowID                TST         SST         BCP         SCF         CFM         MTN         JCK         SKS         HWM         PPS
  -------------------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
  0                    0           0           0           0           0           0           0           0           0           0
  1                    0           0           0           0           0           0           0           0           0           1
  2                    0           0           0           0           0           0           0           0           1           0
  3                    0           0           0           0           0           0           0           0           1           1
  4                    0           0           0           0           0           0           0           1           0           0
  5                    0           0           0           0           0           0           0           1           0           1
  6                    0           0           0           0           0           0           0           1           1           0
  7                    0           0           0           0           0           0           0           1           1           1
  8                    0           0           0           0           0           0           1           0           0           0
  9                    0           0           0           0           0           0           1           0           0           1
  10                   0           0           0           0           0           0           1           0           1           0

Putting all the code together looks as follows:

BEGIN TRY
    SET XACT_ABORT ON;
    SET NOCOUNT OFF;
    DECLARE @CodeValue AS TABLE
    (
        CodesetName VARCHAR(20),
        Code VARCHAR(3),
        CodeDescription VARCHAR(100)
    );
    INSERT @CodeValue
    ( CodesetName, Code, CodeDescription)
    SELECT 'Item', 'TST', 'T-shirt'
    UNION ALL
    SELECT 'Item', 'SST', 'Sweatshirt'
    UNION ALL
    SELECT 'Item', 'BCP', 'Ball cap'
    UNION ALL
    SELECT 'Item', 'SCF', 'Scarf'
    UNION ALL
    SELECT 'Item', 'CFM', 'Coffee mug'
    UNION ALL
    SELECT 'Item', 'MTN', 'Mittens'
    UNION ALL
    SELECT 'Item', 'JCK', 'Jacket'
    UNION ALL
    SELECT 'Item', 'SKS', 'Socks'
    UNION ALL
    SELECT 'Item', 'HWM', 'Hand warmer'
    UNION ALL
    SELECT 'Item', 'PPS', 'Pen and pencil set';

    DECLARE @ColumnList NVARCHAR(MAX),
            @SelectList NVARCHAR(MAX),
            @JoinList NVARCHAR(MAX),
            @SQL NVARCHAR(MAX);
    SELECT @ColumnList = COALESCE(@ColumnList + ',', '') + '[' + Code + ']'
    FROM @CodeValue;
    WITH Codes
    AS (SELECT RowID = ROW_NUMBER() OVER (PARTITION BY CodesetName ORDER BY Code),
               Code
        FROM @CodeValue)
    SELECT @SelectList
        = COALESCE(@SelectList + ', ', '') + '[' + Code + '] = x' + CASE RowID
                                                                        WHEN 1 THEN
                                                                            ''
                                                                        ELSE
                                                                            CONVERT(VARCHAR(1000), RowID)
                                                                    END + '.x',
           @JoinList = COALESCE(@JoinList + CHAR(10), '') + CASE RowID
                                                                WHEN 1 THEN
                                                                    ''
                                                                ELSE
                                                                    'CROSS JOIN x AS x' + CONVERT(VARCHAR(1000), RowID)
                                                            END
    FROM Codes AS c;
    SELECT @SQL
        = REPLACE(
                     REPLACE(
                                REPLACE(
                                           'WITH x AS (
                      SELECT x FROM (VALUES(0),(1)) AS y(x)
                ),
                z AS (
                       SELECT @SelectList
                       FROM x
                             @JoinList
                )                                     
         SELECT  ROW_NUMBER() OVER(ORDER BY @ColumnList) - 1 AS RowID,
                 @ColumnList
         FROM z 
         ORDER BY RowID;',
                                           '@SelectList',
                                           @SelectList
                                       ),
                                '@JoinList',
                                @JoinList
                            ),
                     '@ColumnList',
                     @ColumnList
                 );
    EXEC sp_executesql @SQL;
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();
END CATCH;

Happy warehousing!

 
Total article views: 876 | Views in the last 30 days: 876
 
Related Articles
FORUM

RowID in Sql Server

RowID in Sql Server

FORUM

Add RowID Column to Bulk Insert

Add RowID Column to Bulk Insert

FORUM

working with union all or union

union all vs union

BLOG

Sql Server - Union and Union All

Union and Union All, both are used to select data from one or more than one tables but still they......

FORUM

select rowid?

Hey, i wonder what's the most efficient way to get a "rank" of a record in a table set. An example:...

Tags
codeset    
dimension    
permutation    
warehouse    
 
Contribute