Blog Post

Insert Unknown Row to Dimension Table

,

When building a warehouse there is a common practice of adding an Unknown row to the Dimension tables.  For me, this is a monotonous task that I hate doing.  I was recently on a project that had a large number of dimensions and I refused to write a T-SQL Statement for each dimension.  Instead I wrote a T-SQL script that generates a T-SQL script that generates or executes the INSERT statement for me.  Here is the script.

create proc dbo.UnknownRow 
      @TableName sysname,
      @TableSchema sysname = 'dbo',
      @Action varchar(10) = 'Script'
as
declare
      @ColumnListing varchar(max) = '',
      @ValuesList varchar(max) = '',
      @query varchar(max)=' '

set @query = @query+ 'IF NOT EXISTS (SELECT * FROM ['+@TableSchema+'].['+@TableName+'] WHERE '+substring(@tablename,4,len(@tablename))+'SK= -1)'
set @query = @query+ ' begin'
set @query = @query+ ' set identity_insert ['+@TableSchema+'].['+@TableName+'] ON'
declare @insert varchar(max) = ' INSERT INTO ['+@TableSchema+'].['+@TableName+']'

SELECT @ColumnListing = @ColumnListing+'['+Column_Name+']'+','
      FROM  INFORMATION_SCHEMA.COLUMNS c
      INNER JOIN SYSOBJECTS o
      ON c.TABLE_NAME = o.name
      INNER JOIN sys.schemas s
      ON o.uid = s.schema_id
      LEFT JOIN sys.all_columns c2
      ON o.id = c2.object_id
      AND c.COLUMN_NAME = c2.name

WHERE
      c.TABLE_NAME = @TableName
      AND c.TABLE_SCHEMA = @TableSchema
      AND c2.is_computed = 0
      AND c.TABLE_SCHEMA = s.name
ORDER BY c.ORDINAL_POSITION
set @ColumnListing = SUBSTRING(@ColumnListing,0, len(@columnlisting))

set @insert = @insert+'('+ @columnlisting+')'
set @query = @query+@insert
SELECT
      @ValuesList = @ValuesList+
      CASE
            WHEN DATA_TYPE IN ('INT', 'NUMERIC') AND c.COLUMN_NAME NOT LIKE '%DateSK' THEN  '-1'
            WHEN DATA_TYPE IN ('DECIMAL') THEN  '-1'
            WHEN DATA_TYPE IN ('VARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH = 1 THEN '''U'''
            WHEN DATA_TYPE IN ('VARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH = 2 THEN '''Un'''
            WHEN DATA_TYPE IN ('VARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH BETWEEN 3 AND 7 THEN '''Unk'''
            WHEN DATA_TYPE IN ('VARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH > 7 THEN '''Unknown'''
            WHEN DATA_TYPE IN ('NVARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH = 1 THEN '''U'''
            WHEN DATA_TYPE IN ('NVARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH = 2 THEN '''Un'''
            WHEN DATA_TYPE IN ('NVARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH BETWEEN 3 AND 7 THEN '''Unk'''
            WHEN DATA_TYPE IN ('NVARCHAR', 'CHAR') AND CHARACTER_MAXIMUM_LENGTH > 7 THEN '''Unknown'''
            WHEN DATA_TYPE IN ('INT') AND c.COLUMN_NAME like '%DateSK' THEN '19000101'
            WHEN DATA_TYPE IN ('DateTime') THEN '''1900-01-01'''
            WHEN DATA_TYPE IN ('Date') THEN '''1900-01-01'''
            WHEN DATA_TYPE IN ('TINYINT') THEN  '0'
            WHEN DATA_TYPE IN ('FLOAT') THEN  '0'
            WHEN DATA_TYPE IN ('BIT') THEN  '0'
            ELSE ''''+DATA_TYPE+''''
      END+','
      FROM  INFORMATION_SCHEMA.COLUMNS c
      INNER JOIN SYSOBJECTS o
      ON c.TABLE_NAME = o.name
      INNER JOIN sys.schemas s
      ON o.uid = s.schema_id
      LEFT JOIN sys.all_columns c2
      ON o.id = c2.object_id
      AND c.COLUMN_NAME = c2.name

WHERE
      c.TABLE_NAME = @TableName
      AND c.TABLE_SCHEMA = @TableSchema
      AND c2.is_computed = 0
      AND c.TABLE_SCHEMA = s.name
ORDER BY c.ORDINAL_POSITION

set @query = @query+ ' VALUES('+substring(@ValuesList,0,LEN(@valueslist))+')'
set @query = @query+ ' set identity_insert ['+@TableSchema+'].['+@TableName+'] OFF'
set @query = @query+ ' end'

if(@Action = 'Script')
begin
      print @query
end
else
begin
      exec (@query)
end

The stored procedure in the above has one required parameter the TableName and two optional parameters SchemaName and Action.  The TableName and SchemaName are obvious.  If you accept the default for Action it generates the script, but if you pass anything else it will execute the generated script. Either way, the generated script is an insert statement that adds the unknown row to the specified dimension.  This script has been worked on by a few of the guys on our team.  We have not exhausted all of the possible SQL Server data types.  If you modify the script please share it with us.

Talk to you soon,

Patrick LeBlanc

Founder www.TSQLScripts.com and www.SQLLunch.com.

Visit www.BIDN.com, Bring Business Intelligence to your company.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating