SQLServerCentral Article

Oiling the gears for the data dictionary

,

As an organisation grows it becomes ever harder to keep track of its data, what it is and where it goes and associated metadata that adds context to turn data into information. A data dictionary is a valuable artefact, particularly when trying to integrate disparate data sources into a data warehouse.

Unfortunately by the time an organisation realises the worth of a data dictionary and decides to implement one it is a case of "We know where we want to get to but we wouldn't start from here"! Creating a data dictionary is a documentation task and developers tend to react to such tasks in the same way that slugs react to salt. With that in mind I thought about how I could make an unpleasant task as easy as possible so it becomes merely a minor irritation rather than a major pain point.

How we can use SQL Servers built in properties

Mindy Curnutt's article Data Dictionary from within SQL Server 2000 showed how descriptions can be added to objects from within Enterprise Manager and the same is also true of SQL Management Studio. These are stored in the MS_DESCRIPTION extended property.

There are a number of tools that can scavenge the SQL Server objects and any associated MS_DESCRIPTION properties in order to produce a documentation set in one or more of HTML, CHM, PDF and Word format. Back in August 2009 I wrote Database Documentation Using Red Gate's SQLDoc which describes not only the use of the tool but also how the MS_DESCRIPTION property can be added using the sp_addExtendedProperty and sp_UpdateExtendedProperty system stored procedures.

Making it simple as maintaining records in a table

The code snippet in my article may be easy for a DBA or TSQL enthusiast but it hardly fits in well with someone whose focus isn't on DB development. With that in mind my first thought was "wouldn't it be much easier to maintain the properties as entries in a table". I could then take the contents of the table and use it to populate the extended properties as and when I had time.

So what I need is two tables to hold descriptions for tables and fields. All the developers will have to do is insert or update records into these tables!

DataDictionary_Tables DataDictionary_Fields
IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES 
        WHERE TABLE_NAME='DataDictionary_Tables')
    BEGIN
        CREATE TABLE dbo.DataDictionary_Tables(
            SchemaName sysname NOT NULL,
            TableName sysname NOT NULL,
            TableDescription varchar(7000) NOT NULL
            CONSTRAINT DEF_DataDictionary_TableDescription  DEFAULT (''),
            CONSTRAINT PK_DataDictionary_Tables PRIMARY KEY CLUSTERED (
                SchemaName,
                TableName 
            )
        )
        PRINT 'TABLE CREATED: dbo.DataDictionary_Tables'
    END
GO
IF NOT EXISTS(SELECT * FROM INFORMATION_SCHEMA.TABLES 
        WHERE TABLE_NAME='DataDictionary_Fields')
    BEGIN
        CREATE TABLE dbo.DataDictionary_Fields(
            SchemaName sysname NOT NULL,
            TableName sysname NOT NULL,
            FieldName sysname NOT NULL,
            FieldDescription varchar(7000) NOT NULL
            CONSTRAINT DEF_DataDictionary_FieldDescription  DEFAULT (''),
            CONSTRAINT PK_DataDictionary_Fields PRIMARY KEY CLUSTERED (
                SchemaName ,
                TableName ,
                FieldName 
            )
        )
        PRINT 'TABLE CREATED: dbo.DataDictionary_Fields'
    END
GO

Pre-populating the tables

There are three problems here

  • The developer has to ensure that there are no typos when they insert/update records.
  • They have to determine whether to insert or update records in the first place
  • It is asking the developers to write a repetitive insert/update statement.

What would be really useful would be a stored proc that scraped the schema and populated the tables for the developers!

Of course we have the ANSI standard INFORMATION_SCHEMA views to allow us to do just that

  • INFORMATION_SCHEMA.TABLES
  • INFORMATION_SCHEMA.COLUMNS

It is simply a case of performing a LEFT JOIN between our two new tables and these views to grab the non MS user tables as illustrated in the proc below.

CREATE PROC dbo.PopulateDataDictionary
AS 
    SET NOCOUNT ON
    DECLARE @TableCount INT,
        @FieldCount INT
    INSERT  INTO dbo.DataDictionary_Tables ( SchemaName, TableName )
            SELECT  SRC.TABLE_SCHEMA,
                    TABLE_NAME
            FROM    INFORMATION_SCHEMA.TABLES AS SRC
                    LEFT JOIN dbo.DataDictionary_Tables AS DEST
                        ON SRC.table_Schema = DEST.SchemaName
                           AND SRC.table_name = DEST.TableName
            WHERE   DEST.SchemaName IS NULL
                    AND SRC.table_Type = 'BASE TABLE'
                    AND OBJECTPROPERTY(OBJECT_ID(QUOTENAME(SRC.TABLE_SCHEMA)
                                                 + '.'
                                                 + QUOTENAME(SRC.TABLE_NAME)),
                                       'IsMSShipped') = 0
    SET @TableCount = @@ROWCOUNT
    INSERT  INTO dbo.DataDictionary_Fields
            (
              SchemaName,
              TableName,
              FieldName
            )
            SELECT  C.TABLE_SCHEMA,
                    C.TABLE_NAME,
                    C.COLUMN_NAME
            FROM    INFORMATION_SCHEMA.COLUMNS AS C
                    INNER JOIN dbo.DataDictionary_Tables AS T
                        ON C.TABLE_SCHEMA = T.SchemaName
                           AND C.TABLE_NAME = T.TableName
                    LEFT JOIN dbo.DataDictionary_Fields AS F
                        ON C.TABLE_SCHEMA = F.SchemaName
                           AND C.TABLE_NAME = F.TableName
                           AND C.COLUMN_NAME = F.FieldName
            WHERE   F.SchemaName IS NULL
                    AND OBJECTPROPERTY(OBJECT_ID(QUOTENAME(C.TABLE_SCHEMA)
                                                 + '.'
                                                 + QUOTENAME(C.TABLE_NAME)),
                                       'IsMSShipped') = 0
    SET @FieldCount = @@ROWCOUNT
    RAISERROR ( 'DATA DICTIONARY: %i tables & %i fields added', 10, 1,
        @TableCount, @FieldCount ) WITH NOWAIT
GO

Making updates simpler

OK, so we have a method to pre-populate our tables so the developer only ever has to worry about UPDATE statements but can we make it even simpler? Well, yes we can by creating stored procedures and eliminating the possibility of a malformed WHERE clause. For tables this will be as follows: -

CREATE PROC dbo.UpdateDataDictionaryTable
    @SchemaName sysname = N'dbo',
    @TableName sysname, 
    @TableDescription VARCHAR(7000) = '' 
AS 
    SET NOCOUNT ON
    UPDATE  dbo.DataDictionary_Tables
    SET     TableDescription = ISNULL(@TableDescription, '')
    WHERE   SchemaName = @SchemaName
            AND TableName = @TableName
    RETURN @@ROWCOUNT
GO

And for fields this will be as follows:-

CREATE PROC dbo.UpdateDataDictionaryField
    @SchemaName sysname = N'dbo',
    @TableName sysname, 
    @FieldName sysname, 
    @FieldDescription VARCHAR(7000) = '' 
AS 
    SET NOCOUNT ON
    UPDATE  dbo.DataDictionary_Fields
    SET     FieldDescription = ISNULL(@FieldDescription, '')
    WHERE   SchemaName = @SchemaName
            AND TableName = @TableName
            AND FieldName = @FieldName
    RETURN @@ROWCOUNT
GO

Of course it may be beneficial to use a RAISERROR statement rather than the RETURN @@ROWCOUNT statement to print a suitable message. This is particularly true if you are using some form of logged deployment method or continuous integration.

IF @@ROWCOUNT = 1
    RAISERROR('DOCUMENTED TABLE: %s.%s',10,1,@SchemaName,@TbleName) WITH NOWAIT
ELSE
    RAISERROR('***FAILED TO DOCUMENT TABLE: %s.%s',10,1,@SchemaName,@TbleName) WITH NOWAIT

For details of how to use RAISERROR see Quick Hints for using the RAISERROR Command.

If you want to cause a continuous integration task to fail then simply change the ***FAILED raiserror to use severity level 16 rather than 10. Some agile methods advocate "fail early" which is good if there is buy-in for such discipline but a bit of a pain if you are trying to get your data dictionary process adopted. There is an alternative which I will describe in the next section.

Test driven development for the process

If the maintenance of a data dictionary is mandatory then clearly there should be automated tests to ensure that the data dictionary tasks have been completed. These are as simple as throwing an error if our two data dictionary tables have any blank descriptions in them.

CREATE PROC dbo.TestDataDictionaryTables
AS 
    SET NOCOUNT ON
    DECLARE @TableList TABLE
        (
          SchemaName sysname NOT NULL,
          TableName SYSNAME NOT NULL,
          PRIMARY KEY CLUSTERED ( SchemaName, TableName )
        )
    DECLARE @RecordCount INT
    EXEC dbo.PopulateDataDictionary -- Ensure the dbo.DataDictionary tables are up-to-date.
    INSERT  INTO @TableList ( SchemaName, TableName )
            SELECT  SchemaName,
                    TableName
            FROM    dbo.DataDictionary_Tables
            WHERE   TableName NOT LIKE 'MSp%' -- ???
                    AND TableName NOT LIKE 'sys%' -- Exclude standard system tables.
                    AND TableDescription = ''
    SET @RecordCount = @@ROWCOUNT
    IF @RecordCount > 0 
        BEGIN
            PRINT ''
            PRINT 'The following recordset shows the tables for which data dictionary descriptions are missing'
            PRINT ''
            SELECT  LEFT(SchemaName, 15) AS SchemaName,
                    LEFT(TableName, 30) AS TableName
            FROM    @TableList
            UNION ALL
            SELECT  '',
                    '' -- Used to force a blank line
            RAISERROR ( '%i table(s) lack descriptions', 16, 1, @RecordCount )
                WITH NOWAIT
        END
GO

What we are doing here is simple

  • Run our proc to scavenge the database schema for any new tables and/or fields
  • Store any schema.table names where there is a blank description in a table variable
  • If such records exist list them and throw an error that will break the build.

The test for the database fields is very similar.

CREATE PROC dbo.TestDataDictionaryFields
AS 
    SET NOCOUNT ON
    DECLARE @RecordCount INT
    DECLARE @FieldList TABLE
        (
          SchemaName sysname NOT NULL,
          TableName SYSNAME NOT NULL,
          FieldName sysname NOT NULL,
          PRIMARY KEY CLUSTERED ( SchemaName, TableName, FieldName )
        )
    EXEC dbo.PopulateDataDictionary -- Ensure the dbo.DataDictionary tables are up-to-date.
    INSERT  INTO @FieldList
            (
              SchemaName,
              TableName,
              FieldName
            )
            SELECT  SchemaName,
                    TableName,
                    FieldName
            FROM    dbo.DataDictionary_Fields
            WHERE   TableName NOT LIKE 'MSp%' -- ???
                    AND TableName NOT LIKE 'sys%' -- Exclude standard system tables.
                    AND FieldDescription = ''
    SET @RecordCount = @@ROWCOUNT
    IF @RecordCount > 0 
        BEGIN
            PRINT ''
            PRINT 'The following recordset shows the tables/fields for which data dictionary descriptions are missing'
            PRINT ''
            SELECT  LEFT(SchemaName, 15) AS SchemaName,
                    LEFT(TableName, 30) AS TableName,
                    LEFT(FieldName, 30) AS FieldName
            FROM    @FieldList
            UNION ALL
            SELECT  '',
                    '',
                    '' -- Used to force a blank line
            RAISERROR ( '%i field(s) lack descriptions', 16, 1, @RecordCount )
                WITH NOWAIT
        END
GO

In general we have created everything the developer needs to make the maintenance of a data dictionary as painless as possible.

The complicated bit

There are two further tasks to be achieved

  • Create a stored proc to copy the table/field descriptions from our two data dictionary tables to the MS_DESCRIPTION extended properties.
  • Create stored procs to scavenge any legacy MS_DESCRIPTION entries.

The latter is useful if you have historically maintained extended properties yourself using sp_AddExtendedProperty and sp_UpdateExtendedProperty

A proc to apply our data dictionary

What we really want is something that decides which of the two system procs to call

  • sp_AddExtendedProperty
  • sp_UpdateExtendedProperty

There is a gotcha if you are running a mix of pre and post SQL 2005 boxes and that is the property of the 3rd argument

  • SQL 7 and SQL2000 the argument will always be 'User'
  • SQL2005 onwards the argument will always be 'Schema'

Our first task becomes to determine the version of SQL Server on which our procs are running and set a variable to contain the appropriate value.

DECLARE @SQLVersion VARCHAR(30),
    @SchemaOrUser sysname
SET @SQLVersion = CONVERT(VARCHAR, SERVERPROPERTY('ProductVersion'))
IF CAST(LEFT(@SQLVersion, CHARINDEX('.', @SQLVersion) - 1) AS TINYINT) < 9 
    SET @SchemaOrUser = 'User'
ELSE 
    SET @SchemaOrUser = 'Schema'

This certainly works from SQL7 through to SQL2008R2 and should hopefully work fine on SQL2011 (Denali)

The full proc code is as shown below.

CREATE PROC dbo.ApplyDataDictionary
AS 
    SET NOCOUNT ON
    DECLARE @SQLVersion VARCHAR(30),
        @SchemaOrUser sysname
    SET @SQLVersion = CONVERT(VARCHAR, SERVERPROPERTY('ProductVersion'))
    IF CAST(LEFT(@SQLVersion, CHARINDEX('.', @SQLVersion) - 1) AS TINYINT) < 9 
        SET @SchemaOrUser = 'User'
    ELSE 
        SET @SchemaOrUser = 'Schema'
    DECLARE @SchemaName sysname,
        @TableName sysname,
        @FieldName sysname,
        @ObjectDescription VARCHAR(7000)

    DECLARE csr_dd CURSOR FAST_FORWARD
        FOR SELECT  DT.SchemaName,
                    DT.TableName,
                    DT.TableDescription
            FROM    dbo.DataDictionary_Tables AS DT
                    INNER JOIN INFORMATION_SCHEMA.TABLES AS> T
                        ON DT.SchemaName COLLATE Latin1_General_CI_AS = T.TABLE_SCHEMA COLLATE Latin1_General_CI_AS
                           AND DT.TableName COLLATE Latin1_General_CI_AS = T.TABLE_NAME COLLATE Latin1_General_CI_AS
            WHERE   DT.TableDescription <> ''

    OPEN csr_dd
    FETCH NEXT FROM csr_dd INTO @SchemaName, @TableName, @ObjectDescription
    WHILE @@FETCH_STATUS = 0
        BEGIN
            IF EXISTS ( SELECT  1
                        FROM    ::fn_listextendedproperty((NULL, @SchemaOrUser,
                                                        @SchemaName, 'table',
                                                        @TableName, default,
                                                        default) ) 
                EXECUTE sp_updateextendedproperty N'MS_Description',
                    @ObjectDescription, @SchemaOrUser, @SchemaName, N'table',
                    @TableName, NULL, NULL
            ELSE 
                EXECUTE sp_addextendedproperty N'MS_Description',
                    @ObjectDescription, @SchemaOrUser, @SchemaName, N'table',
                    @TableName, NULL, NULL

            RAISERROR ( 'DOCUMENTED TABLE: %s', 10, 1, @TableName ) WITH NOWAIT
            FETCH NEXT FROM csr_dd INTO @SchemaName, @TableName,
                @ObjectDescription
        END
    CLOSE csr_dd
    DEALLOCATE csr_dd
    DECLARE csr_ddf CURSOR FAST_FORWARD
        FOR SELECT  DT.SchemaName,
                    DT.TableName,
                    DT.FieldName,
                    DT.FieldDescription
            FROM    dbo.DataDictionary_Fields AS DT
                    INNER JOIN INFORMATION_SCHEMA.COLUMNS AS T
                        ON DT.SchemaName COLLATE Latin1_General_CI_AS = T.TABLE_SCHEMA COLLATE Latin1_General_CI_AS
                           AND DT.TableName COLLATE Latin1_General_CI_AS = T.TABLE_NAME COLLATE Latin1_General_CI_AS
                           AND DT.FieldName COLLATE Latin1_General_CI_AS = T.COLUMN_NAME COLLATE Latin1_General_CI_AS
            WHERE   DT.FieldDescription <> ''
    OPEN csr_ddf
    FETCH NEXT FROM csr_ddf INTO @SchemaName, @TableName, @FieldName,
        @ObjectDescription
    WHILE @@FETCH_STATUS = 0
        BEGIN
            IF EXISTS ( SELECT  *
                        FROM    ::fn_listextendedproperty(NULL, @SchemaOrUser,
                                                        @SchemaName, 'table',
                                                        @TableName, 'column',
                                                        @FieldName) ) 
                EXECUTE sp_updateextendedproperty N'MS_Description',
                    @ObjectDescription, @SchemaOrUser, @SchemaName, N'table',
                    @TableName, N'column', @FieldName
            ELSE 
                EXECUTE sp_addextendedproperty N'MS_Description',
                    @ObjectDescription, @SchemaOrUser, @SchemaName, N'table',
                    @TableName, N'column', @FieldName
            RAISERROR ( 'DOCUMENTED FIELD: %s.%s', 10, 1, @TableName,
                @FieldName ) WITH NOWAIT
            FETCH NEXT FROM csr_ddf INTO @SchemaName, @TableName, @FieldName,
                @ObjectDescription
        END
    CLOSE csr_ddf
    DEALLOCATE csr_ddf
GO

There are a few points to note in the code. Unless Jeff Moden knows otherwise I believe this is one of the few cases where row by row processing is justified. In truth I should probably bounce the qualifying records into a temp table and iterate through the temp table rather than risk any cursor locking issues.

I am also using explicit COLLATE statements. There reason for this is that I had to document a legacy database restored from a separate server and ran into collation conflicts. In the ideal world our servers should be on a particular collation however I am not building for the ideal world.

Procs to scavenge any legacy descriptions

These are by far the most complicated stored procs. To scavenge table properties the tasks that are needed are as follows

  • Work out what version of SQL is running
  • Build a table variable and populate it with a list of schemas. The function to retrieve MS_DESCRIPTION will retrieve descriptions for all tables in a single schema but not all tables in all schemas.
  • Create a temporary table to hold the output from the fn_listextendedproperty function.
  • Iterate through the schemas in our table variable grabbing the descriptions and updating our data dictionary table.
  • Drop the temporary table.

The stored procedure to do so is as shown below.

CREATE PROC dbo.ScavengeDataDictionaryTables
AS 
    SET NOCOUNT ON
    IF OBJECT_ID('tempdb..#DataDictionaryTables') IS NOT NULL 
        DROP TABLE #DataDictionaryTables
    DECLARE @SchemaOrUser sysname,
        @SQLVersion VARCHAR(30),
        @SchemaName sysname 
    SET @SQLVersion = CONVERT(VARCHAR, SERVERPROPERTY('ProductVersion'))
    SET @SchemaName = ''
    DECLARE @SchemaList TABLE
        (
          SchemaName sysname NOT NULL
                             PRIMARY KEY CLUSTERED
        )
    INSERT  INTO @SchemaList ( SchemaName )
            SELECT DISTINCT
                    TABLE_SCHEMA
            FROM    INFORMATION_SCHEMA.TABLES
            WHERE   TABLE_TYPE = 'BASE TABLE'
    IF CAST(LEFT(@SQLVersion, CHARINDEX('.', @SQLVersion) - 1) AS TINYINT) < 9 
        SET @SchemaOrUser = 'User'
    ELSE 
        SET @SchemaOrUser = 'Schema'

    CREATE TABLE #DataDictionaryTables
        (
          objtype sysname NOT NULL,
          TableName sysname NOT NULL,
          PropertyName sysname NOT NULL,
          TableDescription VARCHAR(7000) NULL
        )
    WHILE @SchemaName IS NOT NULL
        BEGIN
            TRUNCATE TABLE #DataDictionaryTables

            SELECT  @SchemaName = MIN(SchemaName)
            FROM    @SchemaList
            WHERE   SchemaName > @SchemaName

            IF @SchemaName IS NOT NULL 
                BEGIN
                    RAISERROR ( 'Scavenging schema %s', 10, 1, @SchemaName )
                        WITH NOWAIT
                    INSERT  INTO #DataDictionaryTables
                            (
                              objtype,
                              TableName,
                              PropertyName,
                              TableDescription

                            )
                            SELECT  objtype,
                                    objname,
                                    name,
                                    CONVERT(VARCHAR(7000), value)
                            FROM    ::fn_listextendedproperty(NULL,
                                                            @SchemaOrUser,
                                                            @SchemaName,
                                                            'table', default,
                                                            default, default)
                            WHERE   name = 'MS_DESCRIPTION'
                    UPDATE  DT_DEST
                    SET     DT_DEST.TableDescription = DT_SRC.TableDescription
                    FROM    #DataDictionaryTables AS DT_SRC
                            INNER JOIN dbo.DataDictionary_Tables AS DT_DEST
                                ON DT_SRC.TableName COLLATE Latin1_General_CI_AS = DT_DEST.TableName COLLATE Latin1_General_CI_AS
                    WHERE   DT_DEST.SchemaName COLLATE Latin1_General_CI_AS = @SchemaName COLLATE Latin1_General_CI_AS
                            AND DT_SRC.TableDescription IS NOT NULL
                            AND DT_SRC.TableDescription <> ''
                END
        END
    IF OBJECT_ID('tempdb..#DataDictionaryTables') IS NOT NULL 
        DROP TABLE #DataDictionaryTables
GO

The equivalent proc for scavenging fields is slightly more involved because the ::fn_listextendedproperty( function requires both the schema and table as arguments in order to list all the fields in the table. The proc is shown below

CREATE PROC dbo.ScavengeDataDictionaryFields
AS
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#DataDictionaryFields') IS NOT NULL
     DROP TABLE #DataDictionaryFields
IF OBJECT_ID('tempdb..#TableList') IS NOT NULL
     DROP TABLE #TableList
DECLARE 
    @SchemaOrUser sysname,
    @SQLVersion VARCHAR(30),
    @SchemaName sysname ,
    @TableName sysname
SET @SQLVersion = CONVERT(VARCHAR,SERVERPROPERTY('ProductVersion'))
CREATE TABLE #TableList(SchemaName sysname NOT null,TableName sysname NOT NULL)
INSERT INTO #TableList(SchemaName,TableName)
SELECT TABLE_SCHEMA,TABLE_NAME FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE='BASE TABLE'
IF CAST(LEFT(@SQLVersion,CHARINDEX('.',@SQLVersion)-1) AS TINYINT) <9
    SET @SchemaOrUser = 'User'
ELSE
    SET @SchemaOrUser='Schema'
CREATE TABLE #DataDictionaryFields (
    objtype sysname  NOT NULL,
    FieldName sysname NOT NULL,
    PropertyName sysname NOT NULL,
    FieldDescription VARCHAR(7000) NULL
)
DECLARE csr_dd CURSOR FAST_FORWARD FOR
    SELECT SchemaName,TableName
    FROM #TableList
OPEN csr_dd
FETCH NEXT FROM csr_dd INTO @SchemaName, @TableName
WHILE @@FETCH_STATUS = 0
    BEGIN
        TRUNCATE TABLE #DataDictionaryFields
        RAISERROR('Scavenging schema.table %s.%s',10,1,@SchemaName,@TableName) WITH NOWAIT
    INSERT INTO #DataDictionaryFields
                ( objtype ,
                  FieldName ,
                  PropertyName ,
                  FieldDescription
                )
        SELECT objtype ,
                objname ,
                   name ,
                   CONVERT(VARCHAR(7000),value )
        FROM   ::fn_listextendedproperty( (NULL, @SchemaOrUser, @SchemaName, 'table', @TableName, 'column', default)
        WHERE name='MS_DESCRIPTION'
        UPDATE DT_DEST
        SET DT_DEST.FieldDescription = DT_SRC.FieldDescription
        FROM #DataDictionaryFields AS DT_SRC
            INNER JOIN dbo.DataDictionary_Fields AS DT_DEST
            ON DT_SRC.FieldName COLLATE Latin1_General_CI_AS = DT_DEST.FieldName COLLATE Latin1_General_CI_AS
        WHERE DT_DEST.SchemaName COLLATE Latin1_General_CI_AS = @SchemaNameCOLLATE Latin1_General_CI_AS
        AND DT_DEST.TableName COLLATE Latin1_General_CI_AS = @TableNameCOLLATE Latin1_General_CI_AS
        AND DT_SRC.FieldDescription IS NOT NULL AND DT_SRC.FieldDescription<>''
        FETCH NEXT FROM csr_dd INTO @SchemaName, @TableName
    END
CLOSE csr_dd
DEALLOCATE csr_dd
IF OBJECT_ID('tempdb..#DataDictionaryFields') IS NOT NULL
     DROP TABLE #DataDictionaryFields
IF OBJECT_ID('tempdb..#TableList') IS NOT NULL
     DROP TABLE #TableList
GO

Installing our data dictionary objects

What we have here is a framework for maintaining the data dictionary and this framework should be deployed to all databases that are going to be documented. The framework is reliable and robust so there is a case for installing it within the model DB so it will be present in all databases created on that server.

I have three scripts that I use to deploy the framework in the first place.

Script Purpose
Datadictionary_FrameworkInstall.SQL Uses SQLCMD mode to connect to a server and database then calls the following two scripts
DataDictionaryTables.SQL Builds the two data dictionary tables
DataDictionaryProcs.SQL Builds the stored procs described here.

The Datadictionary_FrameworkInstall.SQL script is very simple.

:CONNECT yourdbserver
:setvar ScriptLocation "C:\Documents and Settings\David\My Documents\SQL Server Management Studio\Projects\Datadictionary\"
go
USE model
GO
-----------------------------------------------------------------------------
PRINT 'DataDictionary_FrameworkInstall.SQL'
PRINT '***************************************'
GO
-----------------------------------------------------------------------------
:r $(ScriptLocation)DataDictionaryTables.SQL
:r $(ScriptLocation)DataDictionaryProcs.SQL

Putting it all together as part of a build process

How a developer uses the framework is really up to them. My interest is that I have made it as simple as possible and having lead the hydrophobic horse to water I choose to watch it drink at a safe distance.

It really depends on whether this is a new build database or maintaining an existing one. If it is an existing one then the first step is to run the Populate and Scavenge stored procs. As developers add objects they may choose to run the populate and update procs at the end of their object creation script or they may choose to have one large script at the end of the build.

The last stored procedures are as follows: -

  • ApplyDataDictionary
  • TestDataDictionaryTables
  • TestDatadictionaryFields

If you have a tool like Red-Gate SQLDoc then you can either have a windows scheduled task to build the documentation set or if it is installed on the continuous integration machine you can even include it as part of the build process.

Personally I favour the latter because a sizable development utilizing test driven development will take considerable time to build and perform all the relevant tests without the overhead of running the actual SQLDoc process.

Final thoughts

I have been an advocate for a decent data dictionary (and for that matter quality systems documentation) ever since I worked in a small company supporting a large number of systems spread over a number of clients. Simple brief intent and purpose comments attached to a database or API makes life a great deal easier. Just because a task isn't popular doesn't mean that it has to be painful. Making a development task easier is always a good move because, like slugs, developers always know where the good beer is.

Rate

4.75 (51)

You rated this post out of 5. Change rating

Share

Share

Rate

4.75 (51)

You rated this post out of 5. Change rating