Re-oiling the gears for the data dictionary or catalog

,

Back in 2011 I wrote Oiling the gears for the data dictionary It describes a means of simplifying the addition of comments to SQL Server objects used by a number of database documentation tools.  These produce information that is up-to-date, consistently presented and in a consumable format.  These comments can provide the basis for a data dictionary or catalog.  Ultimately you still have to make the effort to supply meaningful descriptions for these tools to harvest.  However, there is a shortcut and I should like to share my approach with you.

Recap and update

Oiling the gears for the data dictionary took the simple approach to hiding the complexity of the extended properties stored procedures, sp_AddExtendedProperty and sp_UpdateExtendedProperty.

  • Build two data dictionary tables to hold table and column descriptions respectively
  • Create a stored procedure to populate those tables from the INFORMATION_SCHEMA catalog views
  • Add two more stored procedures, each to update a specific data dictionary tables
  • Finally, build a stored procedure to take the descriptions from our two tables and apply them as extended properties

I also specified two "test" stored procedures that would highlight any tables or columns with blank descriptions.  However these result in too many failed tests when plugged into a legacy system.  You can only plug these into a build process once the number of fails are low enough for you to have a realistic chance of filling in the missing descriptions.

As of 2019 I would make three changes to the approach I took in 2011.

  • Create my artifacts in an explicit Metadata schema, not dbo
  • Change the table DataDictionary_Tables name and structure to support a wider set of objects. Views are important too!
  • Alter column names in tables, views and stored procedures containing “field” to “column”

What is the problem I am trying to solve?

When you have worked for an organisation for a long time it is easy to forget what it is like to be a new starter.  Especially when that new starter is not only new to the organisation but new to the world of work.

  • How do they acquire the knowledge that you take for granted?
  • Do they understand the various data modelling concepts and how those models can be used to make their lives easier?
  • Is that information written down anywhere or is it word of mouth from person to person?
  • Is your written information accurate and up-to-date?

In addition, increased regulation makes the provision of a data dictionary or catalog an important weapon in the compliance arsenal.

Good quality documentation is something that everyone wants to be available but rarely is, furthermore few have the time or mindset necessary to produce and maintain such information.

This is where database documentation tools such as Redgate SQLDoc come in. Redgate SQLDoc allows you to publish your database  documentation in HTML, PDF, Word or Markdown files. It allows you to chose what artefacts to publish and also what properties should be included.

It also allows you to maintain the MS_DESCRIPTION property on the various SQL Server database objects including the database itself.  Although the product makes it easier to supply the descriptions it does not generate default descriptions for you.

In my experience with a data dictionary or catalog is that you know where you want to get to but you wouldn't start from where you are today!

Can we auto-generate column descriptions?

The data warehouse I work on has tens of thousands of columns across the various tables and views.  When examining the structure of the database there are a number of common columns across a multitude of tables.  For example the majority of tables have some form of DateKey field that can be used with a DimDate dimension.

Even in the AdventureWorksDW2012, the small demo data warehouse database, we can see that DateKey crops up 14 times.

SELECT name AS ColumnName,COUNT(*) AS Occurrences,GROUPING(name) AS IsTotal
FROM AdventureWorksDW2012.sys.all_columns
WHERE name LIKE '%datekey'
GROUP BY name WITH ROLLUP
ColumnName Occurrences IsTotal
DateKey 8 0
DueDateKey 2 0
OrderDateKey 2 0
ShipDateKey 2 0
NULL 14 1

We can use SQL to apply simple rules to generate descriptions and wrap it up in a view.

CREATE VIEW MetaData.DefaultDescription_DateKey
AS
SELECT
    OBJECT_SCHEMA_NAME(AC.object_id) AS SchemaName,
    O.name AS ObjectName,
    O.type AS ObjectType,
    AC.name AS ColumnName,
    CASE
        WHEN AC.name='OrderDateKey' THEN 'Identifies the date on which the customer placed the order.'
        WHEN AC.name='DueDateKey' THEN 'The expected delivery date. This should be within 3 days of the order date. '
        WHEN AC.name='ShipDateKey' THEN 'The delivery date. By subtracting the order date by the delivery date we can identify instances where we are not satisfying the customer.'
        WHEN AC.name='DateKey' THEN 'The date on which the ' + OBJECT_SCHEMA_NAME(AC.object_id)+'.'+O.name + ' record was created. '
        ELSE '' END AS ColumnDescription
FROM AdventureWorksDW2012.sys.all_columns AS AC
    INNER JOIN sys.objects AS O
    ON O.object_id = AC.object_id
WHERE AC.name LIKE '%DateKey'
    AND O.type IN('U','V')
GO

The rules can be as simple or as complex as needed. You could have made the rule take account of whether the OrderDateKey was for an internet sale or an order passed on to a reseller.

The important piece is telling the reader that the column links back to the relevant dimension.

There are two problems with this approach:

  • You can end up with a lot of views
  • Your descriptions are still manually created

Addressing the “too many views” problem

There are two parts to our view to provide DateKey descriptions:

  • The broad WHERE clause to bring back any column ending in DateKey
  • The specific column descriptions in the CASE statement

We can represent this by adding a couple of tables to our MetaData schema as shown below.

DataDictionary_ColumnPatterns will contain records similar to those below:

ColumnPatternID ColumnPattern
1 %DateKey
2 %CustomerKey

DataDictionary_ColumnDefaults will replicate the function of the CASE statement from our earlier query.

  • WHEN values stored in ColumnName
  • THEN value stored in ColumnDescription
ColumnName ColumnPatternId ColumnDescription
DateKey 1 The date on which the {schema.table} record was created.
DueDateKey 1 The expected delivery date. This should be within 3 days of the order date.
OrderDateKey 1 Identifies the date on which the customer placed the order.
ShipDateKey 1 The delivery date. By subtracting the order date by the delivery date we can identify instances where we are not satisfying the customer.
CustomerKey 2 The DimCustomer dimension presents properties of the customer but also those properties that allow us to dice/slice and aggregate such as gender, marital status, commute distance banding.

We can create a view to apply these column descriptions.

CREATE VIEW MetaData.DataDictionary_ColumnDefaultDescriptions
AS
SELECT 
    OBJECT_SCHEMA_NAME(O.object_id) AS SchemaName,
    O.name AS ObjectName,
    REPLACE(O.type_desc, 'USER_', '') AS ObjectType,
    C.name AS ColumnName,
    REPLACE(CD.ColumnDescription, '{schema.table}', OBJECT_SCHEMA_NAME(O.object_id) + '.' + O.name) AS ColumnDescription
FROM sys.columns AS C
    INNER JOIN sys.objects AS O
    ON C.object_id = O.object_id
    INNER JOIN MetaData.DataDictionary_ColumnPatterns AS CP
    ON C.name LIKE CP.ColumnPattern
    INNER JOIN MetaData.DataDictionary_ColumnDefaults AS CD
    ON C.name = CD.ColumnName
WHERE O.type IN ( 'U', 'V' );
GO

Now we have that view we can use it to update our MetaData.DataDictionary_Columns table.

UPDATE DEST
SET DEST.ColumnDescription = SRC.ColumnDescription
FROM MetaData.DataDictionary_Columns AS DEST
    INNER JOIN MetaData.DataDictionary_ColumnDefaultDescriptions AS SRC
    ON SRC.SchemaName = DEST.SchemaName
    AND SRC.ObjectName = DEST.ObjectName
    AND SRC.ColumnName = DEST.ColumnName
WHERE DEST.ColumnDescription = ''

What this has achieved is to turn our column description generator into a data driven mechanism.  This makes the addition of rule based descriptions to our data dictionary or catalog a data change rather than a code change.

Harvesting key information for the data dictionary

We can take this a stage further. Our system tables tell us where we have primary and foreign keys therefore we can use these to generate default column descriptions as well.

Remember databases may be sparse on keys whether by design or by omission. This fact was the inspiration for Auto-suggesting foreign keys and data model archaeology.

Primary Key Descriptions

Redgate SQLDoc emphasizes columns participating in a Primary/Foreign Key with an icon. These include descriptions in the image alt text that can be picked up by screen readers. For those of us whose eyesight isn’t what it was but don’t use a screen reader a simple description will suffice.

CREATE VIEW MetaData.DefaultDescription_PrimaryKey
AS
WITH SingleColumnPK AS (
    SELECT 
        I.object_id,
        I.index_id,
       MAX(LOWER(i.type_desc)) AS IndexType,
       MAX(IC.column_id) AS column_id
    FROM sys.indexes AS I
        INNER JOIN sys.index_columns AS IC
        ON IC.object_id = I.object_id
        AND IC.index_id = I.index_id
    WHERE I.is_primary_key = 1
    GROUP BY I.object_id,I.index_id
    HAVING COUNT(*) = 1
)
SELECT 
    OBJECT_SCHEMA_NAME(C.object_id) AS SchemaName,
    OBJECT_NAME(C.object_id) AS ObjectName,
    'TABLE' AS ObjectType,
    C.name AS ColumnName,
    'The '+ SCPK.IndexType +' primary key that uniquely identifies the ' + OBJECT_SCHEMA_NAME(C.object_id)+'.'+OBJECT_NAME(C.object_id)+' record.' AS ColumnDesccription
FROM sys.columns AS C
    INNER JOIN SingleColumnPK SCPK
    ON SCPK.object_id = C.object_id
    AND SCPK.column_id = C.column_id
GO

Foreign Key Descriptions

Assuming that there are foreign keys in the database then a simple description can be generated for single column foreign keys too.

CREATE VIEW MetaData.DefaultDescription_ForeignKey
AS
WITH SingleColumnFK AS (
    SELECT constraint_object_id,parent_object_id
    FROM sys.foreign_key_columns
    GROUP BY constraint_object_id,parent_object_id
    HAVING COUNT(*) = 1
)
SELECT
    OBJECT_SCHEMA_NAME(FKC.parent_object_id) AS SchemaName,
    OBJECT_NAME(FKC.parent_object_id) AS ObjectName,
    'TABLE' AS ObjectType,
    PC.name AS ColumnName,
    CASE
        WHEN FKC.parent_object_id = FKC.referenced_object_id
            THEN 'Joins back to the ' + PC.name + ' column in this table.'
        ELSE 'Join to '+FC.name + ' column in the ' 
        +OBJECT_SCHEMA_NAME(FKC.referenced_object_id)+'.'+OBJECT_NAME(FKC.referenced_object_id) + ' table.'
        END AS ColumnDescription
    FROM    sys.foreign_key_columns AS FKC
        INNER JOIN SingleColumnFK AS SCFC
        ON SCFC.constraint_object_id = FKC.constraint_object_id
        AND SCFC.parent_object_id = FKC.parent_object_id
        INNER JOIN sys.columns AS FC
        ON FKC.referenced_object_id = FC.object_id
        AND FKC.referenced_column_id = FC.column_id
        INNER JOIN sys.columns AS PC
        ON FKC.parent_object_id = PC.object_id
        AND FKC.parent_column_id = PC.column_id
GO

The join criteria is particularly useful because it adds data lineage to a data dictionary or catalog.

Combining Default Descriptions with Foreign Key Information

We can concatenate the column descriptions from our two views to form an overall ColumnDescription:

  • MetaData.DataDictionary_ColumnDefaultDescriptions
  • MetaData.DefaultDescription_ForeignKey
CREATE VIEW MetaData.DefaultColumnAndFKDescriptions
AS
SELECT
    COALESCE(CDD.SchemaName,DFK.SchemaName) AS SchemaName,
    COALESCE(CDD.ObjectName,DFK.ObjectName) AS ObjectName,
    COALESCE(CDD.ColumnName,DFK.ColumnName) AS ColumnName,
    COALESCE(CDD.ColumnDescription,'') +COALESCE(' '+DFK.ColumnDescription,'') AS ColumnDescription
FROM MetaData.DataDictionary_ColumnDefaultDescriptions AS CDD
    FULL OUTER JOIN MetaData.DefaultDescription_ForeignKey AS DFK
    ON DFK.SchemaName = CDD.SchemaName
    AND DFK.ObjectName = CDD.ObjectName
    AND DFK.ColumnName = CDD.ColumnName
GO

Character fields with a small number of values

Where there is a small reference table or a dimension where a column contains a small number of values, I prefer to list those values in the column documentation. This it to help the reader by reducing the need to jump to other pages to get the desired information.

For us to know when we have a small number of values, we have to profile our data or have some foreknowledge of the columns for which this is true. Data profiling can be an intensive operation so we should consider the nature of the dimensions.

  • Short and wide such as a date or time dimension.
  • Deep and wide such as a customer dimension
  • Deep and narrow such as an email address dimensions

Identifying Profile Candidates

Due to the intensity of profiling I might build a table to enable me to flag those columns I wish to profile.

CREATE TABLE MetaData.ColumnProfileCandidates(
    SchemaName sysname NOT NULL,
    ObjectName sysname NOT NULL,
    ColumnName sysname NOT NULL,
    ApproximateRows BIGINT NOT NULL,
    IncludeForProfiling BIT NOT NULL
        CONSTRAINT DEF_ColumnProfileCandidates_IncludeForProfiling DEFAULT (1),
    CONSTRAINT PK_MetaData_ColumnProfileCandidates PRIMARY KEY (
        SchemaName,
        ObjectName,
        ColumnName
    )
);
GO

In a dimensional data mart, such as Adventureworks2012DW, you could run a query similar to the one below to begin to identify columns that might meet the criteria for having a small list of values.

  • Choose only NCHAR, NVARCHAR, CHAR, VARCHAR columns whose length is less than 100 and excludes xxxxCHAR(MAX)
  • Only look at dimension tables
  • Exclude columns that end in “Key”
  • Provide the estimated table row count for the table
INSERT INTO MetaData.ColumnProfileCandidates(
    SchemaName,
    ObjectName,
    ColumnName,
    ApproximateRows
)
SELECT 
    SRC.SchemaName,
    SRC.ObjectName,
    SRC.ColumnName,
    SRC.ApproximateRows
FROM (
    SELECT 
        OBJECT_SCHEMA_NAME(O.object_id) AS SchemaName,
        O.name AS ObjectName,
        C.name AS ColumnName,
        COALESCE(I.rows, 0) AS ApproximateRows
    FROM sys.columns AS C
        INNER JOIN sys.types AS T
        ON T.user_type_id = C.user_type_id
        INNER JOIN sys.objects AS O
        ON O.object_id = C.object_id
        LEFT JOIN sysindexes AS I
        ON O.object_id = I.id
        AND I.indid IN ( 0, 1 )
    WHERE T.name LIKE '%char%'
         AND O.type = 'U'
         AND O.name LIKE 'Dim%'
         AND C.max_length BETWEEN 1 AND 100
         AND C.name NOT LIKE '%Key'
) AS SRC
    LEFT JOIN MetaData.ColumnProfileCandidates AS DEST
    ON DEST.SchemaName = SRC.SchemaName
    AND DEST.ObjectName = SRC.ObjectName
    AND DEST.ColumnName = SRC.ColumnName
WHERE DEST.SchemaName IS NULL;

You can set the IncludeForProfiling flag to zero for any column you know will contain more values than is useful for generating descriptions.

Recording Profile Information

I want to record the profiling of the columns flagged in MetaData.ColumnProfileCandidates, so I create a table to do so.

CREATE TABLE MetaData.ColumnProfileResults(
    SchemaName sysname NOT NULL,
    ObjectName sysname NOT NULL,
    ColumnName sysname NOT NULL,
    NumberOfValues BIGINT NOT NULL,
    CONSTRAINT PK_MetaData_ColumnProfileResults PRIMARY KEY CLUSTERED (
        SchemaName ASC,
        ObjectName ASC,
        ColumnName ASC
    )
)
GO

Against every qualifying column I want to run a COUNT(DISTINCT {column name}) AS NumberOfValues. To do this I have to generate the SQL command and execute it. I can do this using the stored procedure shown below.

CREATE PROC MetaData.RecordColumnProfiles
AS
SET NOCOUNT ON;
TRUNCATE TABLE MetaData.ColumnProfileResults;
DECLARE @SQLCommandList TABLE(SQLCommand VARCHAR(2000) NOT NULL);
DECLARE @SQLCommand VARCHAR(2000) = '';
INSERT INTO @SQLCommandList(SQLCommand)
SELECT 'INSERT INTO MetaData.ColumnProfileResults SELECT ''' + SchemaName + ''' AS SchemaName,''' + ObjectName
    + ''' AS ObjectName,''' + ColumnName + ''' AS ColumnName,COUNT(DISTINCT ' + QUOTENAME(ColumnName)
    + ') AS Occurrences FROM ' + QUOTENAME(SchemaName) + '.' + QUOTENAME(ObjectName) AS SQLCommand
FROM MetaData.ColumnProfileCandidates
WHERE IncludeForProfiling = 1;
WHILE @SQLCommand IS NOT NULL
    BEGIN
        SELECT @SQLCommand = MIN(SQLCommand)
        FROM @SQLCommandList
        WHERE SQLCommand > @SQLCommand;
        IF @SQLCommand IS NOT NULL
            BEGIN
                EXEC (@SQLCommand);
                PRINT @SQLCommand;
            END;
    END;
GO

Generating the Column Descriptions for the data dictionary or catalog

As there are twelve months in the year I will generate descriptions for any character column that holds up to twelve values.

SELECT *
FROM MetaData.ColumnProfileResults
WHERE NumberOfValues<=12;

I want to apply descriptions to my MetaData.DataDictionary_Columns table where any existing ColumnDescription is a zero length string. I will need a staging table the same structure as my target table.

CREATE TABLE MetaData.DataDictionary_StagingColumns(
    SchemaName sysname NOT NULL,
    ObjectName sysname NOT NULL,
    ColumnName sysname NOT NULL,
    ColumnDescription VARCHAR(7000) NOT NULL,
    CONSTRAINT PK_DataDictionary_StagingColumns PRIMARY KEY CLUSTERED(
        SchemaName ASC,
        ObjectName ASC,
        ColumnName ASC
    )
)
GO

Into this table I want to substitute the schema, table and column names into the query below, execute that query and put the results in my staging table.

SELECT STUFF
(
    (
        SELECT DISTINCT CHAR(13) + CHAR(10) + '* ' + {ColumnName}
        FROM {SchemaName.TableName}
        WHERE {ColumnName} IS NOT NULL
        FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'),1,2,''
);
GO

To do this we can use the same technique that we used to record the profiling information.

CREATE PROC MetaData.RecordColumnStagingDescriptions
AS
SET NOCOUNT ON;
TRUNCATE TABLE MetaData.DataDictionary_StagingColumns
DECLARE @SQLCommandList TABLE(SQLCommand VARCHAR(2000) NOT NULL);
DECLARE @SQLCommand VARCHAR(2000) = '';
INSERT INTO @SQLCommandList(SQLCommand)
SELECT 'INSERT INTO MetaData.DataDictionary_StagingColumns SELECT ''' + SchemaName + ''' AS SchemaName,''' + ObjectName
    + ''' AS ObjectName,''' + ColumnName + ''' AS ColumnName,REPLACE(STUFF((SELECT DISTINCT CHAR(13) + CHAR(10) + CHAR(42) + CHAR(160) + '
    + QUOTENAME(ColumnName)
    + ' FROM '
    + QUOTENAME(SchemaName) + '.' + QUOTENAME(ObjectName)
    + ' WHERE ' + QUOTENAME(ColumnName) + ' IS NOT NULL FOR XML PATH(''''), TYPE).value(''.'',''NVARCHAR(MAX)''),1,2,''''),CHAR(39),CHAR(39)+CHAR(39));' AS SQLCommand
FROM MetaData.ColumnProfileResults
WHERE NumberOfValues<=12;
WHILE @SQLCommand IS NOT NULL
    BEGIN
        SELECT @SQLCommand = MIN(SQLCommand)
        FROM @SQLCommandList
        WHERE SQLCommand > @SQLCommand;
        IF @SQLCommand IS NOT NULL
            BEGIN
                EXEC (@SQLCommand);
                PRINT @SQLCommand;
            END;
    END;
GO

Once the stored procedure is created and executed, we can run an UPDATE statement similar to the one we ran earlier.

UPDATE DEST
SET DEST.ColumnDescription = SRC.ColumnDescription
FROM MetaData.DataDictionary_Columns AS DEST
    INNER JOIN MetaData.DataDictionary_StagingColumns AS SRC
    ON SRC.SchemaName = DEST.SchemaName
    AND SRC.ObjectName = DEST.ObjectName
    AND SRC.ColumnName = DEST.ColumnName
WHERE DEST.ColumnDescription = ''

We can run the MetaData.ApplyDataDictionary stored procedure to attach our descriptions to the object extended properties.

Examining our results in Redgate SQLDoc

We can use Redgate SQLDoc to generate our documentation.  The two screen shots below give examples of what the generated column descriptions look like.

Example dimension table once column descriptions have been generated for the data dictionary or catalog

Example fact table once default column descriptions have been generated for the data dictionary or catalog

Possible enhancements for foreign key descriptions

Our foreign key descriptions apply one of two rules. One for when the foreign key references the same table, the other for when it references a different table. The generated descriptions for a foreign key will be

  • Joins back to the {referenced column} column in this table
  • Join to {referenced column} column in the {schema.table}

That is good enough for basic information and with the inclusion of the default column description can be a rich source of information.  What if we wanted to add the MS_DESCRIPTION property from the referenced table?  This would be simple enough to do although I would choose to have an additional TABLE_SUMMARY property. This would be explicitly to hold a brief description of what the referenced table allowed the analyst to do.  It would not constraint us in what we wanted to put in MS_DESCRIPTION.

Worked example using DimDate as the referenced table

Let us suppose that dbo.DimDate has a TABLE_SUMMARY property.  The value of that property is “The date dimension provides a mechanism to dice/slice and aggregate facts by various date parts thereby eliminating the majority of date calculations.”

We could harvest the TABLE_SUMMARY for the referenced table so that the description in the fact becomes a concatenation of three things:

  • Default description for a column with a specific name
  • Description of the join
  • TABLE_SUMMARY description

Our view to retrieve foreign key descriptions would change to the following.

CREATE VIEW MetaData.DefaultDescription_ForeignKey
AS
WITH SingleColumnFK AS (
    SELECT constraint_object_id,parent_object_id
    FROM sys.foreign_key_columns
    GROUP BY constraint_object_id,parent_object_id
    HAVING COUNT(*) = 1
)
SELECT
    OBJECT_SCHEMA_NAME(FKC.parent_object_id) AS SchemaName,
    OBJECT_NAME(FKC.parent_object_id) AS ObjectName,
    'TABLE' AS ObjectType,
    PC.name AS ColumnName,
    CASE
        WHEN FKC.parent_object_id = FKC.referenced_object_id
            THEN 'Joins back to the ' + PC.name + ' column in this table.'
            ELSE 'Join to '+FC.name + ' column in the ' 
                + OBJECT_SCHEMA_NAME(FKC.referenced_object_id)+'.'+OBJECT_NAME(FKC.referenced_object_id) + ' table.'+ COALESCE(' ' + CAST(EP.value AS VARCHAR(MAX)),'')
           END AS ColumnDescription
FROM sys.foreign_key_columns AS FKC
    INNER JOIN SingleColumnFK AS SCFC
    ON SCFC.constraint_object_id = FKC.constraint_object_id
    AND SCFC.parent_object_id = FKC.parent_object_id
    INNER JOIN sys.columns AS FC
    ON FKC.referenced_object_id = FC.object_id
    AND FKC.referenced_column_id = FC.column_id
    INNER JOIN sys.columns AS PC
    ON FKC.parent_object_id = PC.object_id
    AND FKC.parent_column_id = PC.column_id
    LEFT JOIN sys.extended_properties AS EP
    ON FKC.referenced_object_id = EP.major_id
    AND EP.minor_id=0
    AND EP.name='TABLE_SUMMARY'
GO

The combination of techniques discussed so far can provide rich information for our data dictionary or catalog with remarkably little effort. Think carefully about how we structure and phrase the table descriptions with reuse in mind.  Certainly, in a dimensional model it could reduce the amount of manual effort by more than half.

Potential objections to the auto-generation technique

The beauty of a relational database is that the implementation of Codd's rule 4 means that the object metadata is usefully descriptive of the objects within the database.  It is easy to generate a useful, human readable description of columns participating in a primary or foreign key relationship.

Arguments against are likely to be that for the operational database:

  • It does not utilise foreign keys
  • It must not be used for non-core business use

Fortunately, users of Redgate tools have mechanisms for addressing these.

Possible architecture for generating and maintaining descriptive comments for the data dictionary or catalog

The approach you could take is illustrated in the diagram above.

  1. Take a copy of the data mart from source control
  2. Apply any absent keys
  3. Generate descriptions
  4. Use SQLDoc to amend or add any other descriptions
  5. Use SQL Compare to generate change scripts to apply the extended properties
  6. Apply to the data mart and commit to source control

Concluding thoughts

My experience when it comes to documenting a database is that the size of the effort is always a barrier to doing it.  Consequently opposition to providing documentation is always quite vocal. From a documentation perspective you can gain substantial coverage of the database using the techniques described above. Hopefully this will reduce the decibels emitted by the naysayers.

Devising rules for retrieving column information and generating descriptions is an art.  Certainly writing reusable table descriptions that can be concatenated with generated foreign keys descriptions is too.  The goal is to gain a critical mass of documentation where adding further documentation becomes a matter of adopting development practises.

  • Append descriptions when you create objects
  • Fill in descriptions that are blank when you amend objects
  • Make tests for empty descriptions part of your build mechanism
  • Ensure that any failing tests break your build

The payback is a source of information about our databases that is educational and information.  An ideal start for a data dictionary or catalog.

 

Resources

Rate

4.67 (3)

Share

Share

Rate

4.67 (3)