In this article I am going to demonstrate how Data Dictionaries, which normally hold key business information slated for BI Applications, can be managed and deployed in Data Warehouse (DW) and Decision Support System (DSS) database applications.
First, I will show how this metadata can be uploaded to a DW or Datamarts as SQL Server Extended Properties and secondly, how it can be propagated from Extended Properties and tables into custom reports on SSMS and SSRS. Finally, and more importantly I will demonstrate how you can import and sync this information with your SSAS environment as properties of Dimensional Attributes and Measures to allow easy display of this information in your cube where they are needed the most.
The Value of Data Dictionaries in DW and DSS Database Applications
Business intelligence (BI) is often defined as an effective system that provides corporations with “one version of the truth”. Very often, the pursuit of complete data integration and analytic reporting in BI endeavors results in key aspects of this truth found in Data Dictionaries either falling through the cracks or simply not being optimized for the purposes they are intended.
I believe that having a correct logical DW model of your business and a Data Dictionary consisting of well-defined, cross-organizational definitions of Dimensional Attributes and Measures in the logical model is the first and major step in achieving this “one version of the truth”. What is prevalent in most BI endeavors is that this major first step is often achieved; however the information in the Data Dictionaries does not make it into the application environments where they are useful to business users and Developers. The information is either kept in files on shared drives, or maintained in applications outside of the actual BI applications’ environments where they are mostly needed.
Data Dictionaries can consist of substantial amount of data especially in Enterprise Data Warehouses (EDW), as a result dedicated effort should be directed towards gathering, updating and maintaining this key business information. This article will only demonstrate how to manage the data and deploy it to the aforementioned SQL Server BI environments.
“Business” Data Dictionary
The term Data Dictionary by definition, may denote anything from separate sets of tables that describe the application tables, to files and applications with such information. A Data Dictionary may contain such information as column names, data types, sizes, and other descriptions. Descriptive attributes that make part of this information may include titles, captions, primary keys, foreign keys, and rules for entering column level data for front end applications. It may represent all or part of an overall meta-data solution.
In this document I will refer only to definitions of Dimensional Attributes and Business Metrics consisting of regular Fact Measures and Calculated Measures in a Data Warehouse or DSS Database application as “Business” Data Dictionary (BDD). In this context, BDDs will be documents or applications consisting of very detailed business definitions and formulas purposely designed as Dictionaries for analytical BI reports and SSAS cubes. I am also using the term to distinguish it from the traditional OLTP systems Dictionaries with physical object definitions and business rules for entering column level data for front end applications.
The figures below show some sample illustrations of data rows in a Business Data Dictionary.
Fig 1. Data row describing a Dimensional Attribute
Fig 2. Data row describing a Regular Measure
Fig 3. Data row describing a Calculated Measure
If you are a BI architect, as a best practice, I believe you should be able to isolate this layer of information from your overall metadata solution at any time.
Loading Dimensional Attributes as SQL Server Extended Properties
In this section I will demonstrate how you can load BDD information as Extended Properties of your DW, Datamart or any DSS SQL Server database.
SQL Server Extended Properties
SQL Server provides access to standard sets of default metadata that are stored in a collection of system views. SQL Server also provides a way to extend these standard metadata, which are generated when you create database objects with your own properties.
There are very good articles about how Extended Properties works (e.g. Easy Extended Properties ), but in this article, I am going to demonstrate how you can use the simple attached SSIS package to load and append the definitions from a BDD spreadsheet as Extended Properties.
For the rest of this article, I am going to use SQL server 2008 Adventure Works Data Warehouse (AW DW) and the 2008 Adventure Works Cube and the files in the attached zipped file for all demonstrations. I will also assume you are following the demonstration initially using this database, cube and files.
Preparing your Business Data Dictionary
In the zipped file is a sample spreadsheet BusinessDataDictionary.xls, which I generated using the non-key columns in the Adventure Works Data Warehouse (AWDW); save the file to location on your local drive. To use the attached SSIS Package in any situation without any modifications, you must format your BDD like the BusinessDataDictionary.xls spreadsheet.
I am going to assume that the sample spreadsheet represents an extraction of a Business Data Dictionary for the AW project. The Dimensional Attributes tab on the spreadsheet actually represents generic definitions I generated using the columns names in the AW DW. The Measures tabs also represent generic definitions I assigned to Regular and Calculated Measure names that I extracted directly from the AW cube.
Loading Your Dictionary as Extended Properties with SSIS Package
We are going to use the attached LoadExtendedProperties.dtsx SSIS package to append the Dimensional Attributes in the BusinessDataDictionary.xls spreadsheet as Extended Properties of the Dimensional Attributes of AW DW. Note that for the Dimensional Attributes, the column names in the spreadsheet matches exactly with the columns in the AW DW. This is important for implementing Extended Properties.
Before you run the package, you must first run the script in Listing1 below, in your DW. The table created by the script will be used to hold the latest version of all the BDD data in the BusinessDataDictionary.xls spreadsheet. We will later build a report directly from this table and also sync Measure definitions from the table with the Description Properties of the Measures in the cube.
Listing 1. Create table script for a table to store BDD
IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[Dictionary]') AND type in (N'U'))
CREATE TABLE [dbo].[Dictionary](
[ObjectType] [nvarchar](128) NULL,
[SchemaName] [sysname] NULL,
[TableName] [sysname] NULL,
[ColumnName] [sysname] NULL,
[Name] [nvarchar](128) NULL,
[PropertyName] [sysname] NULL,
[Definition] [nvarchar](4000) NULL,
) ON [PRIMARY]
Now open the attached LoadExtendedProperties.dtsx SSIS Package, point the DataWarehouse database connection in the Package to your Adventure Works data warehouse and the Dictionary_ExcelConnManager connection to the location of the BusinessDataDictionary.xls spreadsheet on your local drive and execute the package.
When you execute the package it does two things. First, it loads all the BDD data from the BusinessDataDictionary.xls spreadsheet into the table created above. Secondly, it appends the Dimensional Attribute definitions as an Extended Property called BusinessDataDictionary to the DataWarehouse database.
To view the Extended Properties after the execution is completed, go to SSMS, navigate to any dimensional table in the AW DW database, expand the table, right click any non-key column and select Properties from the drop-down list and click on Extended Properties. Your will see a property named BusinessDataDictionary and a value, which represents the Business Definition column from the BDD spreadsheet that we are using in this exercise as shown in figure 4 below.
Fig 4. Showing Extended Property of a Dimensional table column.
After the completion of the package, refresh and open your Adventure Works data warehouse database. Open the new table dbo.dictionary, and you should also see all the dimensional and fact measures from the tabs on the BDD spreadsheet also loaded into this table. You can also check all the External properties appended by the package by running the script in Listing 2 below.
Listing 2. A script to query BDD Extended Properties
s.name as [Schema],
t.name as [Table name],
c.name as [Column Name],
c.name as Name,
ep.name as [Property Name],
ep.value as [Business Definition]
FROM sys.tables as t
INNER JOIN sys.schemas as s
ON s.schema_id = t.schema_id
INNER JOIN sys.columns as c
ON t.object_id = c.object_id
INNER JOIN sys.extended_properties as ep
ON ep.major_id = t.object_id
where ep.name in ('BusinessDataDictionary')
The package works by first uploading the contents of the spreadsheet into the dictionary table. It then extracts the dimensional attributes as recordsets, which are passed to two For Each Loop Containers. Each container executes one of the parameterized system store procedures under Listing 3 below to drop the “BusinessDataDictionary” property if one already exists and then add the property back with the latest values from the dictionary table.
Listing 3. System Store Procedures to drop and add Extended properties
EXEC sp_dropextendedproperty ? ,? ,? ,? ,? ,? ,? ;
EXEC sp_addextendedproperty ? ,? ,? ,? ,? ,? ,? ,? ;
The exact columns passed as parameters to the Store procedure to add the Extended Properties can be obtained from the Execute SQL Tasks in the package. The package can be scheduled to refresh and maintain your BDD as Extended properties of your DW or Datamarts.
Syncing your Dictionary Definitions with your Cube
The trick we are going to employ is to sync the Dictionary definitions from the BDD with the Description property of Dimensional Attributes and Measure in the cube where applicable. This allows the Measures and Attribute definitions and formulas from the Dictionary to show up as tooltips as you browse and hover over the objects in the cube. Let’s first see how this feature works.
Manually Updating Description Property of Cube objects
When you browse and hover over cube objects with the Description Property not populated, as shown in figure 5, the tooltip will show only the names of object, which does not really provide a user with any meaningful information.
Fig 5. Shows browsing the “Average Rate” Measure in the AW cube, when the Description Property is not set or populated.
Similarly if you browse any Dimensional Attribute of a cube with no Description Property populated, only the names of the Attribute and the Hierarchy it belongs to are displayed in the tooltip, there is nothing that tells a user what the Attribute actually mean.
In this section we are going to connect to a cube and do a quick live update of the cube, by manually updating the description Property of some objects. To connect live to the cube, open Visual Studio, go to File -->Open -->Analysis Database…. , enter your Server name, choose the AW Database and select the AW cube (note, if you are using any other cube for this exercise remember, this is live update so make sure it is development cube). In the solution explorer under cubes, double click the AW cube, this opens the Cube structure tab, under Measures, expand the Exchange Rate measure group, right click the first Measure, Average Rate and select Properties. In the Description, type something like
“This is the arithmetic mean of all daily COB rates in a particular month***********”
Formula: (R1 + R2 +……………. + RN)/N “
as shown in Fig 6 below.
Fig 6. Showing a manual entry into the Average Rate Measure Description Property.
Do the same for any Dimensional Attribute by typing anything in the description Property. After that, press the Save All button to update the cube on the server with the changes we just made, you will see a pop up with the message “Updating information on the server”. What this means is we don’t have to reprocess the cube to see the changes we made. Close VS after the updating process completes.
Now reconnect to the cube through Management Studio or press the reconnect button if you still have the AS connection or cube open. Browse the Average Exchange Rate Measure and any other Dimension Attribute you just updated, you should now see the definitions you manually entered in the description property above displayed in the tooltip on the object, as shown below.
Fig 7. Showing the new definition and formula of the Average Rate Measure as a Tooltip in the cube during browsing.
For small cubes it is possible to manually update cube object Description Property with your dictionary definitions, but on some projects, I have dealt with over 2800 critical Dimensional Attributes and Measure definitions, constantly evolving, so in some cases manually updating the cube object Description Property with Dictionary definition is not an option. This where some definitions loaded as Extended Properties comes in handy.
Syncing Extended Properties with Cube Dimensional Attributes Properties
The approach we are going explore to sync Data Dictionary definitions with Dimensional Attributes Description in a cube, require that;
You load your dictionary definitions as Extended Properties of the SQL Server Data warehouse or Mart, whichever is the source of your cube. (To load your Data Dictionary as Extended Properties, please refer to the section: Loading Your Data Dictionary as SQL Server Extended Properties above.)
You have visual Studio (VS) BIDS Helper installed (To install this add-on go to Codeplex at http://bidshelper.codeplex.com/).
With these two conditions met, syncing Extended Properties with your cube is very simple. Let’s do a live sync to update a cube with the Extended Properties we loaded above. Open Visual Studio, go to File -->Open-->Analysis Database…, enter your Server name, choose the AW Database and select the AW cube (remember, you can do this in any AS database project in VS). In the Solution Explorer, Select one or more dimensions and then right click and choose Sync Descriptions...
Fig 8. Showing how to sync dimension and their attribute with Extended Properties.
In the pops up you should see all the extended properties in the AW DW relational database including the BusinessDataDictionary property we loaded above (if you are using any other database make sure that the data source in the project is pointing to the database with the BusinessDataDictionary Extended Property that we loaded above), check the box with the right property, in this case BusinessDataDictionary, and click OK.
Fig 9: Showing the BusinessDataDictionary Extended Property during the sync process.
Now click save all to update the cube on the server, you will see the message “Updating information on the server……”. If you go back and browse the Dimensional Attributes in the cube you should now see the business definitions that were appended as Extended Property “BusinessDataDictionary” appropriately displayed in the tooltip for Dimensional Attributes, tied to a column in a table, or view in the AW DW.
Fig 10. Showing the new Attribute definition as a Tooltip in the cube during browsing.
It is important to note that you will only be able to sync Dimensional Attributes that are tied to a table or view in the data source view (DSV), and not to a named query in the DSV. The definitions from the Extended Properties will be applied to any such attribute with a NameColumn or with only one KeyColumn. You can read more on this feature on Codeplex. Another important factor that leads to the next chapter, is that the Sync Descriptions feature on BIDS helper, that we just utilized above, only operates on Analysis Services Dimensions and not Measures.
The approach I am going to use to sync Measures in the next section could be used for Dimensional Attributes. I chose the option above to sync Dimensional Attributes because it is clean and will always sync directly to the actual column names in the database whether the names are different from the Attribute names used in the cube or not. This option in my opinion is better for the simple fact that Dimensional Attributes normally constitute the bulk of most BDD data.
Syncing your BDD Measures with your Cube Measures.
Unlike Dimensional Attributes, VS does not provide an easy way to sync all the Measure definitions in your Data Dictionary directly with your cube Measures without manually updating the Description Property. I am going to demonstrate one way to sync your Data Dictionary with your Cube Measures by simply configuring and executing the UpdateCubeMeasures_AMO.dtsx SSIS package attached. The package uses Analysis Management Object (AMO) in script task to update the cube.
Before you run the package,
First change the DataWarehouse database connection in the Package to point to the AW DW database or the database with your Data Dictionary table dbo.dictionary refer to the section: Loading Your Data Dictionary as SQL Server Extended Properties above.)
Secondly, change the values of the three variables; ASServer, ASDatabase and ASCube to Your Analysis Services Server, Analysis Services Database and Analysis Services Cube respectively as shown in fig below. Remember to provide the full sever and instance Name if you are using a names instance.
Fig 11. Showing the variables to change in the Package UpdateCubeMeasures_AMO.dtsx.
Now, execute the Package.
After completion reconnect to the cube through Management Studio or press the reconnect button if you still have the AS connection or cube open. Browse and hover over any of the Measures in the cube, you should now see the Dictionary definitions from the dbo.dictionary tables displayed as tooltip on both Regular and calculated Measures as shown below.
Fig 12. Showing a Measure definition and formula as a Tooltip in the cube during browsing.
The UpdateCubeMeasures_AMO.dtsx Package contains a single Script Tasks. In the Task, I use ADO.NET to retrieve and iterate the dictionary definitions from the dbo.Dictionary table in the Data Warehouse and use Analysis Management Object (AMO) to connect to Analysis Services and update the cube objects description as I iterate the cube Objects and their definitions from ADO result set. The C# script in the task is as shown in the listing 4 below.
Listing 4: A C# script that updates cube Measure Description Property with BDD Definitions.
public void Main()
string strSvr = (string)Dts.Variables["User::ASServer"].Value;
string strSvrDB = (string)Dts.Variables["User::ASDatabase"].Value;
string strcube = (string)Dts.Variables["User::ASCube"].Value;
Server svr = new Server();
Database db = new Database();
svr.Connect("Data source=" + strSvr);
//Database db = svr.Databases.FindByName(strSvrDB);
db = svr.Databases[strSvrDB];
Cube cube = db.Cubes.GetByName(strcube);
ConnectionManager adocm = Dts.Connections["DataWarehouse"];
SqlConnection connection = (SqlConnection)adocm.AcquireConnection(Dts.Transaction);
string commandText =
"SELECT DISTINCT ObjectType, SchemaName ,Name ,BusinessDescription "
+" FROM dbo.Dictionary where ObjectType "
+ "in ('Calculated Measure','Regular Measure')";
SqlCommand command = new SqlCommand(commandText, connection);
SqlDataReader rdr = command.ExecuteReader();
if (rdr["ObjectType"].ToString() == "Regular Measure")
MeasureGroup msg = cube.MeasureGroups.GetByName(rdr["SchemaName"].ToString());
Measure ms = msg.Measures.GetByName(rdr["Name"].ToString());
ms.Description = rdr["BusinessDescription"].ToString();
objMdxScript = cube.MdxScripts;
CalculationProperty CalProperty = objMdxScript.CalculationProperties.Find(rdr["Name"].ToString());
CalProperty.Description = rdr["BusinessDescription"].ToString();
Dts.TaskResult = (int)ScriptResults.Success;
I have explored using XMLA script and SQL Server XML xquery as an alternative (which I may present in another forum) but compared to that option, this AMO method is faster, the package setup is simpler and the code is terser.
Deploying your Dictionary as RS Report.
Another easy and useful way to display BDD information is as Reporting Services reports. Besides the out-of the-box Report Manager portal, RS reports can be made available to users on other portals like MS Sharepoint site, your company’s intranet or as custom report on SSMS.
Custom SSMS Reporting Services Reports
I have come to rely on SSMS reports for managing metadata, monitoring Server resource and other BI related activities. For developers and power users this is one area of Data Dictionary deployment I find very useful. I am not going to go into report design details; I am going to focus on how to deploy such a Data Dictionary Report. You can learn more on report design and SSMS reports on Mike Cole's Blog. I am going to demonstrate how to deploy the attached sample Dictionary report file SSMSDataDictionaryRpt.rdl. The report file works with the Dictionary Table we loaded above (refer to the section: Loading Your Data Dictionary as SQL Server Extended Properties above.)
To load the SSMSDataDictionaryRpt.rdl dictionary report file as SSMS Custom Report:
Save the file to a location on the Server or Machine where your DW Database is located.
In Management Studio, navigate to the DW Database, right-click the database node in Object Explorer, point to Reports and click Custom Reports. In the Open File dialog box, locate the folder where you saved the SSMSDataDictionaryRpt.rdl file, and then click the file to launch the report. As shown on fig 13. below.
Fig 13. Showing SSMS dictionary custom report.
After opening the report once, the report will be added to the most recently used list under Custom Reports on the shortcut menu in the database. It is good practice to have a dedicated folder for custom reports, from which you can reopen your reports if they fall off the list.
Some key features that you will note on the report, is that I added MS SS2008 logo and also color formatted the report just to look like the Standard SSMS Reports that ships with SQL Server. SSMS custom reports cannot be parameterized so I provided a drilldown, grouped the dictionary by the Object Type, and sorted by the name of the Object. You can change the formatting and any of these features by adding the RDL file to a VS 2005 Report Server project.
Note again that to modify the report, you must add the RDL file to a VS 2005 project. SQL Server 2005/2008 R2 only supports RDL schema Reports and not RDLC schema report files generated by VS 2008 as the time of writing. In other words if you try to deploy VS 2008 reports as Custom SSMS report it will fail. Another important thing to note is that the Custom SSMS report framework does not provide any parameter escaping for the underlying queries in the report. When writing the queries for these reports you must make sure they are written to prevent SQL injection attacks.
Despites its few limitations these reports are great and easy to deploy. The greatest feature about SSMS reports is that, unlike Regular SSRS Reports you don’t need SSRS Report Server in order to deploy and run them. If the underlying data reside on a SQL Server all you need is an RDL file.
Reporting Services Reports
Unlike the SSMS custom Reports, regular Reporting Services reports need the Report Server to run, however it gives you a lot more flexibility in terms of deployment options. For instance, links to parameterize dictionary reports like the simple sample shown in fig 14 below can be made available to various web based portals like MS SharePoint and also site in your company’s intranet. This flexibility allows it to be tailored in various ways for your target audience.
Fig 14. Showing a simple parameterized RS dictionary report.
How to obtain Cube Measures Name
If you have to obtain the exact Measure Names used in any cube you can follow the example below.
SQL Server 2008 introduced SSAS Dynamic Management Views, which exposes Analysis Services schema rowsets as regular tables, allowing you to write T-SQL style statements to query these views. Note these views are on AS server so you have to still execute these T-SQL style queries in MDX query windows connected to the right AS Database.
For instance the easiest way to obtain the exact Measure Names (like we used in the exercises above) from SSAS 2008 Cube is to query the $system.mdschema_measures DMV as below.
Listing 5: MDX query to obtain Measure names from a Sql Server 2008 Cube.
If you have to retrieve Measure names from a AS 2005 cube, I have provided an alternate method below that uses SQL XML XQuery methods to loop through a cube XMLA file to obtain the measure, but you must first generate the XMLA file.
To generate the XMLA file, navigate to your cube, right click it and go to:
Script Cube as-->Create To --> File….
Note that the resulting XMLA script may contain non ASCII characters. For instance, when you script the AW cube, the file will contain non ASCII characters like û, ê, è, é due to translations in the cube. When using XMLA files, or any other file as SQL Server XML file, always save the XMLA files as Unicode files to avoid errors.
To save the script, Click File-->Save XMLAQuery1.xmla as…. Click the drop down on the save button and select the Save with Encoding… option. In the pop up, click the Encoding drop down and select Unicode -Codepage 1200 option as shown in figure 15 below. This is windows version of UTF-16 Unicode. Now save the file to a location on your local drive where you have access to.
Fig15. Showing the Encoding option Pop up.
Replace the @FileLoc parameter in SQL script under Listing 6 below, with the complete location, file name and extension of the file you just saved. Open MS and execute the SQL script to obtain all the Calculated Measures and Regular Measures in the cube you just scripted. You can then match these names to the name used in your BDD if they are not exectly the same.
Listing 6. SQL script to obtain Measure names from an XMLA file.
DECLARE @sql nvarchar(MAX)
DECLARE @namespace nvarchar(100)
DECLARE @FileLoc nvarchar(1000)
set @FileLoc = 'C:\BI\Data dictionary\AdventureWorksDW2008.xmla'
set @namespace='declare namespace ns ="http://schemas.microsoft.com/analysisservices/2003/engine";'
set @sql= N'
DECLARE @xmla xml
DECLARE @Properties TABLE(
INSERT INTO @Properties(XMLA)
FROM (SELECT *
FROM OPENROWSET (BULK N'''+ @FileLoc +''', SINGLE_BLOB)
Set @xmla = (select XMLA from @Properties);
'''+'Regular Measure'+''' AS [MeasureType],
C.value('''+ @namespace +
'(ns:Name)'''+','+'''varchar(100)''' + ') AS MeasureName,
C.value(''' + @namespace +
'(ns:Description)''' +','+ '''varchar(100)'''+' ) AS [Description],
C.value('''+ @namespace +
'(ns:MeasureExpression)'''+','+ '''varchar(100)''' +') AS MeasureExpression
from @xmla.nodes(''' + @namespace +
'''+'Calculated Measure'+''' AS [MeasureType],
C.value('''+ @namespace +
'(ns:CalculationReference)'''+','+ '''varchar(100)'''+' ) AS MeasureName,
C.value(''' + @namespace +
'(ns:Description)'''+',' +'''varchar(100)'''+' ) AS [Description],
null AS MeasureExpression
from @xmla.nodes(''' + @namespace +
where MeasureName is not null
As a note, you may obtain this information through other means, I chose these options because they very straight forward and appropriate for this forum.
The value of a BDD is intangible and thus easy to overlook. As a sales and marketing analyst for a large financial firm for some time, I understand the cost and the frustration of jumping through hoops to get exact business metrics definition or formulas and attribute definitions in a Global Data Warehouse.
Depending on the nature of your business or the sizes of your DW and cube, a Data Dictionary and the methods described in this document might not even be necessary. On the other hand, for EDWs supporting global businesses with critical financial metrics, some form of BDD may be inevitable. This may be information depended on by Analysts, power users, Developers and management daily. The methods described in this document are some of the ways you can have this information permeate your SQL server BI application environment without the need for a third party application.