SQLServerCentral Article

Building Business Intelligence Data Warehouses

,

Introduction

Business intelligence can improve corporate performance in any information-intensive industry. With applications like target marketing, customer profiling, and product or service usage analysis, businesses can finally use their customer information as a competitive asset. They can enhance their customer and supplier relationships, improve the profitability of products and services, create worthwhile new offerings, better manage risk, and reduce expenses dramatically. In order to capture the power of business intelligence, a proper data warehouse needs to be built. A data warehouse project needs to avoid common project pitfalls, be business driven, and deliver proper functionality.

Data Warehouse Projects

A data warehouse project is not immune to the pitfalls associated with any technology development project. Common mistakes can either immediately kill or severely cripple the project as it progresses through development. A data warehouse project typically is a consolidated solution between multiple systems that encourages data stewardship and advanced analytics form existing systems. However, it is possible to get trapped by the illusion that consolidating data will fix a problem or automatically answer a question. One mistake in a data warehouse project is the failure to define the business reason for the project. Are there problems with existing systems? Does the data need to be aggregated? Does one need point-in-time reporting?  Critical questions need to be addressed before entering a project. Another common mistake is the assumption a data warehouse load, often referred to as ETL (extract, transform, load) will fix source data. Although an ETL can be used to scrub existing source data, its general purpose is not to fix it. Dirty source data will contain missing, erroneous, or inconsistent data. “This is often not a problem for OLTP systems, but can be disastrous for decision support systems.” (Mimno pg 4)  Further, a data warehouse should not be considered the quick fix. A data warehouse should be viewed as part of an overall solution of data storage and reporting needs. A common mistake is the “failure to anticipate scalability and performance issues.” (Mimno pg4) A data warehouse needs proper architecture, application configuration that would include RAID configuration and data normalization. RAID configuration constitutes how the data is placed across the disks of the server. In addition, it defines the performance and data redundancy of data stored on the server. Data normalization is a term that refers to data analysis technique that organizes data attributes such that they are grouped to form non-redundant, stable, flexible, and adaptive

entities.

Business Driven Project

Most companies enter a data warehouse project because they are experiencing problems with performance and integration from their source systems. The project should obtain proper executive sponsorship to help guide the project. A good solution is to have a proper business case approved and form a steering committee to gather requirements and guide the project from start to finish. In addition, users experience slower query response time coupled with the inflexible architecture form different systems. It is typical to see organizations have different departments start their own data warehouse project to solve their individual need. This starts the “stovepipe” problem common with data warehouse projects. The term stovepipe refers to the problem seen when common data is taken by different groups that don’t apply the same business rules and definitions. What happens is the same data is isolated between internal projects and produces a different result. It is common to see the Human Resources department come up with a different company headcount number than Finance. In addition to fractured data mart projects, data warehouses can lack the ability to properly change as users requests change. Without a proper development lifecycle, changes are not well managed and become over bearing to the project. The data warehouse is built for a problem now and does not focus on the organizations end goal. It soon becomes inflexible and unreliable as data can become corrupted with erroneous changes. Finally, a data warehouse project needs to meet the need of the business. The project should be geared toward answering a focused problem. Whether the business need is reporting, consolidation of source data, or archived storage. Mimno defines business driven as “define the business requirements first; next specify architecture; and finally select tools.” 

A Business Intelligence data warehouse

Business Intelligence provides a single, clean, and consistent source of data for decision-making. It allows executives and business analysts to perform financial analysis, variance analysis, budgeting analysis, and other types of reporting. It also allows the discovery of patterns and trends in data directly built from every day business actions. The primary goal of Business Intelligence is to increase productivity, reduce time, and reduce cost by establishing a single integrated source of information.

Data warehouses exist to facilitate complex, data-intensive, and frequent ad hoc queries. Accordingly, data warehouses must provide far greater and more efficient query support than is demanded of transactional databases. The data warehouse access component supports enhanced spreadsheet functionality, efficient query processing, structured queries, ad hoc queries, data mining, and materialized views. In particular, enhanced spreadsheet functionality includes support for state-of-the-art spreadsheet applications (e.g., MS Excel) as well as for OLAP applications programs. These offer preprogrammed functionalities such as the following:

  1. Roll-up: Data is summarized with increasing generalization (e.g., weekly to quarterly to annually).
  2. Drill-down: Increasing levels of detail are revealed (the complement of roll-up).
  3. Pivot: Cross tabulation (also referred as rotation) is performed.
  4. Slice and dice: Performing projection operations on the dimensions.
  5. Sorting: Data is sorted by ordinal value.
  6. Selection: Data is available by value or range.
  7. Derived (computed) attributes: Attributes are computed by operations on stored and derived values.

“The classic approach to BI system implementation, users and technicians construct a data warehouse (DW) that feeds data into functional data marts and/or "cubes" of data for query and analysis by various BI tools. The functional data marts represent business domains such as marketing, finance, production, planning, etc. At a conceptual level, the logical architecture of the DW attempts to model the structure of the external business environment that it represents.” (Kurtyka)

 

The business intelligence data warehouse allows users to access information with a reporting tool to execute business planning and analysis.  The proper BI data warehouse will deliver a cyclical “process of data acquisition (capture), analysis, decision making and action.”

(Kurtyka) 

Conclusion

A data warehouse can deliver a consolidated reporting source with increased flexibility and performance. However, in order to deliver a robust BI data warehouse solution some guiding principles need to be followed to deliver a successful product.  First, the data warehouse project should encompass company directives in order to avoid individual projects that lead to stovepipe data marts. In addition, the ETL program that loads the data warehouse should not be used as a toll to fix source data. Finally, the data warehouse solution should be architected to provide superior performance and data normalization. Second, the warehouse solution should be business driven. The project should provide a solution for a specific problem. A proper business case should be approved and the project controlled by a steering committee. Third, the BI data warehouse should be built to increase productivity, increase data flexibility, and reduce company cost by integrating multiple sources of information.

References

Rate

3 (3)

You rated this post out of 5. Change rating

Share

Share

Rate

3 (3)

You rated this post out of 5. Change rating