In SQL Server 2000, Microsoft has closed the gap between its multidimensional database offering, OLAP Services, and the rest of the market place. In fact, Microsoft has made so many significant enhancements that they changed the name to Analysis Services. Although users of OLAP Services 7.0 will find most of the current features in the Analysis Services, all will be pleasantly surprised by the number of new and improved features.
The most dramatic change to Analysis Services is the addition of Data Mining capabilities. Data Mining is a collection of statistical algorithms, developed by Microsoft Research, which are used to discover patterns in large data sets. You can use these patterns to identify, classify or predict such things as buying patterns, demographics or behavior in order to improve customer relationship management and maximize revenue potential.
Microsoft implemented these data mining services using the OLE DB for Data Mining specification (http://www.microsoft.com/data/oledb/dm.htm) to allow other 3rd party vendors and application developers to integrate data mining with external applications. For example, using the OLEDB for DM spec, a developer can integrate data mining into their web site in order to predict buying patterns based on demographic profiles. Although no performance numbers are currently available, Microsoft has been using this technology in their MSN and MSNBC web sites for months.
Two algorithms will be included in the final release, Microsoft Decision Trees and Microsoft Clustering. These algorithms will be able to perform classification, prediction and clustering. Microsoft has also left the door open for the addition of new algorithms in the future to support other mining tasks such as deviation analysis, estimation, and affinity grouping.
Microsoft Decision Trees are used for classification. It constructs one or more decision trees that can be used to predict attributes based on other attributes. For example, classification can be used to classify credit applications as low, medium or high risk or identifying fraudulent behavior based on inputs such as age, sex, income or any other attribute.
Microsoft Clustering is used to group records into clusters that share similar attributes. The difference between clustering and classification is that clustering groups these records together based on self-similarity rather than predefined predictive input.
The new OLAP server includes some very welcome enhancements. These new feature enhancements are meant to bring OLAP Services up to the level of more mature OLAP multidimensional products. These features add to the overall capabilities of the product, improve manageability, provide more granular security and generally make the product more friendly.
Microsoft has added some powerful new features in the cube and dimension design areas. OLAP Services supports parent-child dimensions that can represent structures like an organizational chart. Another new feature that parent-child dimensions can take advantage of is custom rollup formulas. In the organizational chart example, a supervisor’s salary is not the sum of all of their personnel. Using a custom rollup formula, salaries within this organizational structure can be correctly calculated. OLAP Services now supports ragged dimensions or non-standard hierarchies where standard hierarchies don’t apply. For example, in the US, cities rollup into states and states roll up into country. However, in other countries, city may only rollup into country. This structure can now be represented correctly.
Several performance enhancements have been added to OLAP as well. The addition of separate storage modes for large dimensions allows OLAP Services to support 10 Million+ member dimensions. Also, automatic member grouping effectively removes the 7.0 version’s limitation of 64,000 children under a single parent. OLAP Services can also take advantage of SQL Server 2000’s new indexed views feature by using these instead of relational aggregate tables.
However, the biggest performance enhancements are only available using the Enterprise Edition – Linked Cubes and Distributed Partitioned Cubes. These features allow for transparent scale-out architecture for your OLAP servers. You no longer have to add more processors or memory; you can use distributed computing to linearly scale.
OLAP Services also now supports security at the dimension or member level. This allows for a more granular design of security. For example, you can design security roles to only allow users from within a particular division to see metrics for their own division or enterprise, but not other divisions.
OLAP Services has also become much more developer and analyst friendly with the additions of the MDX Builder, Actions, and Drillthrough. MDX can now be built graphically, UI applications can now respond to user activity by performing pre-defined actions such as displaying metadata, and source relational data can be accessed by drilling to the lowest grain of the fact table. OLAP Services also now supports multi-user administration – no more being forced to wait while another user edits a different database on the same server.
OLAP Services also now supports cube and dimension designs such as Distinct Count, Default Members, member property ordering and dimension filters. Distinct Count was difficult to implement and poor performing in 7.0, but has been included as a new aggregate function type. Default members and member property ordering options create a richer user experience by allowing the cube designer to select the member and order that the end user will use most frequently.
These additions, as well as many others, should make Analysis Services the best choice for OLAP and Data Mining within your organization.