Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Devin Knight

Devin is a BI consultant at Pragmatic Works Consulting. Previously, he has tech edited the book Professional Microsoft SQL Server 2008 Integration Services and was an author in the book Knight's 24-Hour Trainer: Microsoft SQL Server 2008 Integration Services. Devin has spoken at past conferences like PASS and at several SQL Saturday events. He is a contributing member to the Business Intelligence Special Interest Group (SIG) for PASS as a leader in the SSIS Focus Group. Making his home in Jacksonville, FL, Devin is a participating member of the local users’ group (JSSUG).

SSAS Dimension Attribute Discretization Methods

Discretization (one of my favorite tech words to say J) is a really great feature of Analysis Services that is often forgotten.  Basically what discretization does is it can take a dimension attribute and automatically place its members in buckets for you.  For example, if you have a product dimension that has an attribute for price you could create buckets for price instead of listing every distinct value. 

To set this property you select the attribute to configure and open the properties menu.  The properties you’re concerned with are DiscretizationMethod and DiscretizationBucketCount. 

The DiscretizationMethod property controls the way the buckets are created.  The below chart, provided by technet, breaks down what each option does. 

DiscretizationMethod Setting

Description

None

Displays the members.

Automatic

Selects the method that best represents the data: either the EqualAreas method or the Clusters method.

EqualAreas

Tries to divide the members in the attribute into groups that contain an equal number of members.

Clusters

Tries to divide the members in the attribute into groups by sampling the training data, initializing to a number of random points, and running several iterations of the Expectation-Maximization (EM) clustering algorithm.

This method is useful because it works on any distribution curve, but is more expensive in terms of processing time.

The DiscretizationBucketSize controls the number of buckets to be created.  When it’s set to 0 the buckets are automatically created. 

So if this is such a great feature why is it not used more often?  Well like many things it’s all a matter of control.  Using these discretization properties you have some control of how the buckets are created and displayed but you may have a specific bucket size requirement and these properties don’t permit specifications other than the ones highlighted earlier to get the exact bucket size you may require. 

So what alternative do you have?  You can manually create your buckets in the actual table, in a view using a case statement, or in the data source view also with a case statement.  That way you have complete control over the buckets.  This is what I see most commonly done but it’s nice to know you have options if you don’t have specific bucket requirements.

Comments

Posted by walteranez on 2 February 2011

Are there any query performance gains when you use a dimension with discretization for reporting? For instance, if my dimension it's really large (in the millions of rows) containing an attribute that's alphanumeric such as Ticket Number, and users would like to be able to drill down to it.

Posted by shan hasan on 11 February 2011

It really helps in SSAS !!!!

Leave a Comment

Please register or log in to leave a comment.