Improving cube processing time

As your SSAS cube gets bigger, cube processing time will become a problem. This is especially true as more and more companies want cube processing during the day instead of the usual off-hours time when no one is using the cube. Partitioning the cube can help to reduce the processing time. So can using a different processing strategy than a Process Full.

The biggest benefit of partitioning is that it allows you to process multiple partitions in parallel on a server that has multiple processors. This can greatly reduce the total cube processing time. Note that partitioning requires the Enterprise edition of SQL Server 2008 (view version differences).

Regarding the best possible processing strategy, I suggest the following steps:

1. Process Update all the dimensions that could have had data changed. Depending on the nature of the changes in the dimension table, Process Update can affect dependent partitions. If only new members were added, the partitions are not affected. But if members were deleted or member relationships changed (e.g., a Customer moved from Redmond to Seattle), the flexible aggregations and bitmap indexes on the related partitions are dropped. The cube is still available for queries, albeit with lower performance. This means that after a Process Update you need to do a Process Index on the partitions (see step #3).

2. Process Data the partitions that have changed data (which are usually the most recent partitions). Of course, the smaller the partition, the better, so try to use daily partitions instead of monthly or use monthly partitions instead of yearly.

3. Process Index the rest of the partitions (the ones that have not changed), plus the partitions from step #2, since a Process Data loads the data but does not build aggregations and indexes. Note that instead of doing step #2 and step #3 separately you could just do a Process Full (which does a Process Data and a Process Index behind the scenes). However, the best practice is to do a Process Data and a Process Index separately instead of a Process Full, because it is a bit faster, it reduces the stress on the server, it makes data available to end-users sooner (while a Process Index is happening, users can still query the cube), and you can rerun just the Process Index if Process Data completes but Process Index fails. As another option, instead of doing a Process Index, you can do a Process Default, which evaluates the state of all the partitions: for ones that had a Process Full, it will ignore them; for ones that were not processed, it may or may not touch them (it will if they are related to a dimension that was updated and their aggregations were dropped, in which case it rebuilds those aggregations and indexes via a Process Index but does not reprocess the data); for ones that had a Process Data, it will build out the aggregations and indexes (Process Index).
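To make the three steps above concrete, here is a minimal sketch in Python that builds one XMLA batch per step. It only generates the XMLA text; how you send it to the server is up to you. The database, cube, measure group, dimension, and partition IDs are hypothetical placeholders, and the helper names are mine. Step 3 here runs a Process Index on every listed partition, including the ones loaded in step 2; that is an assumption of the sketch, made because Process Data on its own does not build aggregations or indexes.

```python
import xml.etree.ElementTree as ET

# Namespace used by XMLA processing commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def process_command(object_path, process_type):
    """Build one <Process> element for the object named by object_path."""
    command = ET.Element("Process")
    obj = ET.SubElement(command, "Object")
    for tag, object_id in object_path:
        ET.SubElement(obj, tag).text = object_id
    ET.SubElement(command, "Type").text = process_type
    return command


def parallel_batch(commands):
    """Wrap commands in <Batch><Parallel> so the server may run them concurrently."""
    batch = ET.Element("Batch", {"xmlns": XMLA_NS})
    ET.SubElement(batch, "Parallel").extend(commands)
    return ET.tostring(batch, encoding="unicode")


def partition_path(database, cube, measure_group, partition):
    """Object path that identifies a single partition."""
    return [("DatabaseID", database), ("CubeID", cube),
            ("MeasureGroupID", measure_group), ("PartitionID", partition)]


def build_processing_steps(database, cube, measure_group,
                           dimensions, changed, unchanged):
    """Return three XMLA batches, meant to be executed one after another."""
    # Step 1: Process Update every dimension that may have changed.
    step1 = parallel_batch([
        process_command([("DatabaseID", database), ("DimensionID", dim)],
                        "ProcessUpdate")
        for dim in dimensions])
    # Step 2: Process Data only the partitions whose fact data changed.
    step2 = parallel_batch([
        process_command(partition_path(database, cube, measure_group, p),
                        "ProcessData")
        for p in changed])
    # Step 3: rebuild aggregations and indexes (assumption: on all partitions).
    step3 = parallel_batch([
        process_command(partition_path(database, cube, measure_group, p),
                        "ProcessIndexes")
        for p in changed + unchanged])
    return [step1, step2, step3]


if __name__ == "__main__":
    # All IDs below are made up for illustration.
    for xmla in build_processing_steps("SalesDW", "Sales", "Fact Sales",
                                       ["Dim Customer"],
                                       ["Sales_2024_06"], ["Sales_2024_05"]):
        print(xmla)
```

Keeping the steps as separate batches means that if the third one fails you can resubmit just that batch, which is the main reason given above for not using a Process Full.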

There is one more option to consider, a Process Incremental, which can be used in place of all the above steps. But you can only do a Process Incremental if the partition has only new records (no updates or deletes), because a Process Incremental never deletes or updates existing data; it only adds new data. Internally, a Process Incremental creates a temporary partition, processes it with the target fact data, and then merges it with the existing partition. Process Incremental doesn’t drop aggregations and indexes. Note that a big point of confusion is that when you choose “Process Incremental” in the GUI, it is really translated into a Process Add, and it only works for partitions and dimensions (yet for some reason the Process Incremental option does not show up in the GUI for dimensions - you need to fire it using XMLA). Even though the GUI has the Process Incremental option available for Cube and Measure Group, it scripts those to work only on the partitions.
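The eligibility rule for Process Incremental can be written down as a small decision helper. This is only a sketch of the reasoning in this post: the function name is mine, and the three row counts are assumed to come from something like your ETL audit tables, which is not covered here.

```python
def choose_partition_processing(new_rows, updated_rows, deleted_rows):
    """Pick processing types for one partition from its fact-row changes.

    The counts are hypothetical inputs, e.g. taken from ETL audit data.
    """
    if updated_rows or deleted_rows:
        # Process Add cannot change or remove existing data, so reload the
        # partition and then rebuild its aggregations and indexes.
        return ["ProcessData", "ProcessIndexes"]
    if new_rows:
        # Only inserts: an incremental load (Process Add) is enough.
        return ["ProcessAdd"]
    # No fact changes: Process Default rebuilds anything that was dropped
    # by a dimension Process Update, and otherwise does nothing.
    return ["ProcessDefault"]


if __name__ == "__main__":
    print(choose_partition_processing(new_rows=500, updated_rows=0, deleted_rows=0))
    print(choose_partition_processing(new_rows=500, updated_rows=3, deleted_rows=0))
```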

I prefer to have my cube do a Process Full occasionally to mop up any deletes or updates to measure groups, and to just generally ensure it’s exactly the same as the tables it’s built from.

Another way to increase processing speed is to improve I/O: use more or faster spindles, short-stroke the disks, or use solid-state disks.

Other ways to improve processing speed: add faster CPUs, add more memory, use remote partitions, use a dedicated SSAS server, tune the Processing thread pool (used for allocating worker threads for processing jobs), use jumbo frames, increase the network packet size, and use multiple NICs.

Since a picture is worth a thousand words, here are all the process options depending on the type of object: