Blog Post

Star Join Optimization in SQL Server 2008

,

    When working with a dimensionally modeled data warehouse it is common for a large number of your queries to follow a "star pattern". This pattern consist of Fact data being retrieved along with it's related dimension data from a start schema. What makes the Star Schema look like a star is the dimensions rotating around the Fact Table. See a good example of it here. Star queries also usually have most, if not all, of the following characteristics:

  1. They follow a pattern of joining fact tables with their related dimensions
  2. They apply some types of filtering
  3. They usually perform some type of aggregation or additive operation

    A true relational thought process can hinder the performance of this type of query, so in SQL Server 2008 Enterprise Edition (sorry standard edition folks) they have added new optimizations in the query optimizer to significantly improve query performance for queries that retrieve a larger portion of those tables.

    The new additions are based on Bitmap filters, which you may be familiar with if you've played with Bitmap indexing in Analysis Services. Some good info on those can be found here:

    Basically it allows SQL Server to remove non-qualifying fact table rows from any further processing, resulting in significant amount of processing time compared to comparable products. Results of 15%-30%. Some individual queries get a n even greater boost.

    Below is a diagram from MSDN showing the mechanism in action.

    clip_image001

    FROM MSDN:

    " Figure 1: Star join query plan with join reduction processing for efficient DW

    The new star join optimization uses a series of hash joins, building a hash table for each dimension table that participates. As a byproduct of building this hash table, additional information, called a bitmap filter, is built. Bitmap filters are represented as boxes in Figure 1, labeled β€œJoin Reduction Info.” These filters are pushed down into the scan on the fact table, and effectively eliminate almost all the rows that would be eliminated later by the joins. This eliminates the need to spend CPU time later copying the eliminated rows and probing the hash tables for them. The illustration shows the effect of this filtering within the fact table scan. The SQL Server 2008 query executor also re-orders the bitmaps during execution, putting the most selective one first, then the next most selective one, and so forth. This saves more CPU time, because once a fact table row fails a check against a bitmap, the row is skipped. "

    Enjoy this new feature in your data warehousing efforts! Let me know if you have any success stories or challenges so we can work through them together! πŸ™‚

    Don't forget to post your thoughts or email me your questions to ajorgensen@pragmaticworks.com. As always, this Blog is to help you better understand the tools at your disposal ...

You can see more posts like this as well as other great content on my main blog at http://blogs.pragmaticworks.com/Adam_Jorgensen

Rate

β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜…

You rated this post out of 5. Change rating

Share

Share

Rate

β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜… β˜…

You rated this post out of 5. Change rating