Multiple Calendars Using Kimball Outrigger Technique

  • Has anyone successfully used Kimball's outrigger technique for multiple corporate calendars in SSAS?  If so, can you share the technical details of how this was accomplished?  I've been trying to implement this multiple ways now but can't get the dimension to aggregate correctly.  I could simply add more columns to the original date table, but that would end up being a lot of columns.  I could also set up the date as a many to many.  The issue with this as that the main fact tables are accumulating snapshots and I would be doing this with several dates.  The Kimball Reader only has a short blurb on this, but they claim to have done this many times.  The only caution that they have is that you have to in the end restrict the join to only one row per date.  But I do that by making he calendar portion not aggregatable and setting a default.  I'm about to give up on this for now unless someone can tell me that this is really possible.  Even this design raises questions, but the first priority is to just get this to work, if it's possible.  Thanks,

  • RonKyle - Wednesday, September 13, 2017 1:36 PM

    The only caution that they have is that you have to in the end restrict the join to only one row per date.  But I do that by making he calendar portion not aggregatable and setting a default.

    Are you using the CalendarName column in DimCalendar to filter the results to give the 1:1 relationship to DimDate?  I'm assuming based on Kimball's example that the DimCalendar has multiple different calendars for the same DateKey.  I haven't used this specific example with SSAS but I have used DimDate as an outrigger to other dimensions.

  • Are you using the CalendarName column in DimCalendar to filter the results to give the 1:1 relationship to DimDate?

    Beyond setting the property as non-aggregatable and setting a default, how else would you do this?  I don't rule out that I'm missing something on this point, but I can't imagine what else there would be to do.

  • I would assume it's pretty straight forward like a snowflake schema, but this is SSAS afterall. I don't work with it that much.

    In most of my cases, I try not to let the business units expand on the date dimension to a point of needing this. I do however support multiple fiscal calendars, but that's easily solved if they are all using the same fields, but a key to filter their fiscal based on the business unit in the date dimension itself.

    I take it you are trying to support various requirements like Week versus FiscalWeek versus HolidayWeek and so on and so where HolidayWeek may not apply to all units, but maybe just one. Then the question comes to mind, why can't you just create one data mart per unit versus trying to support everything in one?

  • I do however support multiple fiscal calendars, but that's easily solved if they are all using the same fields, but a key to filter their fiscal based on the business unit in the date dimension itself.

    How do you do this?  I tried setting the calendar to non-aggregatable to do this, but that didn't work.   I can't think of another way to do this.

  • RonKyle - Thursday, September 21, 2017 5:40 AM

    I do however support multiple fiscal calendars, but that's easily solved if they are all using the same fields, but a key to filter their fiscal based on the business unit in the date dimension itself.

    How do you do this?  I tried setting the calendar to non-aggregatable to do this, but that didn't work.   I can't think of another way to do this.

    Well, if you have multiple business units sharing the DimDate table, then you copy over 365 records (i.e.: one record per day indexing on the date key) for each client into that table with a business unit identifier (foreign key). Then you ensure that identifier is also present on the Fact tables so when you do JOIN the DimDate table and aggregate it, then you are only aggregating the date records for the specific business unit. This allows you to have different fiscal calendars for different clients where maybe January 1st is not Fiscal Week 1, but maybe Fiscal Week 2 or something.

    Obviously doing this across the business unit as a key is not the most optimized approach. Many business units might share the same fiscal calendar and you don't want to duplicate records where the only uniqueness is that business unit foreign key. Thus, maybe changing the business unit key to a calendar key (not date key) or even fiscal key where multiple business units can share the same fiscal key where other units can have other fiscal keys would be a better more optimized option for your reporting.

    So, if I have 4 fiscal calendars across 100 clients, I only need to store 4 sets of those dates across N number of years as opposed to 100 fiscal calendars across N number of years.

    Make sense?

  • Well, if you have multiple business units sharing the DimDate table, then you copy over 365 records (i.e.: one record per day indexing on the date key) for each client into that table with a business unit identifier (foreign key). Then you ensure that identifier is also present on the Fact tables so when you do JOIN the DimDate table and aggregate it, then you are only aggregating the date records for the specific business unit. This allows you to have different fiscal calendars for different clients where maybe January 1st is not Fiscal Week 1, but maybe Fiscal Week 2 or something

    This isn't per the Kimball method.  If you look at the diagram with which I opened the thread, you will see that it does not require the fact table to have the calendar key in it.  The way you're proposing would require a separate fact entry for each separate calendar.  I've read about this before, but it's generally discouraged because of the row count increases in the fact table.  It would also mean that if you added a new calendar, you would then have to take each of the transactions and create a new set for the new calendar.  The advantage of the Kimball approach (in theory) is that a new calendar would simply have to be added to the calendar record.  Unfortunately I can't get the solution to work completely correctly as a practical matter, and I don't know what I'm missing.  Surely someone has used this method as (supposedly) it is the method by which the Kimball team approached this problem.

  • RonKyle - Friday, September 22, 2017 7:20 AM

    Well, if you have multiple business units sharing the DimDate table, then you copy over 365 records (i.e.: one record per day indexing on the date key) for each client into that table with a business unit identifier (foreign key). Then you ensure that identifier is also present on the Fact tables so when you do JOIN the DimDate table and aggregate it, then you are only aggregating the date records for the specific business unit. This allows you to have different fiscal calendars for different clients where maybe January 1st is not Fiscal Week 1, but maybe Fiscal Week 2 or something

    This isn't per the Kimball method.  If you look at the diagram with which I opened the thread, you will see that it does not require the fact table to have the calendar key in it.  The way you're proposing would require a separate fact entry for each separate calendar.  I've read about this before, but it's generally discouraged because of the row count increases in the fact table.  It would also mean that if you added a new calendar, you would then have to take each of the transactions and create a new set for the new calendar.  The advantage of the Kimball approach (in theory) is that a new calendar would simply have to be added to the calendar record.  Unfortunately I can't get the solution to work completely correctly as a practical matter, and I don't know what I'm missing.  Surely someone has used this method as (supposedly) it is the method by which the Kimball team approached this problem.

    It just depends. In many cases, the data warehouse already has a unique key per business unit in order to allow data marts to be built separately per business unit. While it's true, that business unit may not pass to the data mart because no other business unit exists in said data mart, adding a fiscal key that is specific to that business unit in the fact is not increasing the record count unless there is more than one fiscal key per business unit. The impact is more of making the fact table wider and if you are entirely row-based, that does have a slight impact to performance. In my case, I'm all column-store.

    Anyways, my question would be how do you filter for the right calendar with OrderDateKey (FactSales) ->  DateKey (DimDate) -> DateKey (DimCalender) when the CalendarKey does not exist in the FactSales when it comes from data warehouse to data mart?

  • Anyways, my question would be how do you filter for the right calendar with OrderDateKey (FactSales) -> DateKey (DimDate) -> DateKey (DimCalender) when the CalendarKey does not exist in the FactSales when it comes from data warehouse to data mart?      

    That is the question that I'm am asking by this post.  If I had the answer, I wouldn't have needed to post it.  This design that is in the initial post is in Kimball's Data Warehouse reader as his solution to this problem.  Unfortunately his reader does not provide any technical details, let alone sufficient technical information to implement it beyond the warning to be careful with the filtering.  I assumed that meant setting the calendar to non-aggregatable, but that does not seem to be sufficient.  The answer may lie in the attribute relationships.  If you haven't implemented a solution according to the design above, you're not going to be able to help me.  It's not that I can't do multiple calendars.  There is always the option to add more columns to the date dimension table.  It's that I can't get this particular solution to work properly.  If it weren't from Kimball, I would assume there's something wrong with the design.  But as he says it's the one he uses, someone surely knows how to make the design work in SSAS.

  • RonKyle - Friday, September 22, 2017 8:39 AM

    Anyways, my question would be how do you filter for the right calendar with OrderDateKey (FactSales) -> DateKey (DimDate) -> DateKey (DimCalender) when the CalendarKey does not exist in the FactSales when it comes from data warehouse to data mart?      

    That is the question that I'm am asking by this post.  If I had the answer, I wouldn't have needed to post it.  This design that is in the initial post is in Kimball's Data Warehouse reader as his solution to this problem.  Unfortunately his reader does not provide any technical details, let alone sufficient technical information to implement it beyond the warning to be careful with the filtering.  I assumed that meant setting the calendar to non-aggregatable, but that does not seem to be sufficient.  The answer may lie in the attribute relationships.  If you haven't implemented a solution according to the design above, you're not going to be able to help me.  It's not that I can't do multiple calendars.  There is always the option to add more columns to the date dimension table.  It's that I can't get this particular solution to work properly.  If it weren't from Kimball, I would assume there's something wrong with the design.  But as he says it's the one he uses, someone surely knows how to make the design work in SSAS.

    Well, I may be getting ahead of myself here. I'm not really focusing on Kimballs methodology here because regardless if Kimball posted an article on not, I'm more focused on the problem at hand: how can you implement multiple calendars? This is extremely easy to do and it's pretty obviously the technical details you're searching for are missing.

    If you want to implement multiple calendars to your fact then you're going to need to have those attributes to filter to the calendar. Otherwise, that calendar key is going to cause a many-to-many relationship across the various calendars, which will require you to implement a junction table that effectively do the same thing. This is assuming the Kimball article is referencing multiple calendars. If it's not, then there is a 1:1 relationship because only one unique calendar key exist for every one unique date key right? The moment you have multiple calendar keys, then that's going to duplicate the date key, which is the only key are you joining on from the fact (this is a problem).

    This leaves us with the solution outside of that article, which is you're going to have to find some way to associate your fact to your calendar. You can do this in the fact itself by again, adding a key to filter the calendar dimension or you can add a junction table that will associate an existing key in the fact to the calendar table such as maybe CustomerKey, OrderDateKey and CalendarKey together to formulate that 1:1 relationship.

    Course, the one thing we all may not be considering is maybe that date key is not a numeric value for the actual date that's reused across business units, but actually a sequential value where the dates in the DimDate key are duplicated across business units. This means every business unit in the fact has it's own set of unique date keys, which will effectively filter the DimCalendar table. But, that's no different than adding the business key to the dimension and duplicating the same values per unit.

  • Well, I may be getting ahead of myself here. I'm not really focusing on Kimballs methodology here because regardless if Kimball posted an article on not, I'm more focused on the problem at hand: how can you implement multiple calendars? This is extremely easy to do and it's pretty obviously the technical details you're searching for are missing.

    Thanks for the comments, but if you don't have anything that can help me with this method, I would prefer to hear from others.  This is not that I can't implement multiple calendars.  I can already do multiple calendars through the many to many method or the additional columns method.  But I have fact tables that are accumulating snapshots and that have a lot of dates as well as a lot of possible calendars.  If the Kimball method can use a single join, that would be preferable.  My attempts to use this design have failed due, I presume, to some technical issue that I am missing.  I further assume that as this method is in Kimball's Data Warehouse Reader that others have successfully implemented it.  I would like to hear from one of them.

  • In Kimball Group Reader, on page 339, they talk about "Multi-Enterprise Calendar Dimensions" which is what I believe you are trying to implement.  In the Kimball example, they have a Base Calendar Dimension, with 1 row per day in it, then there is a Snowflake Calendar Subdimension as an outrigger, which has 1 row per day per Organization in it, thus the Subdimension has a higher cardinality than the base dimension and needs to be filtered to query it properly.

    In your example, DimDate looks like it is the 1 row per day table, and DimCalendar is the multiple rows per day table, so you need some kind of key or code in DimCalendar to be able to filter it based on which of your many Corporate Calendars you wish to represent for a given report or cube.  How do you distinguish different corporate calendars in the DimCalendar table?

  • In Kimball Group Reader, on page 339, they talk about "Multi-Enterprise Calendar Dimensions" which is what I believe you are trying to implement. 

    Yes, this is what I'm trying to implement.

    In the Kimball example, they have a Base Calendar Dimension, with 1 row per day in it, then there is a Snowflake Calendar Subdimension as an outrigger, which has 1 row per day per Organization in it, thus the Subdimension has a higher cardinality than the base dimension and needs to be filtered to query it properly.

    What does it mean to be properly filtered?  As I have already explained, I have set the Calendar attribute to be non-aggregatable with a default set.  Therefore it is not possible to choose more than one, and when not being acted on, there is a default.  Yet this does not seem adequate. 

     

    In your example, DimDate looks like it is the 1 row per day table, and DimCalendar is the multiple rows per day table, so you need some kind of key or code in DimCalendar to be able to filter it based on which of your many Corporate Calendars you wish to represent for a given report or cube. How do you distinguish different corporate calendars in the DimCalendar table?   

    Each calendar has a different key and name, and each calendar has a complete set of the same dates, although of course the hierarchies above this level will differ.

  • RonKyle - Monday, September 25, 2017 11:36 AM

    What does it mean to be properly filtered?  As I have already explained, I have set the Calendar attribute to be non-aggregatable with a default set.  Therefore it is not possible to choose more than one, and when not being acted on, there is a default.  Yet this does not seem adequate...
    Each calendar has a different key and name, and each calendar has a complete set of the same dates, although of course the hierarchies above this level will differ.

    if each calendar has a different name, then the CalendarName in DimCalendar is the column you need to filter on for each query / cube you setup against this data model.  Setting the attribute to be non-aggregatable doesn't really filter the data, that is more dealing with the pre-aggregatation of calendar hierarchies such as day/month/quarter/year.  When you query FactSales table, if you include DimCalendar in your joins and don't filter on CalendarName, then it would improperly multiply your results.  Maybe I'm not understanding the original question?

  • if each calendar has a different name, then the CalendarName in DimCalendar is the column you need to filter on for each query / cube you setup against this data model. Setting the attribute to be non-aggregatable doesn't really filter the data, that is more dealing with the pre-aggregatation of calendar hierarchies such as day/month/quarter/year. When you query FactSales table, if you include DimCalendar in your joins and don't filter on CalendarName, then it would improperly multiply your results. Maybe I'm not understanding the original question?

    Sorry, but you are not understanding the actual issue.  The resulting view is not a multiplication of the results.  In that sense, I'm confident that the non-aggregatable setting is having an effect.  What I'm seeing is the correct number for each of the days in each of the calendar.  But above that level the query returns the All total of the metric.  It repeats in each tuple as if it's an unrelated dimension. 

    I don't see how it's possible to filter on each query if the default setting on a non-aggretable dimension attribute will not do that.  Can you explain further that portion of your answer?  By the nature of the cube, the designer can't know how the user will want to query the data.  So all the calculations have to made to work under generic circumstances.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply