Capturing the Reason For Change for Type 2 changes


Jeff Moden
Lempster (2/27/2014)
Thanks for the replies folks. As JustMarie and EricEyster made opposing arguments, I guess the most important point is to be consistent!

Jeff, I don't see it as much of a pain to maintain an IsCurrent flag; I agree that it could be seen as superfluous, but I'm just following Kimball best practice which states:

...the current-flag provides a rapid way to isolate exactly the set of dimension members that is in effect at the moment of the query.

So it's for ease of querying more than anything else. Thanks for the tip about the EndDate value though - I think I can afford to lose most of the year 9999 :-). (Although thinking in that vein led to the Y2K problem, didn't it? ;-))


Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. ;-) I recommend just doing the dates correctly.
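
As a rough illustration of relying on the dates alone, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server. The DimCustomer table, its columns, the sample rows, and the 9999-01-01 open-ended EndDate are all assumptions made up for the example; nothing here comes from the actual schema under discussion.

```python
import sqlite3

# Assumed sentinel for "still in effect" rows; pick whatever your design uses.
OPEN_END = "9999-01-01"

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DimCustomer (
        CustomerKey INTEGER PRIMARY KEY,
        CustomerID  TEXT NOT NULL,
        City        TEXT NOT NULL,
        StartDate   TEXT NOT NULL,  -- ISO dates compare correctly as text
        EndDate     TEXT NOT NULL
    )
    """
)
# Hypothetical Type 2 history: C1 moved from Leeds to London on 2014-03-01.
conn.executemany(
    "INSERT INTO DimCustomer (CustomerID, City, StartDate, EndDate) "
    "VALUES (?, ?, ?, ?)",
    [
        ("C1", "Leeds", "2012-01-01", "2014-03-01"),
        ("C1", "London", "2014-03-01", OPEN_END),
        ("C2", "York", "2013-06-15", OPEN_END),
    ],
)

# Current members: no flag column needed, just the open-ended EndDate.
current = conn.execute(
    "SELECT CustomerID, City FROM DimCustomer "
    "WHERE EndDate = ? ORDER BY CustomerID",
    (OPEN_END,),
).fetchall()


def members_as_of(connection, as_of):
    """Return the rows in effect on a date, using a half-open [Start, End) range."""
    return connection.execute(
        "SELECT CustomerID, City FROM DimCustomer "
        "WHERE StartDate <= ? AND ? < EndDate ORDER BY CustomerID",
        (as_of, as_of),
    ).fetchall()
```

The half-open range means a row's EndDate equals the next row's StartDate, so there are no gaps or overlaps to reconcile, and the same predicate serves both "current" and "as of" queries.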

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Lempster
Jeff Moden (2/28/2014)

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. ;-) I recommend just doing the dates correctly.


I'm talking about a Data Warehouse, so there aren't going to be any GUIs to time out, certainly not any doing inserts. There will of course be inserts on a daily basis due to the ETL process. The relational tables in the Data Warehouse will have multidimensional cubes built on them, and it will be the cubes that are queried by end users, not the relational tables directly.
I will of course undertake extensive testing, but at this point I'm inclined to follow Kimball best practice.

Regards
Lempster
EricEyster


Things change a little if you are going to use SSAS. The DW becomes little more than a data store to facilitate the ETL processes. Sure, you need enough to also support your debugging when things go bump in the night, but the Kimball design assumes your users are getting data from the relational engine.

If you want to display the IsCurrent flag for testing or for ease of loading to SSAS, create a view that calculates the IsCurrent flag with a CASE expression on the EndDate.
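
A minimal sketch of that view idea follows, run through Python's built-in sqlite3 module so it is self-contained; the CASE expression itself carries over to T-SQL largely unchanged. The DimCustomer table, the vDimCustomer view name, the sample rows, and the 9999-01-01 open-ended EndDate are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical Type 2 dimension with no stored current flag.
    CREATE TABLE DimCustomer (
        CustomerKey INTEGER PRIMARY KEY,
        CustomerID  TEXT NOT NULL,
        City        TEXT NOT NULL,
        StartDate   TEXT NOT NULL,
        EndDate     TEXT NOT NULL   -- assumed sentinel '9999-01-01' = still open
    );

    INSERT INTO DimCustomer (CustomerID, City, StartDate, EndDate) VALUES
        ('C1', 'Leeds',  '2012-01-01', '2014-03-01'),
        ('C1', 'London', '2014-03-01', '9999-01-01'),
        ('C2', 'York',   '2013-06-15', '9999-01-01');

    -- The flag is derived at query time, so the ETL never has to maintain it.
    CREATE VIEW vDimCustomer AS
    SELECT CustomerKey, CustomerID, City, StartDate, EndDate,
           CASE WHEN EndDate = '9999-01-01' THEN 1 ELSE 0 END AS IsCurrent
    FROM DimCustomer;
    """
)

rows = conn.execute(
    "SELECT CustomerID, City, IsCurrent FROM vDimCustomer ORDER BY CustomerKey"
).fetchall()
```

Because the flag is computed rather than stored, closing out a row only touches EndDate, and the view can be pointed at by the cube's data source view or used for ad hoc debugging.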
RonKyle
Kimball design assumes your users are getting data from the relational engine


Would you mind explaining further what you mean by this? It's possible that I'm not understanding something, but Kimball is geared to SSAS and SSAS is not pulling the data from the relational engine. What am I missing?



EricEyster


Yes, follow the Kimball design in the SSAS database. Assuming you are using MOLAP, SSAS pulls the data from the relational engine during dimension/partition processing and does not touch it again. We have systems that rebuild a single partition each week, leaving most of the data in the DW untouched until it is purged. There is no need for heavy indexing, etc., on the relational DW side to support end user queries. Instead, focus on optimizing the ETL processes for fast load, select for the partition, and purge.
Lempster
EricEyster (3/3/2014)
... but the Kimball design assumes your users are getting data from the relational engine.

I disagree with that statement. Sure, the relational data warehouse containing the dimension and fact tables is what the multidimensional Measures and Dimensions (i.e. Cubes) are built on, but users query the data stored in Cubes directly from SSAS, whether that be via Excel (pivot tables), PowerView/PowerPivot, MDX queries, SSRS or third-party products.
RonKyle


I agree with the disagreement. The statement is misleading because the USERS aren't getting the data from the relational engine. SSAS is getting data from the relational engine when it processes. The users then get it from SSAS through some front end.



sneumersky
Whatever you folks do, don't forget to set the correct attribute relationship types (rigid for type 2 dimension columns and flexible for type 1 and type 3 columns). :-D
Jeff Moden
Lempster (3/3/2014)
Jeff Moden (2/28/2014)

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. ;-) I recommend just doing the dates correctly.


I'm talking about a Data Warehouse, so there aren't going to be any GUIs to time out, certainly not any doing inserts. There will of course be inserts on a daily basis due to the ETL process. The relational tables in the Data Warehouse will have multidimensional cubes built on them, and it will be the cubes that are queried by end users, not the relational tables directly.
I will of course undertake extensive testing, but at this point I'm inclined to follow Kimball best practice.

Regards
Lempster


My inclination is that it doesn't matter where the inserts are coming from. A leading or singular column in an index with such low selectivity is going to cause massive extent splitting that will slow any process down.

Although I'm also inclined to go with what experts say, it does cause me concern when the best practices of one expert or group of experts are contrary to the best practices of another. Testing would be a good thing here.

--Jeff Moden

sneumersky

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur.


For some reason, I am drawing a massive blank about why this would happen (no coffee). Is this because it is a nonclustered index on a low-cardinality column?