Capturing the Reason For Change for Type 2 changes
Posted Friday, February 28, 2014 8:55 PM


SSC-Dedicated

Lempster (2/27/2014)
Thanks for the replies folks. As JustMarie and EricEyster made opposing arguments, I guess the most important point is to be consistent!

Jeff, I don't see it as much of a pain to maintain an IsCurrent flag; I agree that it could be seen as superfluous, but I'm just following Kimball best practice, which states:

...the current-flag provides a rapid way to isolate exactly the set of dimension members that is in effect at the moment of the query.

So it's for ease of querying more than anything else. Thanks for the tip about the EndDate value though; I think I can afford to lose most of the year 9999. (Although thinking in that vein led to the Y2K problem, didn't it?)


Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. I recommend just doing the dates correctly.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Post #1546622
Posted Monday, March 3, 2014 3:08 AM
SSCrazy

Jeff Moden (2/28/2014)

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. I recommend just doing the dates correctly.


I'm talking about a Data Warehouse, so there aren't going to be any GUIs to time out, certainly not any doing inserts. There will of course be inserts on a daily basis due to the ETL process. The relational tables in the Data Warehouse will have multidimensional cubes built on them and it will be the cubes that are queried by end users, not the relational tables directly.
I will of course undertake extensive testing, but at this point I'm inclined to follow Kimball best practice.

Regards
Lempster
Post #1546831
Posted Monday, March 3, 2014 6:32 AM
SSC Veteran

Lempster (3/3/2014)
Jeff Moden (2/28/2014)

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. I recommend just doing the dates correctly.


I'm talking about a Data Warehouse, so there aren't going to be any GUIs to time out, certainly not any doing inserts. There will of course be inserts on a daily basis due to the ETL process. The relational tables in the Data Warehouse will have multidimensional cubes built on them and it will be the cubes that are queried by end users, not the relational tables directly.
I will of course undertake extensive testing, but at this point I'm inclined to follow Kimball best practice.

Regards
Lempster


Things change a little if you are going to use SSAS. The DW becomes little more than a data store to facilitate the ETL processes. Sure, you need enough to also support your debugging when things go bump in the night, but the Kimball design assumes your users are getting data from the relational engine.

If you want to display the IsCurrent flag for testing or for ease of loading to SSAS, create a view that calculates IsCurrent with a CASE expression on the EndDate column.
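
Something along these lines would do it. This is only a rough sketch: the table, view and column names are made up for illustration, and it assumes open-ended (current) rows carry a sentinel EndDate of 9999-12-31, so swap in whatever end date your load actually writes.

```sql
-- Hypothetical Type 2 dimension; every name here is illustrative only.
CREATE TABLE dbo.DimCustomer
(
    CustomerKey  int IDENTITY(1,1) NOT NULL PRIMARY KEY, -- surrogate key
    CustomerID   int          NOT NULL,                  -- business key
    CustomerName varchar(100) NOT NULL,
    StartDate    date         NOT NULL,
    EndDate      date         NOT NULL                   -- current rows hold the sentinel date
);
GO

-- Derive IsCurrent from EndDate instead of storing (and indexing) a flag column.
-- Assumption: the sentinel end date for current rows is 9999-12-31.
CREATE VIEW dbo.vDimCustomer
AS
SELECT CustomerKey,
       CustomerID,
       CustomerName,
       StartDate,
       EndDate,
       CAST(CASE WHEN EndDate = '99991231' THEN 1 ELSE 0 END AS bit) AS IsCurrent
FROM dbo.DimCustomer;
GO

-- Example use: isolate the dimension members in effect right now.
SELECT CustomerKey, CustomerID, CustomerName
FROM dbo.vDimCustomer
WHERE IsCurrent = 1;
```

The flag then costs nothing to maintain in the ETL, and SSAS or a tester can read it from the view as if it were a real column.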


Post #1546902
Posted Monday, March 3, 2014 6:40 AM
SSC Eights!

Kimball design assumes your users are getting data from the relational engine


Would you mind explaining further what you mean by this? It's possible that I'm not understanding something, but Kimball is geared to SSAS and SSAS is not pulling the data from the relational engine. What am I missing?



Post #1546904
Posted Monday, March 3, 2014 6:56 AM
SSC Veteran

RonKyle (3/3/2014)
Kimball design assumes your users are getting data from the relational engine


Would you mind explaining further what you mean by this? It's possible that I'm not understanding something, but Kimball is geared to SSAS and SSAS is not pulling the data from the relational engine. What am I missing?


Yes, follow the Kimball design in the SSAS database. Assuming you are using MOLAP, SSAS pulls the data from the relational engine during dimension/partition processing and does not touch it again. We have systems that rebuild a single partition each week, leaving most of the data in the DW untouched until it is purged. There is no need for heavy indexing, etc., on the relational DW side to support end-user queries. Instead, focus on optimizing the ETL processes for fast load, select for the partition, and purge.
Post #1546911
Posted Monday, March 3, 2014 7:08 AM
SSCrazy

EricEyster (3/3/2014)
... but the Kimball design assumes your users are getting data from the relational engine.

I disagree with that statement. Sure, the relational data warehouse containing the dimension and fact tables is what the multidimensional Measures and Dimensions (i.e. Cubes) are built on, but users query the data stored in Cubes directly from SSAS, whether that be via Excel (pivot tables), PowerView/PowerPivot, MDX queries, SSRS or third-party products.
Post #1546915
Posted Monday, March 3, 2014 7:13 AM
SSC Eights!

EricEyster (3/3/2014)
... but the Kimball design assumes your users are getting data from the relational engine.

I disagree with that statement. Sure, the relational data warehouse containing the dimension and fact tables is what the multidimensional Measures and Dimensions (i.e. Cubes) are built on, but users query the data stored in Cubes directly from SSAS, whether that be via Excel (pivot tables), PowerView/PowerPivot, MDX queries, SSRS or third-party products.


I agree with the disagreement. The statement is misleading because the USERS aren't getting the data from the relational engine. SSAS is getting data from the relational engine when it processes. The users then get it from SSAS through some front end.



Post #1546919
Posted Monday, March 3, 2014 8:34 AM
SSCrazy

Whatever you folks do, don't forget to set the correct attribute relationship types (rigid for type 2 dimension columns and flexible for type 1 and type 3 columns).
Post #1546948
Posted Wednesday, March 5, 2014 3:17 PM


SSC-Dedicated

Lempster (3/3/2014)
Jeff Moden (2/28/2014)

Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur. I recommend just doing the dates correctly.


I'm talking about a Data Warehouse, so there aren't going to be any GUIs to time out, certainly not any doing inserts. There will of course be inserts on a daily basis due to the ETL process. The relational tables in the Data Warehouse will have multidimensional cubes built on them and it will be the cubes that are queried by end users, not the relational tables directly.
I will of course undertake extensive testing, but at this point I'm inclined to follow Kimball best practice.

Regards
Lempster


My inclination is that it doesn't matter where the inserts are coming from. A leading or singular column in an index with such low selectivity is going to cause massive extent splitting that will slow any process down.

Although I'm also inclined to go with what experts say, it does cause me concern when the best practices of one expert or group of experts are contrary to the best practices of another. Testing would be a good thing here.


--Jeff Moden
Post #1548041
Posted Friday, March 7, 2014 8:41 AM
SSCrazy


Put an index on the "Current-Flag" and watch your GUIs time out when they try to do an INSERT because of the massive extent splits that will occur.


For some reason, I am drawing a massive blank about why this would happen (no coffee). Is this because it is a nonclustered index on a low-cardinality column?
Post #1548766