SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Dimensional DB Design Question


Dimensional DB Design Question

Author
Message
Eric Mamet
Eric  Mamet
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1589 Visits: 893
Dear all,

I am familiar with Dimensional Modelling but here is a type of question I have not had to address before and I don't quite know how to model/handle this.

We supply financial products to customers. A customer applies through an "Application".
Depending on various factors, an Application may require an action from an Operator in order to progress. We call this a "Task".
Once a "Task" is fullfilled, another "Task" maybe necessary and there may be a delay before that next task starts (the end of one task is not necessarily the beginning of the next).
Tasks are grouped in Categories.

We would like to be able to answer questions like: "What were the Applications that moved from a Task of category X to a Task of category Y within a particular period?"

My guess would be that I will have a Fact table of Tasks with the following columns (foreign key to dimension tables)

- TaskApplication -> DimApplication (which application does this task applies to)
- Task_StartDateTime -> DimDate
- Task_EndDateTime -> DimDate
- TaskType -> DimTask (Type and Category)

My difficulty is then: how do I (quickly) identify all applications that moved from one task of type X to another of type Y.

Am I right in thinking that I should have a column in my fact table indicating the Task Rank for each Application (first task for that application is 1, second 2, etc)?

Eric Unsure
Loner
Loner
Hall of Fame
Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)

Group: General Forum Members
Points: 3546 Visits: 3350
What if you use a factless table to keep track of the application? (Kimball used this method to keep track of which event happened, you could find it in his book.)

FromTask - Dim Task
FromDate - Dim Date
ToTask - Dim Task
ToDate - Dim Date
Application - Dim Application
Eric Mamet
Eric  Mamet
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1589 Visits: 893
You may be right
I'll have to go back to his books (which I have read some time ago)

Thanks
Dave Balsillie
Dave Balsillie
Mr or Mrs. 500
Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)Mr or Mrs. 500 (507 reputation)

Group: General Forum Members
Points: 507 Visits: 430
While this approach will work, it looks like you're creating a highly sparse cube structure. (This is often a a marker, for using BI technologies for othger purposes.)

From what I can see, you would be as far ahead simply designing a report from your operational data store, rather than trying to model this into a dimensional form. Some general quidelines that I've used on dimensional modelling:

- The dimension should be fully specified, without regard to other dimensions (or fact tables for that matter).
- Dimension values should be reusable via the intersection of the fact tables to other dimensions. For example, it should be a valid construct, where both customer A and customer B point to the same application. (I suspect that's not true in this case.)
- Ideally, fact values have some aggregation associated with them. BI is used to quickly identify trends, do rollups, etc.

Based on the comments from your original post, I'd expect the following:

Customer Dimension - track all chars of the customer
Time dimension - This is likely the "date sold", or when it was initiated.
Task Type dimension - type, category, etc.
Product dimension - details on the various financial products

The "data dimension" represents the application, and it's various stages:

Measures:
Time executing task
Application ID (rollup a distinct count of the application IDs)
Start date (rollup the minimum date)
End Date (rollup the maximum date)
Operator (rollup a distinct count of the operators)

Note: if the same task can occur multiple times for the same application, then you'll need to add a second time dimension.

I hope this helps.
Eric Mamet
Eric  Mamet
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1589 Visits: 893
Kind of... Makes me think!

I suggested to use a datamart instead of fullfilling the client's report request one by one and we are not planning to use Analysis Services (cubes) for the time being.

In essence, I am just trying to create an easy base for reporting.

Our view of the data is also partial as we are only an "intermediary" (the web front-end to mainframe back ends). Our client uses their own datawarehouse inhouse.

I'll do a bit of reading

Thanks.
Alan G-436699
Alan G-436699
Valued Member
Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)Valued Member (50 reputation)

Group: General Forum Members
Points: 50 Visits: 222
Dear all,

I am familiar with Dimensional Modelling but here is a type of question I have not had to address before and I don't quite know how to model/handle this.

We supply financial products to customers. A customer applies through an "Application".
Depending on various factors, an Application may require an action from an Operator in order to progress. We call this a "Task".
Once a "Task" is fullfilled, another "Task" maybe necessary and there may be a delay before that next task starts (the end of one task is not necessarily the beginning of the next).
Tasks are grouped in Categories.

We would like to be able to answer questions like: "What were the Applications that moved from a Task of category X to a Task of category Y within a particular period?"

My guess would be that I will have a Fact table of Tasks with the following columns (foreign key to dimension tables)

- TaskApplication -> DimApplication (which application does this task applies to)
- Task_StartDateTime -> DimDate
- Task_EndDateTime -> DimDate
- TaskType -> DimTask (Type and Category)

My difficulty is then: how do I (quickly) identify all applications that moved from one task of type X to another of type Y.

Am I right in thinking that I should have a column in my fact table indicating the Task Rank for each Application (first task for that application is 1, second 2, etc)?

Eric


Hi Eric,

Your Fact Table looks just right to me. You simply need a query to answer your question. I see no reason why you can't have a query that interrogates the Fact table twice - something like:

SELECT DISTINCT FACT_FROM.TaskApplication
FROM FACT_TABLE FACT_FROM,
FACT_TABLE FACT_TO,
DIMDATE DATE_FROM_START,
DIMDATE DATE_TO_START,
DIMDATE DATE_TO_END,
DIMTASK TASK_FROM,
DIMTASK TASK_TO
WHERE FACT_FROM.STARTDATETIME = DATE_FROM_START.DATETIME
AND FACT_TO.STARTDATETIME = DATE_TO_START.DATETIME
AND FACT_TO.ENDDATETIME = DATE_TO_END.DATETIME
AND FACT_FROM.TaskApplication = FACT_TO.TaskApplication
AND DATE_TO_START.DATETIME >= @DATE_FROM
AND DATE_TO_END.DATETIME <= @DATE_TO
AND DATE_FROM_START.DATETIME <= DATE_TO_START.DATETIME -- Make sure X occurs before Y
AND FACT_FROM.TASK_TYPE_KEY = TASK_FROM.TASK_KEY
AND FACT_TO.TASK_TYPE_KEY = TASK_TO.TASK_KEY
AND TASK_FROM.TASK_TYPE = 'X'
AND TASK_TO.TASK_TYPE = 'Y'

I know the query above is not perfect and I've not used ANSI joins etc but hopefully it shows how you can get the answer to your question using your chosen fact table structure.
Eric Mamet
Eric  Mamet
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1589 Visits: 893
I must spend a bit of time looking into the factless fact idea. I am in no rush.

However, if I went for the initial solution, I suspect I still need to work out the ranking between tasks because the precise moment an application "move" from one task of type X to a task of type Y is when the first task of type Y following a type of type X starts...


I would then have a query like

SELECT DISTINCT FactFrom.ApplicationKey
FROM TaskFact TaskFrom
INNER JOIN TaskFact TaskTo ON TaskFrom.ApplicationKey = Facto.ApplicationKey
AND TaskFrom.TaskRank = TaskTo.TaskRank - 1
AND TaskFrom.TaskType = X
AND TaskTo.TaskType = Y
WHERE TaskTo.Start_StartDateTime >= @MinDate
AND TaskTo.Start_StartDateTime <= @MaxDate


I suspect the factless fact idea is better because it's all worked out upfront in the ETL and the final query gets straight forward, which is pretty much my goal as we won't use cubes.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search