One or two fact tables?

Question

One or two fact tables?

aaa121

SSCertifiable

Points: 5902
More actions
August 8, 2013 at 3:00 am

#296774

Hello,
I am currently working on a dimensional model for insurance claims.
When a claim is made money is paid out by the insurance company at a policy level (stored in a transactional format at source).
Money can also be recovered against a claim. This is stored in the source system in a transaction level but at a lower level of detail than policy.
Having one fact table for claims logically makes the most sense but the payments made in the insured does not break down below policy level.
I wanted to throw this out there to get some ideas on how best to design this if poss?
Thanks

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 1

Is a claim bound in any way to a policy? If yes and claim is just a smaller level of detail of a policy, than I would create just one fact table at the claim level.

If you need data at the policy level, you just aggregate it.

It's a best practice to create your fact tables at the lowest level of detail (aka the grain) possible.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Andy Hyslop SSCrazy Eights Points: 9198 More actions · Answer 2

Koen Verbeeck (8/8/2013)
Is a claim bound in any way to a policy? If yes and claim is just a smaller level of detail of a policy, than I would create just one fact table at the claim level.
If you need data at the policy level, you just aggregate it.
It's a best practice to create your fact tables at the lowest level of detail (aka the grain) possible.

Yep +1 🙂

==========================================================================================================================
A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

aaa121 SSCertifiable Points: 5902 More actions · Answer 3

Hi Both,

Thanks for your replies...I'll provide a little more detail based on your response.

Source system has the following tables:

Policy table (a record for each policy)

Policy risk (a record for each policy/risk type/premium class per each policy renewal)

Claim Table (a record for each policy/claim/risk type/premium class) - this table contains the total paid out and the total recovered.

Claim Recovery Table (a record for each policy/claim/risk type/premium class/treaty/reinsurance) - this table holds the recovered.

A policy can have 1 to many associated risk types

A policy can have 1 to many associated premium types

A policy can have 0 to many claims

A claim can have 0 to many recoveries

So I'll create a simple policy dimension containing a record for each policy.

I have then toyed with the idea then of creating a mini dimension(s) to store the rapidly changing risk and premium types as each policy could end up with hundred of records. (each policy appears an average of 47 times on the policy risk table). What are your views on this?

I want to have a single fact table at the lowest grain possible. However I am struggling with the total paid figure which is not available at the same grain as recovery.

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 4

Is it possible to post some sample data (this may be fake).

It would be easier to wrap my head around. 🙂

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

aaa121 SSCertifiable Points: 5902 More actions · Answer 5

Ok thanks for your help:

declare @policy table

(policy_number int)

insert into @policy

select 100

union

select 101

union

select 102

declare @policy_risk table

(policy_number int,

risk_type int,

premium_class char(3))

insert into @policy_risk

select 100, 1, 'AA1'

union

select 100, 1, 'ACC'

union

select 100, 1, 'GHT'

union

select 100, 1, 'DES'

union

select 100, 1, 'FIR'

union

select 101, 1, 'FIR'

union

select 101, 1, 'SEC'

union

select 101, 1, 'PPL'

union

select 102, 2, 'STA'

union

select 102, 2, 'KLS'

union

select 102, 2, 'ZZA'

union

select 102, 3, 'HTD'

union

select 102, 3, 'DTR'

declare @claim table

(claim_number int,

policy_number int,

risk_type int,

premium_class char(3),

total_paid money,

total_recovered money)

insert into @claim

select 1, 101, 1, 'FIR', 345, 0

union

select 2, 102, 2, 'STA', 7465.21, 100

declare @claim_recoveries table

(claim_recovery_id int identity(1,1),

claim_number int,

policy_number int,

risk_type int,

premium_class char(3),

treaty char(3),

reinsurance_id int,

total_recovered money)

insert into @claim_recoveries

select 2, 102, 2, 'STA', 'QSH',50, 25

union all

select 2, 102, 2, 'STA', 'AAA',51, 25

select * from @policy

select * from @policy_risk

select * from @claim

select * from @claim_recoveries

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 6

The total_recovered value in claim (100) is not the same as the sum of the recoveries in claim_recoveries (2 x 25). Is this normal?

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

aaa121 SSCertifiable Points: 5902 More actions · Answer 7

Apologies! My mistake sorry.

It should total the 100. Updated code attached:

declare @policy table

(policy_number int)

insert into @policy

select 100

union

select 101

union

select 102

declare @policy_risk table

(policy_number int,

risk_type int,

premium_class char(3))

insert into @policy_risk

select 100, 1, 'AA1'

union

select 100, 1, 'ACC'

union

select 100, 1, 'GHT'

union

select 100, 1, 'DES'

union

select 100, 1, 'FIR'

union

select 101, 1, 'FIR'

union

select 101, 1, 'SEC'

union

select 101, 1, 'PPL'

union

select 102, 2, 'STA'

union

select 102, 2, 'KLS'

union

select 102, 2, 'ZZA'

union

select 102, 3, 'HTD'

union

select 102, 3, 'DTR'

declare @claim table

(claim_number int,

policy_number int,

risk_type int,

premium_class char(3),

total_paid money,

total_recovered money)

insert into @claim

select 1, 101, 1, 'FIR', 345, 0

union

select 2, 102, 2, 'STA', 7465.21, 100

declare @claim_recoveries table

(claim_recovery_id int identity(1,1),

claim_number int,

policy_number int,

risk_type int,

premium_class char(3),

treaty char(3),

reinsurance_id int,

total_recovered money)

insert into @claim_recoveries

select 2, 102, 2, 'STA', 'QSH',50, 50

union all

select 2, 102, 2, 'STA', 'AAA',51, 50

select * from @policy

select * from @policy_risk

select * from @claim

select * from @claim_recoveries

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 8

OK, now that I see the data, I think you should have 2 fact tables, because indeed they are at two different grains.

Recovery is at the treaty level, while total paid is at the claim level.

Also the fact that some claims don't have recoveries makes me think in the direction of two fact tables.

But that's OK. When you analyse the data later on in Excel, you drag the total_paid measure next to the pivot and claims to the rows. When you add the recovery, it will automatically aggregate the data to the claims level.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

aaa121 SSCertifiable Points: 5902 More actions · Answer 9

Thanks for your time on this. The conclusion I kept coming to was two fact tables as well.

What did you think about the mini dimension solution for those more frequently changing attributes?

kl25 Hall of Fame Points: 3812 More actions · Answer 10

Koen Verbeeck (8/8/2013)
OK, now that I see the data, I think you should have 2 fact tables, because indeed they are at two different grains.

Interesting topic. I've been struggling with a similar, two grains but related dimensions situation. Was coming to the conclusion that two fact tables were needed. Any suggested references for reading more about the use of two fact tables? My attempts to search for/find references were not fruitful.

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 11

aaa121 (8/8/2013)
Thanks for your time on this. The conclusion I kept coming to was two fact tables as well.
What did you think about the mini dimension solution for those more frequently changing attributes?

If the risk and premium class have fixed number of combinations, the mini-dimension is probably a good idea.

I do think you'd drop the policy risk dimension though, and add a surrogate key into the fact table(s), so that your policy dimension stays small. This way, the risk/premium class is directly tied to the facts.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Koen Verbeeck SSC Guru Points: 259197 More actions · Answer 12

kl25 (8/8/2013)
Koen Verbeeck (8/8/2013)
OK, now that I see the data, I think you should have 2 fact tables, because indeed they are at two different grains.
Interesting topic. I've been struggling with a similar, two grains but related dimensions situation. Was coming to the conclusion that two fact tables were needed. Any suggested references for reading more about the use of two fact tables? My attempts to search for/find references were not fruitful.

The Data Warehouse Toolkit by Ralph Kimball would be a good start.

Also check out his regular design tips. For example:

http://www.kimballgroup.com/2013/08/05/design-tip-158-making-sense-of-the-semantic-layer/

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

kl25 Hall of Fame Points: 3812 More actions · Answer 13

Koen Verbeeck (8/9/2013)
kl25 (8/8/2013)
Koen Verbeeck (8/8/2013)
OK, now that I see the data, I think you should have 2 fact tables, because indeed they are at two different grains.
Interesting topic. I've been struggling with a similar, two grains but related dimensions situation. Was coming to the conclusion that two fact tables were needed. Any suggested references for reading more about the use of two fact tables? My attempts to search for/find references were not fruitful.
The Data Warehouse Toolkit by Ralph Kimball would be a good start.
Also check out his regular design tips. For example:
http://www.kimballgroup.com/2013/08/05/design-tip-158-making-sense-of-the-semantic-layer/

Been doing a lot of reading in the Toolkit. Clearly haven't gotten to the right section yet. 😀

Have only run into the design tips a few times based on general searches. Looks like I need to do a more thorough look at that section of the kimballgroup site. Thanks so much for both suggestions!