Cube size estimation?

Question

Cube size estimation?

David.Poole

SSC Guru

Points: 76258
More actions
May 28, 2009 at 11:05 am

#205418

I'm completely new to cubes although I understand the principles and have read Ralph Kimballs book on dimensional modelling so my apologies for the basic nature of my question.
I have to put together a sizing estimate for a cube where the data will be held in the cube itself.
Is there a rule of thumb for taking sp_spaceused from the fact table and giving an estimate for a cube built on that fact table?
LinkedIn Profile

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

GSquared SSC Guru Points: 260824 More actions · Answer 1

I can't think of how there could be a rule of thumb on that. Doesn't it kind of depend on how you define the cube?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

stevefromOZ SSC-Forever Points: 43646 More actions · Answer 2

Not sure there's a rule of thumb, particularly not based on sp_spaceused for the fact. The areas where you'll have storage commitments are the dimensions (these can be affected by not setting attribute relationships correctly) and then the partitions, both for base data but also for the aggregations. The aggs are 'net new' data when compared to what you've brought in from the EDW. They're also affected by what you chose/allow to be aggregation candidates, and then of course how many members exist in the aggregation combination (simple e.g. Time dim with Yr level plus Customer Dim with Customer level, 3 yrs and 6 customers, if storing the aggs here,would require 18 aggregate records).

SSAS does have extremely good compression on it's side though, so often times, even with the aggregations, the resultant SSAS DB (so dims, cubes with aggs etc) are smaller than the total input.

If you have a reasonable sample dataset, you could extrapolate using the sample sized output.

Steve.