Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Complex and weird requirement : multi-customers Datawarehouse Expand / Collapse
Author
Message
Posted Wednesday, June 5, 2013 9:48 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: 2 days ago @ 7:07 AM
Points: 1, Visits: 40
Hi,

We have a project of having a kind of premade-standard-retailler-datawarehouse in the Cloud (hosted on a SQL Server somewhere, and with reports on that).

We send a little software to different customers having the same source system, lets say SAP. Then, they download it, install it, enter the SAP connexion information, then click "EXTRACT MY DATA". The data is then extracted from the source system and sent over the web to my Cloud Datawarehouse.

We may have 10...20..50...500 customers...depending on the strenght of the sales team...

I want to have only ONE datawarehouse (ONE fact table and ONE version of each Dims). They will be partitionned over a key that uniquely identify each customers (ranging from 1 to ...).

Note :
I was thinking to create one database for each customers (lot of sense), or maybe on schema
for each customers in the same database (less sense),... but then I'll have to maintain 40 versions of everything (fact / dinensions)...not good.


So my fact table (and every dimensions) have this key (uniqueCustomerKey) to partition the data. One partition for each customer.

My questions: Is that make sense?
Each customers wont have a lot of data in the fact table (max 3-4 millions of records each MAX).

Do anyone see a major caveat in my solution?

The fact table will then be clustered on the partitionKey + the date key
The dimension tables will be partition also the same way (partitionkey + dimKey)

thanks for you lights on that!

Simon
Post #1460303
Posted Wednesday, June 5, 2013 11:47 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 6:02 PM
Points: 2,818, Visits: 2,560
I think this is more or less the way I would do it as well. The one caveat that I see is customer level security will have to be designed carefully. The last thing you want is for customer B to see some data from Customer A on one of their reports.
Post #1460356
Posted Friday, June 7, 2013 1:50 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, April 22, 2014 7:44 AM
Points: 69, Visits: 89
Hi There, i would put it all in one fact table too, i'd consider clustering / partitioning it the other way though, i.e. DateKey, CustomerKey.

If most of the data warehouse queries include the datekey as well as the CustomerKey then this will generally be a better way of doing it. I'm assuming your loading your data in on daily increments? If you have the CustomerKey first then the index will be constantly re-shuffling around on the disk, where as putting the datekey first will make it all nice and contiguous, you'll get better scan speeds from that too. You should have a quick read of this http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx too it really does help with this sort of thing.

What version / edition of SQL are you using?

Bob.
Post #1460990
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse