Delta Load strategy for fact and dimension tables

  • We are in the process of designing a star schema model to enable self-service BI using Power Pivot for Excel. The design should be compatible with both the on-premises and Azure versions. What would be the best strategy for loading the fact and dimension tables? We don't want to go with SSIS, since SQL Azure does not support SSIS. We may have to go with stored procedures called from scheduled jobs for the on-premises version, and a job scheduled using worker threads for the Azure version. Is there a better approach to achieve this?

    Also, could you please advise on the best strategies for designing incremental loads for the fact and dimension tables (compatible with both on-premises and Azure)? A rough sketch of the stored-procedure approach we have in mind is below.
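    A minimal T-SQL-only sketch of the kind of delta-load procedure we have in mind, runnable both on-premises and on SQL Azure (no SSIS). All object and column names (stg.Customer, dim.Customer, stg.Sales, fact.Sales, etl.Watermark, CustomerBK, LastModifiedDate) are hypothetical placeholders, not an actual design:

        CREATE PROCEDURE etl.LoadCustomerAndSales
        AS
        BEGIN
            SET NOCOUNT ON;

            -- 1) Dimension delta: upsert new/changed rows from staging into the dimension.
            MERGE dim.Customer AS tgt
            USING stg.Customer AS src
                ON tgt.CustomerBK = src.CustomerBK
            WHEN MATCHED AND (tgt.CustomerName <> src.CustomerName OR tgt.City <> src.City)
                THEN UPDATE SET
                     tgt.CustomerName = src.CustomerName,
                     tgt.City         = src.City
            WHEN NOT MATCHED BY TARGET
                THEN INSERT (CustomerBK, CustomerName, City)
                     VALUES (src.CustomerBK, src.CustomerName, src.City);

            -- 2) Fact delta: load only rows newer than the last high-water mark.
            DECLARE @LastLoad datetime2 =
                (SELECT LastLoadedDate FROM etl.Watermark WHERE TableName = 'fact.Sales');

            INSERT INTO fact.Sales (CustomerKey, SaleDate, Amount)
            SELECT d.CustomerKey, s.SaleDate, s.Amount
            FROM   stg.Sales    AS s
            JOIN   dim.Customer AS d ON d.CustomerBK = s.CustomerBK
            WHERE  s.LastModifiedDate > @LastLoad;

            -- Advance the watermark to the newest row actually staged
            -- (keep the old value if staging was empty this cycle).
            UPDATE etl.Watermark
            SET    LastLoadedDate = ISNULL((SELECT MAX(LastModifiedDate) FROM stg.Sales),
                                           LastLoadedDate)
            WHERE  TableName = 'fact.Sales';
        END;

    On-premises this would be called from a scheduled job; on SQL Azure it would be invoked from whatever external scheduler (e.g. the worker threads mentioned above) we end up using.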

  • I'm very surprised to find out that SQL Azure doesn't support SSIS!!!

    What "on prem" are you talking about? Seems to be a highly relevant detail that we are missing, as without such information, I might look silly saying "Why not stay with Microsoft tools since all pieces of the puzzle use MS tools?"

    Meanwhile, sure, scheduled stored procedures could do the job, but not as fast as a tool optimized for bulk data movement.
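    For the on-prem side, wiring a stored procedure up to SQL Server Agent is straightforward; a rough sketch is below (the job, step, and schedule names, the MyDW database, and the proc etl.LoadCustomerAndSales are all hypothetical). SQL Azure has no SQL Agent, which is presumably why you're looking at worker threads for that side.

        USE msdb;
        GO
        -- Create the job and a single T-SQL step that runs the load proc.
        EXEC dbo.sp_add_job
             @job_name = N'DW - Nightly delta load';

        EXEC dbo.sp_add_jobstep
             @job_name      = N'DW - Nightly delta load',
             @step_name     = N'Run delta load proc',
             @subsystem     = N'TSQL',
             @database_name = N'MyDW',
             @command       = N'EXEC etl.LoadCustomerAndSales;';

        -- Run it every night at 02:00.
        EXEC dbo.sp_add_schedule
             @schedule_name     = N'Nightly 2 AM',
             @freq_type         = 4,        -- daily
             @freq_interval     = 1,        -- every 1 day
             @active_start_time = 020000;   -- HHMMSS

        EXEC dbo.sp_attach_schedule
             @job_name      = N'DW - Nightly delta load',
             @schedule_name = N'Nightly 2 AM';

        -- Target the local server.
        EXEC dbo.sp_add_jobserver
             @job_name = N'DW - Nightly delta load';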

  • sql_dba123 (5/6/2012)


    We may have to go with stored procedures called from scheduled jobs for the on-premises version, and a job scheduled using worker threads for the Azure version. Is there a better approach to achieve this?

    You make it sound like that's somehow a bad thing. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Not really 🙂 it's just that I'm not aware of other options 🙂

  • Kind of off topic, but you might want to test this a bit.

    We ran into memory issues using PowerPivot and a more traditional Star Schema.

    It was not releasing memory as we drilled down.

    With only 3M fact records, we could kill a 24 GB server as we drilled down.

    We ended up moving almost everything into a big, wide table.

    With the big, wide table, we got really good performance on an 8 GB server.

    We did use a stored proc to load a custom table, which worked pretty well in our situation.

    Internal strategy is one thing; putting it out to the cloud is another.

    Part of this might be leveraging how you capture changes today.

    Triggers can be good for this.
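    A rough sketch of that trigger approach, with hypothetical names (dbo.Orders as the source table, etl.OrderChanges as a change log that the load proc reads and then clears):

        CREATE TABLE etl.OrderChanges
        (
            OrderID    int       NOT NULL,
            ChangeType char(1)   NOT NULL,   -- 'I' = insert/update, 'D' = delete
            ChangedAt  datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
        );
        GO
        CREATE TRIGGER trg_Orders_Capture
        ON dbo.Orders
        AFTER INSERT, UPDATE, DELETE
        AS
        BEGIN
            SET NOCOUNT ON;

            -- New and updated rows land in the change log as 'I'.
            INSERT INTO etl.OrderChanges (OrderID, ChangeType)
            SELECT i.OrderID, 'I'
            FROM   inserted AS i;

            -- Pure deletes (rows in deleted but not in inserted) land as 'D'.
            INSERT INTO etl.OrderChanges (OrderID, ChangeType)
            SELECT d.OrderID, 'D'
            FROM   deleted AS d
            WHERE  NOT EXISTS (SELECT 1 FROM inserted AS i WHERE i.OrderID = d.OrderID);
        END;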
