Partitioning Data in a Multi-tenant Data Warehouse Solution

  • Comments posted to this topic are about the item Partitioning Data in a Multi-tenant Data Warehouse Solution

    [font="Tahoma"]Cheers,

    Sean :-D[/font]

  • How long does it take you to load the fact table after you've switched out the partition and truncted it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hello,

    Most of our fact tables take under 5 minutes to run. We have about 6 truncate and reload fact tables. Then the switch back into the partitioned table is really fast since its a metadata only operation.

    Hope this answers your question.

    Sean

    [font="Tahoma"]Cheers,

    Sean :-D[/font]

  • What is the advantage of loading data in the staging, non-partitioned table? Wouldn't you obtain the same result by loading the fact data directly into the partitioned table?

  • We need to remove the data for the customer but don't want to incurr logging during deletes or inserts. We can't truncate because the other customers data must persist.

    [font="Tahoma"]Cheers,

    Sean :-D[/font]

  • Sean Woehrle (4/22/2014)


    Hello,

    Most of our fact tables take under 5 minutes to run. We have about 6 truncate and reload fact tables. Then the switch back into the partitioned table is really fast since its a metadata only operation.

    Hope this answers your question.

    Sean

    As you've rightfully pointed out the switch-out and switch-in is lightning quick. My suggestion would be to stage the full replacement before you do the switch-out so that you could do the switch-in leading to only milliseconds of "down" time instead of minutes

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    I always appreciate a better way to do things, so your suggestions are welcome.

    I think we are basically "staging the full replacement" for customer 1 in the article you are suggesting already. We load the data into the non-partitioned (NP) table on the same FG as the partitioned table. That is the part that takes up to five minutes. Then the switch in takes milliseconds. Even if we had a separate NP table (from the one that gets created during the switch out, which we truncate) on the same FG group it would still take the same time to load.

    Am I missing something here?

    [font="Tahoma"]Cheers,

    Sean :-D[/font]

  • I did not mean to do deletes, but to do a switch out, followed by loading directly into the table.

    Jeff suggested two staging tables, one for switching out and another for switching in. First, you truncate both staging tables. Second you populate the switch-in staging table. At this point, all you need to do is to switch out from the partitioned table into switch-out staging table and immediately follow by switching in from the switch-in staging table into the partitioned table. As Jeff indicated, this will give you only a fraction of a second of down time.

    A third option with zero downtime is to create your partitioned table with extra partitions. You simply need a small table-of-contents (TOC) table that has as many rows as there are partitions in the partitioned table. Each row contains the partition ID, the tenant name and a status column. The status could be Available, Loading, Unloading and Active. For sake of example, suppose you have ten tenants and 11 partitions. The TOC table will contain 10 rows in Active status, one for each tenant, and one row with status of Available.

    When you want to load data for a tenant, you update the row with status of Available to set the status to Loading and set the tenant name. Once you are done loading the data, you simply update the TOC table to set the status of the Active row to Unloading and the status of the Loading row to Active. Queries will immediately use the new partition, as that is the one with Active status. I would suggest you create a scalar-valued function that takes as input the tenant name. The function will query the TOC for the tenant name with status of Active and return the partition ID. At this point you have as much time as you want to switch-out the data from the partitioned now marked as Unloading and set the TOC of this partition to Available.

    If you have many tenants and are concerned with concurrency of loading/unloading data for different tenants, you may create your partitioned table with extra partitions. For even greater flexibility, keep in mind that your staging table can also be partitioned and you can switch out from a partition in the main partitioned table into the same partition of staging table.

  • Sean Woehrle (4/23/2014)


    Jeff,

    I always appreciate a better way to do things, so your suggestions are welcome.

    I think we are basically "staging the full replacement" for customer 1 in the article you are suggesting already. We load the data into the non-partitioned (NP) table on the same FG as the partitioned table. That is the part that takes up to five minutes. Then the switch in takes milliseconds. Even if we had a separate NP table (from the one that gets created during the switch out, which we truncate) on the same FG group it would still take the same time to load.

    Am I missing something here?

    I think so. Here's the flow diagram from your article...

    As N_Muller pointed out in the post above, it looks like you have "downtime" equal to the the amount of time it takes to reload the "Auxilary Fact" table because you truncate it, reload it, and then switch it in. N_Muller is, indeed, correct in that I'm suggesting that you have two "staging" tables... 1 to switch-out and 1 to switch-in. Instead of the process of...

    Process Flow from Article


    1. Check and drop Aux Table

    2. Switch out partition to Auxilliary Fact Table ("DOWN TIME" STARTS HERE)

    3. Truncate Auxilliary Fact Table (takes only ms)

    4. Load Auxilliary Fact Table (takes ~5 minutes)

    5. Switch Auxilliary Fact to Partitioned Fact ("DOWN TIME" ENDS HERE, TOTAL = ~5 MINUTES)

    6. Drop Auxilliary Fact Table

    ... where the data is unavailable for ~5 minutes, you could do this with 2 Aux Tables...

    Two Table Suggestion Almost Eliminates Any Downtime


    1. Check and Drop Aux Tables

    2. Create the AuxOut and AuxIn Tables

    3. Load the AuxIn table. Still takes 5 minutes but there's no "down time" because we haven't switched out the original data, yet.

    4. This would be a good time to make sure the indexes weren't fragmented by the load and defragment them if they were.

    5. Switch out partition to the AuxOut Table ("DOWN TIME" STARTS HERE)

    6. Switch the AuxIn Table to the partition (Total "DOWN TIME" is only ms)

    7. Check and Drop Aux Tables (or leave the AuxOut table there until the next load in case something went horribly wrong)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • N_Muller (4/23/2014)


    A third option with zero downtime is to create your partitioned table with extra partitions. You simply need a small table-of-contents (TOC) table that has as many rows as there are partitions in the partitioned table. Each row contains the partition ID, the tenant name and a status column. The status could be Available, Loading, Unloading and Active. For sake of example, suppose you have ten tenants and 11 partitions. The TOC table will contain 10 rows in Active status, one for each tenant, and one row with status of Available.

    Now THAT sounds VERY interesting. How would you overcome the problem that I think that would create with the partitioning function border values? And wouldn't you have about the same amount of "downtime" (measured in ms, of course) as the two table do-si-do while you repointed the TOC table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There aren't really border values in the data. The partition function would be integer. In the example I gave where there are 10 tenants and one extra partition, I would create the partition function for values 1-11. You really create 12 partitions, but you only use partitions 1-11, 10 of them for active tenants, and one for staging.

  • Jeff and N_Muller,

    Thanks for your feedback. Jeff, we don't have an issue with downtime yet as we load the data in the off hours. However if we do start hitting problems with not having an "outage window" We could try your suggestion. For now I don't want to change the 20 packages that use this approach. We have looked at parameterizing the packages further or generating them from metadata tables but see that as a run scenario and we aren't there yet.

    N_Muller, I did create the fact tables with extra partitions, but they were for future customers. I think you are merging the sliding window scenario with the needs we have. That might work if we can address the concerns Jeff raised.

    Thanks to you both!

    [font="Tahoma"]Cheers,

    Sean :-D[/font]

  • Sean Woehrle (4/23/2014)


    For now I don't want to change the 20 packages that use this approach. We have looked at parameterizing the packages further or generating them from metadata tables but see that as a run scenario and we aren't there yet.

    Thanks for the feeback, Sean, and good article, by the way.

    Shifting gears a bit, it's for reason like this that I don't use things like SSIS. This would be a proverbial cake walk to change if a dynamic stored procedure where the work horse.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply