Keeping Fact Tables Online while Loading via Partition Switching

  • Comments posted to this topic are about the item Keeping Fact Tables Online while Loading via Partition Switching

  • shubhankarthatte - Wednesday, January 18, 2017 11:09 PM

    Comments posted to this topic are about the item Keeping Fact Tables Online while Loading via Partition Switching

    Hi Shubhankarthatte,
    Partition switching is fast, but it has a nasty weakness: it needs exclusive schema modification lock. While all the readers need schema stability lock on this table. So part switch waits  when all readings from fact table are finished and unlock schema. Part switching in its turn holds next reading operations. In this way, one long running reading from fact table blocks other readers indirectly, if partition switching stays between them and holds the queue.
    You need to guarantee no select operation can run longer than several seconds or whatever delay you can afford.
    I tried to force all the readers use my stored procedures, which were optimised properly, but that didn't save us from endless running query one sunny day, which blocked all dwh functionality.

  • This is a great overview article.  The pictures save a thousand words.

    My only disappointment is that code was only provided for one of the 3 scenarios.  It would also have been interesting and helpful to have seen what the Partition Scheme and Partition Function looked like as well as what controls were used to limit the Medium to Large partitions to 10M rows.

    Also, was any consideration made to making the older partitions Read_Only to save on backup and any index maintenance times?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Will the schema swapping break any views or stored procedures referencing the original dbo.table after completion?

  • I think in the first example (of loading relatively small fact table and switching tables from schema to schema) one step is missing: Step 6 - DROP swap.table.  Otherwise, you will not be able to switch the table from "dbo" to "swap" schema on the next load.

  • kosha5 - Thursday, January 19, 2017 2:39 AM

    shubhankarthatte - Wednesday, January 18, 2017 11:09 PM

    Comments posted to this topic are about the item Keeping Fact Tables Online while Loading via Partition Switching

    Hi Shubhankarthatte,
    Partition switching is fast, but it has a nasty weakness: it needs exclusive schema modification lock. While all the readers need schema stability lock on this table. So part switch waits  when all readings from fact table are finished and unlock schema. Part switching in its turn holds next reading operations. In this way, one long running reading from fact table blocks other readers indirectly, if partition switching stays between them and holds the queue.
    You need to guarantee no select operation can run longer than several seconds or whatever delay you can afford.
    I tried to force all the readers use my stored procedures, which were optimised properly, but that didn't save us from endless running query one sunny day, which blocked all dwh functionality.

    Thanks for your valuable input and I agree with your point. The second approach has been used, to enable users to query the main table while the load is occurring independently. As per your point, part switching just has to wait for user queries to finish to clear the queue. I guess the only way to prevent the part switching to sit in the queue is to ensure that queries running on the dwh are optimized to give good performance. However once the queue is clear the switching process is fast.

  • Jeff Moden - Thursday, January 19, 2017 5:01 AM

    This is a great overview article.  The pictures save a thousand words.

    My only disappointment is that code was only provided for one of the 3 scenarios.  It would also have been interesting and helpful to have seen what the Partition Scheme and Partition Function looked like as well as what controls were used to limit the Medium to Large partitions to 10M rows.

    Also, was any consideration made to making the older partitions Read_Only to save on backup and any index maintenance times?

    Thanks for your feedback Jeff. The intention of this article was to give overview without giving very specific details. However the partitions were created for every month and partition function was to map the rows to the partition.

    Regarding your point on controls to restrict the rows to specific count, there was none. Based on the analysis of the source system, we had fairly good idea of the monthly load.

    Re your query on "was any consideration made to making the older partitions Read_Only", the daily batch looked back 3 months as we knew there was not much change in the data beyond that. Weekend batch was run to look at yearly data.

  • Tom Blaylock - Thursday, January 19, 2017 10:40 AM

    Will the schema swapping break any views or stored procedures referencing the original dbo.table after completion?

    Hi Tom, The process doesn't break the views or procs as long as the swapping has completed successfully.

  • Misha_SQL - Sunday, January 22, 2017 9:53 AM

    I think in the first example (of loading relatively small fact table and switching tables from schema to schema) one step is missing: Step 6 - DROP swap.table.  Otherwise, you will not be able to switch the table from "dbo" to "swap" schema on the next load.

    You don't need to drop swap.table as after step 4  there is no table in the swap schema as you have already moved the table to the shadow schema.

  • shubhankarthatte - Saturday, February 11, 2017 10:19 PM

    Jeff Moden - Thursday, January 19, 2017 5:01 AM

    This is a great overview article.  The pictures save a thousand words.

    My only disappointment is that code was only provided for one of the 3 scenarios.  It would also have been interesting and helpful to have seen what the Partition Scheme and Partition Function looked like as well as what controls were used to limit the Medium to Large partitions to 10M rows.

    Also, was any consideration made to making the older partitions Read_Only to save on backup and any index maintenance times?

    Thanks for your feedback Jeff. The intention of this article was to give overview without giving very specific details. However the partitions were created for every month and partition function was to map the rows to the partition.

    Regarding your point on controls to restrict the rows to specific count, there was none. Based on the analysis of the source system, we had fairly good idea of the monthly load.

    Re your query on "was any consideration made to making the older partitions Read_Only", the daily batch looked back 3 months as we knew there was not much change in the data beyond that. Weekend batch was run to look at yearly data.

    That's what I'm talking about.  The older data is likely static.  If the older partitions were set to READ_ONLY and a final backup of those partitions were taken, there would be no need to back them up ever again nor ever have to do any index maintenance ever again.  It would also allow for piece-meal restores and allow a really large database to be partially restored to "get back in business" and then allow the legacy partitions to continue to be restored instead of waiting for the whole database to be restored.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply