Partitioned Views

  • Comments posted to this topic are about the item Partitioned Views

    Regards
    Brian Ellul
    ----------------------------------------------------------------------------------
    Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
    - Albert Einstein -

  • hi Brian,

Are you aware that SQL Server 2014 will have per-partition (incremental) statistics?

    We expect that will help us tremendously.
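For what it's worth, this feature shipped as "incremental statistics". A minimal sketch of the syntax as it eventually appeared (table, column, and statistics names here are hypothetical):

```sql
-- Incremental statistics (SQL Server 2014+): the histogram is
-- maintained per partition, so one partition can be resampled
-- without rescanning the whole table. Names are hypothetical.
CREATE STATISTICS st_OrderDate
ON dbo.Orders (OrderDate)
WITH INCREMENTAL = ON;

-- Refresh only partition 2 after a load, instead of the full table:
UPDATE STATISTICS dbo.Orders (st_OrderDate)
WITH RESAMPLE ON PARTITIONS (2);
```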

    Also, I've found this video very useful: http://technet.microsoft.com/en-us/sqlserver/gg545008.aspx

    It can be found together with a lot of other good stuff at:

    http://www.sqlskills.com/T_MCMVideos.asp

    Best regards,

    Henrik

You can create multiple filtered statistics yourself that do not overlap in key range; this way you can get multiple blocks of 200-entry statistics, even on a single table. Use this to get more accurate statistics within each of your partitions (which, in the case of partitioned views, are first-class tables in their own right) and thereby reduce the need to go to extremes with the number of tables you have to create and maintain.

I haven't tried this process myself as I am nowhere near the point where it would give me major benefits, but your case seems a legitimate one for this approach. Just look up CREATE STATISTICS in BOL if you haven't yet, and maybe you can write a second article on this optimization work and make a nice comparison. Filtered statistics were, I think, introduced with SQL Server 2008.
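A minimal sketch of that idea, assuming a date-keyed table (all names are hypothetical): each filtered statistic gets its own 200-step histogram over its own slice of the key range.

```sql
-- Non-overlapping filtered statistics, each with its own 200-step
-- histogram over a distinct key range. Names are hypothetical.
CREATE STATISTICS st_Orders_2012
ON dbo.Orders (OrderDate)
WHERE OrderDate >= '20120101' AND OrderDate < '20130101';

CREATE STATISTICS st_Orders_2013
ON dbo.Orders (OrderDate)
WHERE OrderDate >= '20130101' AND OrderDate < '20140101';
```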

  • There are two more less-known problems with partitioned views that you sometimes have to work around:

1. On inserts you need to specify even nullable columns (just like with INSTEAD OF INSERT triggers... it seems that, code-wise, they are somehow linked).

2. Inserting records into the view with the same view as the source is not allowed. If I recall correctly, this even extends to the most obvious workarounds, such as using one of the view's source tables instead.
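Both points can be illustrated with a hypothetical two-table partitioned view (every name below is made up for the sketch; each member table is assumed to carry a CHECK constraint on OrderDate, and Note is nullable):

```sql
-- Hypothetical partitioned view over two yearly member tables.
CREATE VIEW dbo.Orders
AS
SELECT OrderID, OrderDate, Note FROM dbo.Orders_2012
UNION ALL
SELECT OrderID, OrderDate, Note FROM dbo.Orders_2013;
GO

-- Point 1: every column must be listed, even the nullable Note.
INSERT INTO dbo.Orders (OrderID, OrderDate, Note)
VALUES (1, '20130315', NULL);

-- Point 2: using the view as both target and source fails:
-- INSERT INTO dbo.Orders (OrderID, OrderDate, Note)
-- SELECT OrderID + 1000000, OrderDate, Note FROM dbo.Orders;
```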

  • Hi,

    Consider the following scenario:

1. Application X has a few very large transaction tables, say t1, t2, t3, etc.

    2. Application is only in maintenance mode and no changes to the code are allowed.

In this scenario, if we wish to reap the benefits of partitioned views, one of our options is the following:

    1. Partition the transaction tables and create new tables as t1_1, t1_2, t1_3, t2_1, t2_2, t2_3, t3_1, etc.

2. Create views with the same names the transaction tables had before partitioning, i.e. t1, t2, t3, etc.

    In this case we would not need any changes to the code of application.
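For concreteness, steps 1 and 2 might look something like this (a rough sketch; every name is hypothetical, and each partition table needs a CHECK constraint on the partitioning column for the view to behave as a partitioned view):

```sql
-- Step 1: move the existing huge table aside and create the
-- remaining partition tables (t1_2, t1_3, ...) with matching
-- schemas and CHECK constraints on the partitioning column.
EXEC sp_rename 'dbo.t1', 't1_1';

-- Step 2: recreate the original name as a partitioned view, so
-- application code keeps working unchanged.
CREATE VIEW dbo.t1
AS
SELECT col1, col2, PartitionCol FROM dbo.t1_1
UNION ALL
SELECT col1, col2, PartitionCol FROM dbo.t1_2;
```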

Does anyone have any real-life experience with this approach? What is your opinion on it?


    Kindest Regards,

    darshan Joshi

  • darshan (12/6/2013)


    Hi,

    Consider the following scenario:

1. Application X has a few very large transaction tables, say t1, t2, t3, etc.

    2. Application is only in maintenance mode and no changes to the code are allowed.

In this scenario, if we wish to reap the benefits of partitioned views, one of our options is the following:

    1. Partition the transaction tables and create new tables as t1_1, t1_2, t1_3, t2_1, t2_2, t2_3, t3_1, etc.

2. Create views with the same names the transaction tables had before partitioning, i.e. t1, t2, t3, etc.

    In this case we would not need any changes to the code of application.

Does anyone have any real-life experience with this approach? What is your opinion on it?

Yes I have, and if the application follows the rules I outlined in my post with respect to existing queries on the partitioned view, it will work. You might still encounter one problem, though, and it has to do with the primary key, if that happens to be unsuited: for example, an IDENTITY column in it, or a first column that cannot be partitioned on efficiently.

Best is to examine the table structure and replicate it in a test database. Then also take a trace of the application in action and check all the insert statements to see if any violates the rules I posted. You never know what an application does until you have checked it or written it yourself.

Sorry for the interruption, but I have to make sure that people understand a couple of things...

    I'd like to warn everyone that reads this article or any other article on "how easy partitioning is" that it's not actually an easy thing to do except on totally static data (per underlying partitioned table) where the PK of the table is NOT used as an FK anywhere. As touched on but seriously underplayed in this article, if FK's point to the PK of this table, you'll need to make some major changes and you'll also need to understand that the partitioning column MUST be included in the PK. On partitioned tables, the partitioning column MUST be included in ANY and ALL unique indexes, as well.

    What does that mean and why am I so concerned about all of this? Stop and think about it... if you add the stereotypical date or region column as the partitioning column, it must be added to the PK and, although the now 2 column PK can still be unique, the actual column that you were counting on being unique is no longer unique in and of itself. This can really raise hell with the PK and other columns that you actually want to contain unique values throughout the table.

    You also can't make the partitioning column of partitioned views an IDENTITY column (you can in partitioned tables but that's a whole 'nuther set of problems). And, as also previously pointed out, you must insert ALL columns of data when doing an INSERT to partitioned views which pretty much leaves out the ability to make insertable partitioned views if an IDENTITY column is present. That means that you have to use a "staging table" if you intend to use an IDENTITY column and that means that you either need to make an INSTEAD OF trigger on the view to process inserts or you need to change all the software that points at the view (which used to be a monolithic table) to now insert to the underlying table (almost MUST be a staging table if an IDENTITY column is involved) and read from the view.
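As a rough illustration of the INSTEAD OF trigger route mentioned above (everything here is a hypothetical sketch, with OrderDate as the partitioning column):

```sql
-- Hypothetical INSTEAD OF INSERT trigger on a partitioned view,
-- routing each inserted row to the member table that owns its
-- date range.
CREATE TRIGGER trg_Orders_Insert
ON dbo.Orders          -- the partitioned view
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.Orders_2012 (OrderID, OrderDate, Note)
    SELECT OrderID, OrderDate, Note
    FROM inserted
    WHERE OrderDate < '20130101';

    INSERT INTO dbo.Orders_2013 (OrderID, OrderDate, Note)
    SELECT OrderID, OrderDate, Note
    FROM inserted
    WHERE OrderDate >= '20130101';
END;
```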

Don't get me wrong... partitioning of really large tables (especially temporal tables) provides huge advantages for backups, index/stats maintenance, and piecemeal restores, but partitioning an existing table (even if it doesn't have an IDENTITY column), either as a partitioned view or a partitioned table, is not a trivial task, and careful planning is required or you're going to break a lot of data and/or break a lot of applications. As also previously suggested, you really should build a parallel system to experiment with and to make sure you understand all of the things that can go wrong, and the things that you need to do to keep them from going wrong, before the words "partition production tables" ever leave your lips.

    Please... carefully plan any partitioning you might do and make sure that you understand ALL the nasty little caveats that are inconveniently spread out all over hell's little half acre in Books Online. Then test the partitioning on a separate system until you're absolutely sick of testing and then test some more.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • peter-757102 (12/5/2013)


You can create multiple filtered statistics yourself that do not overlap in key range; this way you can get multiple blocks of 200-entry statistics, even on a single table. Use this to get more accurate statistics within each of your partitions (which, in the case of partitioned views, are first-class tables in their own right) and thereby reduce the need to go to extremes with the number of tables you have to create and maintain.

I haven't tried this process myself as I am nowhere near the point where it would give me major benefits, but your case seems a legitimate one for this approach. Just look up CREATE STATISTICS in BOL if you haven't yet, and maybe you can write a second article on this optimization work and make a nice comparison. Filtered statistics were, I think, introduced with SQL Server 2008.

Thanks for your feedback. You're right, I should have mentioned that in the article for the sake of completeness, since it would address the 200-step limitation; however, inaccurate statistics is just one of the issues that partitioned views help to solve...

    Regards
    Brian Ellul

  • darshan (12/6/2013)


    Hi,

    Consider the following scenario:

1. Application X has a few very large transaction tables, say t1, t2, t3, etc.

    2. Application is only in maintenance mode and no changes to the code are allowed.

In this scenario, if we wish to reap the benefits of partitioned views, one of our options is the following:

    1. Partition the transaction tables and create new tables as t1_1, t1_2, t1_3, t2_1, t2_2, t2_3, t3_1, etc.

2. Create views with the same names the transaction tables had before partitioning, i.e. t1, t2, t3, etc.

    In this case we would not need any changes to the code of application.

Does anyone have any real-life experience with this approach? What is your opinion on it?

    Hi

In fact, that's exactly what I'm trying to accomplish: split the underlying tables into multiple tables and combine them again with a view. I'm in the process of testing all this out against our production systems, so although it is not yet live, this is the approach I'm adopting.

    Regards

    Brian


  • Great!

    Please keep us posted about your findings in Production!

    Thanks..


    Kindest Regards,

    darshan Joshi

Could you please explain why the statistics show logical reads against an unknown table called 'Worktable' instead of the base table?

  • "Worktable" is a table created in TempDB by the system to make things like hash tables for hash joins and other temporary objects.

    --Jeff Moden


