Why did added partitions make cube processing slower?

  • We have a cube based on a 300 Million record fact table. The cube originally had 4 partitions (one per year) and took 30 minutes to process. Then we changed it to 48 partitions (one per month) thinking the performance would increase even further. However, the opposite happened and now it's taking 2 hours to process .

    Can someone help explain why?

    We know the obvious answer would be to just switch back, but we are trying to understand why adding partitions is not increasing performance as it's supposed to do in theory.

    Some notes:
    1. Both SSDE and SSAS are on the same development computer/server.
    2. The server has 20 GB of RAM and 1 CPU with 4 cores (no threading).
    3. We configured SSDE max memory to 8 GB (also tried 6 GB) with a minimum of 2 GB
    4. We configured SSAS low, total, and hard memory limits to 9 GB, 13 GB, and 14 GB respectively.
    5. The cube is very basic at this point there are no hierarchies, attribute relationships, or aggregations yet.

  • ptownbro - Sunday, August 20, 2017 10:10 AM

    We have a cube based on a 300 Million record fact table. The cube originally had 4 partitions (one per year) and took 30 minutes to process. Then we changed it to 48 partitions (one per month) thinking the performance would increase even further. However, the opposite happened and now it's taking 2 hours to process .

    Can someone help explain why?

    We know the obvious answer would be to just switch back, but we are trying to understand why adding partitions is not increasing performance as it's supposed to do in theory.

    Some notes:
    1. Both SSDE and SSAS are on the same development computer/server.
    2. The server has 20 GB of RAM and 1 CPU with 4 cores (no threading).
    3. We configured SSDE max memory to 8 GB (also tried 6 GB) with a minimum of 2 GB
    4. We configured SSAS low, total, and hard memory limits to 9 GB, 13 GB, and 14 GB respectively.
    5. The cube is very basic at this point there are no hierarchies, attribute relationships, or aggregations yet.

    Couple of things ..i am not expert in Cube Development process but adding partitions should definitely  help with improved performance , but what happening is it require more RAM memory to render the data and store the data.
    check  points to observe ..
    1. check  if the  ram allocated is being used and releasing when mdx is executing , once the limit 9 reaches SSAS engine will release the memory back and some times kills the query with memory pressure warning.
    2. adjust max no of processing threads from default to working number.
    3. and  try to allocate PREALLOCATE MEMORY  ( most improtant part )...
    4. and i hope when you are saying 9,13,14GB is in bytes but not in % .. usually its given in % as boundaries but anything more then 100% is considered as bytes and to 
    give fixed amount of RAM size  i prefer to give it in bytes calculated.

    FYI.. and no matter what you do FE is always a single thereaded process..  if you are using 2016 use extended events  to see which part of mdx is taking longer time.

    @JayMunnangi

  • I stand to be corrected on the following statements.  However, the 1 time that I have used partitions, I have found the same issue.

    Partitions are not a GO FAST switch for queries.
    In certain cases, they can improve performance, but that requires all indexes used by the query and all data retrieved by the query to be on a single partition.  As soon as you start crossing partition boundaries, things start to slow down.

  • I agree.  Partitioning is definitely not a "Go Fast" method.  It's a method for managing large datasets to reduce the need for index maintenance and can be used to seriously reduce the size of backups especially when older temporal partitions are "packed" and set to Read_Only (I reduced >10 hours backups to about 6 minutes using that technique).

    It's ironic that people think that partitioning improved their performance.  The clustered index needs to be setup a certain way for partitioning and, ironically, that almost always improves the performance in monolithic table structures, as well.  In fact, a monolithic structure will perform better with the same indexes than partitioned tables will.  And the idea of "partition elimination"?  It only works well for otherwise crap code and refactoring the code to use indexes correctly will frequently see performance improvements of tens and hundreds of times faster than settling for mere partition elimination.

    Part of the reason for the relatively slower performance of partitioned tables, from what I've been led to understand, is that each partition has it's own B-Tree much like a separate table would have.  That means that the system frequently has to look through more than 1 B-tree and smaller B-Trees are less effective than larger ones.

    Search for "Partitioned indexes" in the following article by a former Microsoft Employee.
    https://blogs.msdn.microsoft.com/hanspo/2009/08/21/inside-of-table-and-index-partitioning-in-microsoft-sql-server/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • For those mentioning index maintenance, queries, etc, the OP's asking about partitions of Analysis Services cubes, not SQL Server tables. Very different beasts.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Do read the following http://bidn.com/Blogs/Performance-Tuning-of-Tabular-Models-in-SSAS-2012-Whitepaper-Now-Available and the white paper mentioned on it.

    Partitions are good for incremental loading - or if your server and tabular instances are correctly configured, and assuming you set the parallel processing option on on your xmla to process the cube to allow for parallel processing of the partitions.

    Just creating partitions and not doing the remaining work will not improve performance at all.
    Other things to look at
    - cardinality of the columns - the highest cardinality the worst it will perform. 
      Look at possible targets to reduce the cardinality - such as datetime columns - split into date and time.
    - order the data is read - this will affect both compression ratio and performance - but may not be suitable for partitioned tables

    If not doing incremental load are you/can you do "process clear" before doing "process full"? this would also help in cases where memory is limited as in your server.

    And you mentioned that your server has threading off - if it is a physical server turn it on as it will perform better.

  • It's funny how the term "incremental load" seems to have changed over time.  To me, an "incremental load" has always meant an "upsert" where items that already exist aren't reloaded, items that need to be updates are updated, and new items are inserted.

    Using "SWITCH" with partitioning won't do that and won't be an "incremental load" unless the load will naturally not produce any duplicate information.  Even then, the partition must match the "SWITCH" in almost exactly.  I call that a "Fast Bulk Load" but I guess 3 words are harder to remember than 2. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks everyone for your responses. What I'm getting from this is that by adding more simultaneous processes, it's putting pressure on the RAM and likely processing some items on disk. I'll have to watch this more closely during our processing to see if this is the issue.
    I'll also try process clear before process full and see if that helps.
    Thanks!
    By the way, I'm asking about partitioning in SSAS not SSDE.

  • GilaMonster - Monday, August 21, 2017 12:51 AM

    For those mentioning index maintenance, queries, etc, the OP's asking about partitions of Analysis Services cubes, not SQL Server tables. Very different beasts.

    I definitely fell into that trap.  Didn't even look to see which forum the post was in.  Thanks, Gail.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Saturday, August 26, 2017 2:12 PM

    It's funny how the term "incremental load" seems to have changed over time.  To me, an "incremental load" has always meant an "upsert" where items that already exist aren't reloaded, items that need to be updates are updated, and new items are inserted.

    That's what it means in the DB engine, but not in SSAS. It's possible to partially update a cube (basically, reload just one or more partitions from source data rather than the whole cube) and then rebuild just that partition's aggregations.
    That's pretty much all I know about the process though. 🙂

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply