I partitioning ever simple?

  • Hi,

    *IS partitioning ever simple?

    I have a web report that I want to retrieve data as quickly as possible. The report is driven by a view of current year transactions - in December it will be about 20 million rows. Instead of taking 5 seconds to get 10k rows, I want it to take 1 second. Instead of 1 second for 200 rows, I want it to be instant.

    Sorry for the book I've written below. I tried to provide all the information I thought might help understand the situation.

    The report view has three tables. The main table has an account (~700 of these), cost center (~7,000 of these), several descriptive text fields and a few numeric columns (dollars, quantities). The second and third tables are flat file hierarchies for the cost centers and accounts respectively.

    On the first table I have a clustered index on cost center, account. On the second and third tables I have primary keys on cost center and account respectively.

    The account and cost center tables change every month or two and when they do, they are completely overwritten. The main table with texts, quantities and amounts is written from scratch once per day and swapped out instantly (the view points at a synonym).

    The web report ALWAYS filters on the account sub-category (about 70 of these) and returns all of the descriptive text fields and numbers. It almost always also filters on the cost center or one of the 8 levels of cost center hierarchy.

    Having heard the word "partition" but knowing nothing about it, I was thinking that my report might likely be faster if I either:

    a) write the sub-account into my main table and create a clustered index on sub-account, cost center (instead of clustered cost center, account and joining on the account table to get sub-account).

    or

    b) write the sub-account into my main table and partition the table on sub-account.

    or

    c)change the web report query so that it looks at not only account sub-category (~70 of these) but also account category (8 of these); write out 8 separate tables, one for each account category.

    Before looking into anything at all, I figured that b) would be the most straightforward as I probably just needed to "add a partition" after each time I drop/recreate my main table and before populating it, the same way I currently add an index. I figured it would be as simple as 'create partition over [account_sub_level] on myTableA and SQL Server would figure it out... But after doing an hour or two of reading up online, it sounds like it is WAY more complicated than I thought and it may require permissions that I don't necessarily have. I might end up trying to figure it out for a couple of days and realizing I can't do it. Maybe given the extent of what I am trying to accomplish I would be better of trying a) and if that doesn't work try c)...

    What do you think? Thanks for any advice!

    Tai

  • Maybe filtered indexes would be smarter? (I was reading this among other articles...http://www.sqlservercentral.com/blogs/practicalsqldba/2013/04/08/sql-server-part-9-filtered-index-a-new-way-for-performance-improvemnt/)

    I could create a filtered index for each account category or even each account sub-category. If I were to include all the fields that I need then this would amount to the same thing as partitioning but be easier for my brain to handle.

    The drawback I see is it would take twice the space (still under 15GB for a year of my data). Since I drop and re-create the main table each day and do no updates in between, would there be essentially no maintenance issue with this?

    Or would creating filtered indexes on every different account subcategory and including all fields in the index be just a stupid way of creating something that functioned similar to a clustered index on account category?

  • Partitioning usually isn't any faster than the right kind of query played against properly designed tables and indexes. Partitioning is to make index management easier, backups smaller, and a couple of other things but hardly ever for performance unless you also happen to get quite lucky.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff.

    From the looks of it I think there are a lot of other things I should learn before partitioning.

    Do you think that if I know I will always select on one particular field every single time, it might be worthwhile to create filtered indexes on each of the distinct values of that field? Or do I just need to make that my primary key and call it a day? There is also another field with quite a few more values, that is almost always selected on and that is currently my primary key...

    Any general thoughts on how to go from knowing nothing to knowing a fair amount about indexing and performance tuning? Start with the stairway to indexing on SSC and google around for whatever is out there on the subject?

    Thanks!

    Tai

  • Partitioning is not simple and it's not primarily for performance.

    Filtered indexes are nice, but they're not simple either and there are lots of scenarios where they can't be used by SQL.

    Basic indexing:

    http://www.sqlservercentral.com/articles/Indexing/68439/

    http://www.sqlservercentral.com/articles/Indexing/68563/

    http://www.sqlservercentral.com/articles/Indexing/68636/

    If you can post more details, like query, table definitions, index definitions and execution plan we may be able to help more.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • taigovinda (8/30/2013)


    Thanks Jeff.

    From the looks of it I think there are a lot of other things I should learn before partitioning.

    Do you think that if I know I will always select on one particular field every single time, it might be worthwhile to create filtered indexes on each of the distinct values of that field? Or do I just need to make that my primary key and call it a day? There is also another field with quite a few more values, that is almost always selected on and that is currently my primary key...

    Any general thoughts on how to go from knowing nothing to knowing a fair amount about indexing and performance tuning? Start with the stairway to indexing on SSC and google around for whatever is out there on the subject?

    Thanks!

    Tai

    There might be a little confusion here. The Primary Key has to contain unique non-null values. You can't just make any column the Primary Key. You could make any column the Clustered Index (which is quite different than the Primary Key even though both are usually used on the same column(s)) but that's not always a good idea. It really depends. See the links that Gail provided above for more information.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Gail and Jeff. I've read the articles which seem like a good start - I feel a little more confident now to experiment. (Btw Jeff I did mean to say clustered index not primary key but can see from the articles a few things I am obviously not handling correctly.) I will revisit what I am doing with a combination of clustered and non clustered indexes and report back how it goes.

    Thanks.

    Tai

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply