Best primarykey and index

  • Hi all, i've to refactor a database with several tables with around 100.000.000 records each one.

    Each table has the following relevant columns:

    - ID (bigint)

    - Tenant (actually i have 10k tenants)

    - Year (all info has Always a filter search by year)

    - other 10/20 simple columns

    I was thinking to have:

    1) pk not clustered on ID

    2) clustered on (tenant,year,id)

    3) otger accessory not clustered index

    In alternative, i was thinking to apply a partition by year in each table.

    What do you think Will be the optimal architetture for thia database?

    All queries are by tenant and year.

    All updates are by id.

    The growth Will be of ~8.000.000 records year on many tables.

    Thanks

  • Thanks for posting your issue and hopefully someone will answer soon.

    This is an automated bump to increase visibility of your question.

  • I have to wonder why the Updates are being done by "Id" rather than by tenant and year.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi, the scenario most of the times is:

    1)Loading data in a grid (by tenant,year)

    2)user perform update on some rows, update by id

  • mto89 wrote:

    Hi, the scenario most of the times is:

    1)Loading data in a grid (by tenant,year)

    2)user perform update on some rows, update by id

    if you do your clustered index as you mentioned update by tenant, year, id may work better than just id.

  • mto89 wrote:

    1)Loading data in a grid (by tenant,year)

    Just one tenant at a time?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Yes all Is segregated by single tenant

  • Sounds to me like Tenant/Year might be the way to go for the queries, then.  Don't be alarmed by the high fragmentation that will occur on that clustered index.  It's just not going to matter and, unless you have "ExpAnsive" Updates involved (where rows become longer from things like a NULL Modified_BY or other variable width column being updated to a larger value), you won't have a thing to worry about when it comes to page density.  Such is the nature of "Sequential Siloed" indexes, which is what this will be.  I'd likely never defrag it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply