historization of high volume data

  • Hello !

    I'm new to sql server high volume usage. I have an application to develop and I would appreciate some advices.

    Say that I'll have to have those tables :

    - cars

    - trains

    - bykes

    - planes

    and for each of them, not a large volume at the beginning but my client wants them to be historized in order to retrieve the information 1 year later (at least). The volume estimation is around 2.000.000 rows for each table after 1 year and tables will have at least 30 columns.

    I've thinked of this kind of schema :

    - cars (PK)

    - carsInfo (FK)

    - trains

    - trainsInfo

    - bykes

    - bykesInfo

    - planes

    - planesInfo

    with foreign keys in the *Info tables.

    But my problem is that absolutely ALL info can change for those tables so every info can change from a month to another. You'll understand that it's not really cars and trains i'm working on but it's just to simplify here.

    I don't really know what to use in the indexes, views, a little help would be very useful.

    Thank you by advance.

    Nicolas

  • Gibou (10/5/2010)


    I'm new to sql server high volume usage. I have an application to develop and I would appreciate some advices.

    Say that I'll have to have those tables :

    - cars

    - trains

    - bykes

    - planes

    and for each of them, not a large volume at the beginning but my client wants them to be historized in order to retrieve the information 1 year later (at least). The volume estimation is around 2.000.000 rows for each table after 1 year and tables will have at least 30 columns.

    Tables holding 2 million rows are far for being "large volume". 🙂

    Gibou (10/5/2010)


    I've thinked of this kind of schema :

    - cars (PK)

    - carsInfo (FK)

    - trains

    - trainsInfo

    - bykes

    - bykesInfo

    - planes

    - planesInfo

    with foreign keys in the *Info tables.

    But my problem is that absolutely ALL info can change for those tables so every info can change from a month to another. You'll understand that it's not really cars and trains i'm working on but it's just to simplify here.

    Research "Slowly Changing Dimensions".

    Gibou (10/5/2010)


    I don't really know what to use in the indexes, views, a little help would be very useful.

    In regards to indexing plan indexes to enforce uniqueness as well as to help during data retrieval.

    In regards to views... ask yourself why would I need a view?

    Hope this points you in the right direction.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I agree... 2 million isn't alot of data, and don't think you need to create "history" tables that are separate from the regular tables... if you implement effective indexes, and you cluster on the right column, you should be ok. Just pay attention to the queries being run against the data (trace the table usage in profiler). Keep partitioning in your back pocket and don't pull it out unless you need it.

  • getoffmyfoot (10/5/2010)


    I agree... 2 million isn't alot of data, and don't think you need to create "history" tables

    This might be a valid alternative but there are others. That's why I suggested poster to research "Slowly Changing Dimensions" so to chose the solution that better fits the particular scenario.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply