Inremental loading SSIS- 200-300 tables

  • Hi Guys,

    my project started 2 months ago, and alalready'i'mransferring over 100 tables on each process i run to the DWH.

    i'll probably reach 200-300 tables pretty soon (a huge DB i start with i know, surprised me too ).

    since my BI is OEM ("out of the box") i still have 3 weeks versions (product dev sprints),

    and tables are still changing their structure (data types, column names , new columns etc ) which gave me a real headache , so i ignored it for the first few weeks.

    how did i ignore it :

    1. truncated all the tables before i take them to the ODS/MRR layer

    2. took all the data from the source system fully to the MRR layer

    3. created only the dimensions "increamental" table (which still changes every week with new columns and changed data types )

    4. dynamically creating and populating the staging tables and the warehouse tables.

    now my model started to form, so i have to take care of the incremental loads

    1. it seems easy since i have updatetime for each record , but i also have deletions in my shource system , how can i approach this ?

    2. i've concidered CDC (but i have to put it table by table .. )

    3. any solutions for someone who starts with 100-200 tables ???

    (hellllppppp 🙂 )

    Nadav

  • nedneu (2/10/2013)


    Hi Guys,

    my project started 2 months ago, and alalready'i'mransferring over 100 tables on each process i run to the DWH.

    i'll probably reach 200-300 tables pretty soon (a huge DB i start with i know, surprised me too ).

    since my BI is OEM ("out of the box") i still have 3 weeks versions (product dev sprints),

    and tables are still changing their structure (data types, column names , new columns etc ) which gave me a real headache , so i ignored it for the first few weeks.

    how did i ignore it :

    1. truncated all the tables before i take them to the ODS/MRR layer

    2. took all the data from the source system fully to the MRR layer

    3. created only the dimensions "increamental" table (which still changes every week with new columns and changed data types )

    4. dynamically creating and populating the staging tables and the warehouse tables.

    now my model started to form, so i have to take care of the incremental loads

    1. it seems easy since i have updatetime for each record , but i also have deletions in my shource system , how can i approach this ?

    2. i've concidered CDC (but i have to put it table by table .. )

    3. any solutions for someone who starts with 100-200 tables ???

    (hellllppppp 🙂 )

    Nadav

    CDC might be your best bet since the size of data is likely to outrun your ability to take all data every night and run a MERGE statement to reconcile what changed. Attunity provides free components for SSIS 2012 to help with this: CDC Flow Components

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • When you use SSIS, your source should be stabilized before you start writing ETL.

    Once it does stabilize, I would also go with CDC, as opc.three mentioned.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply