• Although I'm glad to learn new ways of loading an multidimensional database structure, I do feel the solution to be a bit cumbersome.

    This could of course be due to that you dont have control over the complete ETL-chain.

    Our dw also has normalized database and multidimensional database. The norm is loaded daily from 20 different sources and we perform incremental updates of the two main fact tables, Sales and GL.

    From textfile to updated transaction table, given about 3 million updated rows in each table every night, we have a processing time of about 30 mins.

    The logic has the following bases

    1. Norm dim tables create and hold surrogate keys for items, accounts etc. This means surrogate keys are fixed and not recreated every night

    2. Norm fact tables have surrogate keys for; a: every dim member on the row, b: for the fact row itself

    3. All norm tables have an "LatestUpdated"-flag which is set when a nightly load updates the row

    4. Each load step is logged, saving the "latest succesful run data"-flag

    The dw update then do this

    1. Truncate and reload of all dimension tables. This works since norm keep all surrogate keys

    2. Incremental update of the Fact table using a comparison of "LatestUpdated" and "LastSuccessfulRun". Minimum rows are touched and surrogate keys are already in place

    3. Reports tables/cube loads follow...

    I think this approach seem simpler and less error prone.

    This of course require that you have control of how the norm database is loaded.