Incremental aggregation of fact table

  • Hi,

We have a scenario in our data warehouse where we want to "expand" one of our fact tables (100 million rows) after each daily ETL load. The new table is not an aggregation as such; one extra field will be added, and the rest of the data in the table will essentially be duplicated.

    What we need is a slick way of maintaining this new table based on daily changes to the underlying fact.

    Points:-

    - A complete rebuild takes 1 hour which is too long to run each day.

    - We have explored CDC, but the fact table's clustered index is not on the unique key; it is on the transaction month column, which is also the partitioning column.

    - There are approximately 200,000 new and changed rows in the fact table each day, although occasionally many more may be updated (e.g. if our warehouse's row-level security changes).

    - The inserts and updates applied to the fact table are very varied in nature. It would take a lot more ETL code to replicate all of that logic against the new table, and there is also the danger of the two tables drifting out of sync that way.
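    For context, the sort of delta-apply we had in mind would look roughly like the sketch below. This is only an illustration, not working code from our system: all table and column names (FactSales, FactSalesExpanded, ExtraLookup, EtlWatermark, FactKey, RowVer, etc.) are hypothetical, and it assumes we could add a rowversion column to the fact so that changed rows can be found without CDC, despite the clustered index being on transaction month.

    ```sql
    -- Read the high-water mark saved by the previous run.
    DECLARE @LastVer binary(8) =
        (SELECT LastVersion FROM dbo.EtlWatermark WHERE TableName = 'FactSales');

    -- Apply only rows whose rowversion is newer than the watermark.
    MERGE dbo.FactSalesExpanded AS tgt
    USING (
        SELECT f.FactKey, f.Amount, x.ExtraField   -- the one additional column
        FROM dbo.FactSales AS f
        JOIN dbo.ExtraLookup AS x ON x.FactKey = f.FactKey
        WHERE f.RowVer > @LastVer                  -- only the daily delta
    ) AS src
    ON tgt.FactKey = src.FactKey
    WHEN MATCHED THEN
        UPDATE SET tgt.Amount = src.Amount, tgt.ExtraField = src.ExtraField
    WHEN NOT MATCHED THEN
        INSERT (FactKey, Amount, ExtraField)
        VALUES (src.FactKey, src.Amount, src.ExtraField);

    -- Advance the watermark for the next run.
    UPDATE dbo.EtlWatermark
    SET LastVersion = @@DBTS
    WHERE TableName = 'FactSales';
    ```

    Note this sketch only covers inserts and updates; physical deletes from the fact (if we ever have them) would need separate handling, and the rowversion scan would still have to be cheap enough without an index on RowVer.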

    Does anybody have any cool ideas as to how to approach this, either at the database level or in the ETL process? (We are using SAP Data Services as the ETL tool but this shouldn't matter).

    Thanks for any assistance.

    b/
