• HanShi (9/4/2015)


    In my opinion this all depends on maintenance, complexity, and dependencies.

    With small tasks each in a separate package, it's easy to execute a single one (if required) and easy to run them in parallel.

    But with many separate packages it's hard to maintain sequential integrity when some packages may only run after others have succeeded. That's the point where you should integrate them into a single package.

    Also, when several packages use the same source data, it's easier to combine them into a single package: you only have to define (and thus read) the source once, then split the flow into several forks of that source data. Within this single package it's still possible to process the flows in parallel.
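    The "read once, fork into parallel flows" idea isn't SSIS-specific; as a rough sketch outside SSIS, it could look like the following Python snippet (the source and flow functions are hypothetical stand-ins, not real SSIS components):

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def read_source():
        # Read the shared source data exactly once (stubbed here).
        return [{"id": i, "value": i * 10} for i in range(5)]

    def load_flow_a(rows):
        # One fork of the source data.
        return ("flow_a", len(rows))

    def load_flow_b(rows):
        # Another independent fork of the same rows.
        return ("flow_b", sum(r["value"] for r in rows))

    rows = read_source()  # defined (and thus read) only once

    # Each fork receives the same in-memory rows and runs in parallel.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda f: f(rows), [load_flow_a, load_flow_b]))

    print(results)
    ```

    The point of the sketch is the shape, not the code: one read feeding several independent downstream flows, which a single SSIS package can express with a Multicast-style split.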

    Aye. For this particular flow, none of the packages depend on each other, so technically they could all run at once, if running 33+ packages in parallel were ideal. :w00t:

    But I think I can get away with all the dimensions in one package and a separate package per fact table, all controlled by a master package.

    1 package for dimensions, 5 packages for facts, 1 master package = 7 packages.
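    The master package's control flow for that layout could be sketched like this; the package names and the run_package helper are hypothetical placeholders for however the packages are actually executed (e.g. Execute Package tasks or dtexec):

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical names for the 1-dimension / 5-fact layout.
    DIM_PACKAGE = "LoadDimensions.dtsx"
    FACT_PACKAGES = [f"LoadFact{i}.dtsx" for i in range(1, 6)]

    executed = []

    def run_package(name):
        # Stand-in for actually executing an SSIS package.
        executed.append(name)
        return name

    # Master flow: dimensions load first, then all fact packages run
    # in parallel, since facts depend on dimensions but not on each other.
    run_package(DIM_PACKAGE)
    with ThreadPoolExecutor(max_workers=5) as pool:
        list(pool.map(run_package, FACT_PACKAGES))

    print(executed)
    ```

    This keeps the sequential constraint (dimensions before facts) in exactly one place, the master, while the five fact loads stay free to run concurrently.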