Good morning all,
We have 30+ ETL applications that work well and are automated end to end, from data extraction, transformation, and load through to business reporting. We have a powerful production environment (N high-end processors, 16 GB RAM, and 800+ GB of hard disk). Currently, these ETL applications run one at a time as standalone scheduled jobs, each with its own start time.
I want to use this production environment more effectively by running two or more ETL applications at the same time. Doing so should use the resources more efficiently and significantly reduce the time it takes for business data to be ready. That is why I am thinking of an ETL scheduler. For example, if the current CPU utilization is below 30%, the ETL scheduler could trigger another ETL application.
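To make the idea concrete, here is a rough Python sketch of the decision rule I have in mind. It is only an illustration under assumptions: the job names, the function name pick_next_job, and the cap of 2 parallel jobs are made up for the example, and the CPU reading is passed in as a plain number rather than measured.

```python
from collections import deque
from typing import Deque, Optional

CPU_THRESHOLD_PERCENT = 30.0  # threshold taken from the example above
MAX_PARALLEL_JOBS = 2         # assumed cap for illustration, not a measured value


def pick_next_job(pending: Deque[str], cpu_percent: float, running: int) -> Optional[str]:
    """Return the next ETL job to start, or None if the scheduler should wait.

    pending     -- queue of job names waiting to run (hypothetical names)
    cpu_percent -- current CPU utilization, 0-100, supplied by the caller
    running     -- number of ETL jobs currently executing
    """
    if not pending:
        return None  # nothing left to schedule
    if running >= MAX_PARALLEL_JOBS:
        return None  # already at the assumed concurrency cap
    if cpu_percent >= CPU_THRESHOLD_PERCENT:
        return None  # machine is busy; check again on the next poll
    return pending.popleft()  # CPU is below the threshold: start the next job


if __name__ == "__main__":
    queue = deque(["etl_sales", "etl_inventory", "etl_finance"])
    print(pick_next_job(queue, cpu_percent=22.0, running=1))  # starts etl_sales
    print(pick_next_job(queue, cpu_percent=75.0, running=1))  # None, CPU too busy
```

In a real scheduler this function would be called in a polling loop, with cpu_percent coming from a monitoring source (for example psutil.cpu_percent, if that library is available) and the returned job launched as a separate process. CPU alone is probably not enough; memory and disk I/O would likely need the same kind of check.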
I am currently analyzing how to develop an ETL scheduler that schedules these 30+ ETL applications around the clock. I am collecting the resource utilization metrics and other factors that will be inputs to the ETL scheduler. Please share your thoughts.