How to do parllel ETL for identical sources

  • I have identical source databases(SQL Server) that I want to perform parallel ETL into single DW. The number of source databases can vary, so some sort of dynamic connection configuration will be required. However, executing packages in a loop for each connection will not provide parallel behavior.

    Will appreciate if someone can provide some design pattern to achieve this. Being a new system, it has full flexibility to incorporate this.

  • I believe the merge join will provide you with the parallell capability that you are looking for.

    The Database has to be sorted in the same Order on the Fields that you join on so you will either need to use Sort Task or you a query (SP) as the Data Source and use the Order By Clause.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • sqlfavor (4/7/2010)


    I have identical source databases(SQL Server) that I want to perform parallel ETL into single DW. The number of source databases can vary, so some sort of dynamic connection configuration will be required. However, executing packages in a loop for each connection will not provide parallel behavior.

    Will appreciate if someone can provide some design pattern to achieve this. Being a new system, it has full flexibility to incorporate this.

    If you can use third-party solutions, check the commercial CozyRoc Parallel Loop Task. It can execute standard SSIS loop container concurrently, utilizing all your server processing resources. In tests, a CPU intensive sequential process when executed in parallel on 4-core machine was executed 3 times faster compared to the sequential. Cheers!

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • sqlfavor (4/7/2010)


    I have identical source databases(SQL Server) that I want to perform parallel ETL into single DW. The number of source databases can vary, so some sort of dynamic connection configuration will be required. However, executing packages in a loop for each connection will not provide parallel behavior.

    Will appreciate if someone can provide some design pattern to achieve this. Being a new system, it has full flexibility to incorporate this.

    You can create a generic SSIS package with database/table configurable using configuration files and call the same SSIS package at run time with different parameters using a batch file, or by scheduling them independently using job scheduler (e.g control-m, Argent)

    [font="Arial"]BASKAR BV
    http://geekswithblogs.net/baskibv/Default.aspx
    In life, as in football, you won’t go far unless you know where the goalposts are.[/font]

  • Is a trial version of CozyRoc available?

    I'm sure that it is a great product but I would need to prove that it would help with a specific project that I'm working on or that I will likely be assigned in the near future.

    Thanks.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (11/27/2011)


    Is a trial version of CozyRoc available?

    I'm sure that it is a great product but I would need to prove that it would help with a specific project that I'm working on or that I will likely be assigned in the near future.

    Thanks.

    Hi,

    When you test and develop from Visual Studio (BIDS), no license key is required. A license key is required for each physical machine where you want to schedule and execute packages under SQL Job Agent. You can get 30 days trial key using the included License application. For licensee identifier enter: TRIAL

    The latest version of the library can be downloaded from here.

    ---
    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

  • ok, thanks.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply