MULTIPLE DATAFLOW vs. MULTIPLE SOURCE and UNION ALL

  • Currently I am working on SQL 2008. BUT I have a general question, and wants to get some feedback or advice or even see if someone has done a test on it.

    I have a stage table on SQL. and I need to insert into SQL table (one table) on same server.

    My source inside dataflow will be a sql query selecting from stage table based on WHERE condition. SO now which is faster? or even from others persepective which is a better design.?? Here are two scenerio.

    No. 1 Scenerio - Have multiple dataflow tasks that runs in parallel and inserts into same SQL table

    So, something like this

    Dataflow1 DataFlow2 DataFlow3 DataFlow4 DataFlow5 DataFlow6 DataFlow7

    Select * same here so on.......

    From Stage for type = 2

    Where type = 1

    and

    inserts into Prod table with in each dataflow.........................................

    OR No2. Scenerio

    DataFLow ( one data flow)

    Multiple OLEDB SOurce ( here in this case 7, for each type, ) and same above sql query to slect rows

    Do a union all....

    and Insert into Prod table

  • Why can't you write one SQL query to pull all results in one shot? Can you post the table DDL for your staging table and some sample data?

    The article link in my signature gives great tips on posting your sample data.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • What kind of transformations are you making to the data?

    If you are not doing some major transformations than I would recommend that you just use T-SQL to move the data from the staging table to the production table.

  • Thanks Guys.....

    @john-2, I cannot do in one shot. I know where u coming from. But its just too much logic to handle and lots of data. I am doing cross join to all the stores so , i.e. like 4000 times no of records of products times prodroles. Also, I have to get value from different other tables, and same records might have differet roles.

    So u think doing T-SQL inside SSIS is better for insert. There is already too much T-sql inside ssis. We are even capturing all the business logic and requiremnets inside ssis (without the use of proc), Actual s-ql code.

    That means few procs over 3000 lines of code, inside ssis. Best of luck to me.

  • Well, if you've already got the data in a staging table, you may consider doing as Jack suggests and just move it with T-SQL. If the data is already in the database, why use SSIS? SSIS is a robust ETL tool (Extract - Transform - Load) but it sounds like you are trying to to an ELTR (Extract - Load - Transform - Reload).....I just made that acronym up by the way.

    If you think you need to do this inside of SSIS, I'm not sure if I can tell you which method would be better. I would go with the method that gives you the most code re-use. You don't want to duplicate all of that business logic in multiple places. That would turn into a maintenance nightmare down the road. I would recommend coming up with a solution that allows you to keep all of your business logic (transformations) in one data flow. Yes, it may get complex with UNIONS and multiple sources, but you really don't want to duplicate the business logic all over the place.

    Or, use T-SQL to move the rows into your production table.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply