SSIS Data Flow Transformations

  • Hi, I am new to SSIS. I would like to know, when it comes to simple queries is SSIS, whether there are any advantages to using data flow transformations, instead of using the Execute SQL Task?

    I understand that there is additional functionality in the data flow tasks, like Fuzzy Lookups for example, but I am refering to simple joining and grouping of tables.

  • jdpowell777 (9/16/2011)


    Hi, I am new to SSIS. I would like to know, when it comes to simple queries is SSIS, whether there are any advantages to using data flow transformations, instead of using the Execute SQL Task?

    I understand that there is additional functionality in the data flow tasks, like Fuzzy Lookups for example, but I am refering to simple joining and grouping of tables.

    A fine question, but you're looking at the purpose of the tool slightly the wrong way. Joining and grouping should almost always be done in the SQL Engine, not the SSIS engine, unless it's nearly unavoidable or your data sources are very disparate. IE: Sometimes you really need to join some DB2 data to some MySQL data and dump it into an Oracle DB. Usually, though, you'd stage this data on the SQL Server.

    Execute SQL task in the Control flow is a one-shot run of your canned code. It's very useful for things like triggering off procedures against staged tables, returning maxDates for Delta changes, things like that. The Data Flow's true power is in the data-stream. While it's semi-RBar, it can handle things on a row by row basis much faster sometimes then a massive join once you've gotten it to the staging table.

    As an example, I took a 3 million row dump with a few lookup table updates coming in from 3 hours down to 10 minutes by judicious use of the Lookup component. As each row goes through the stream you can manipulate them VERY quickly, and can achieve some real speed differences instead of multi-passing the data on the target.

    So, to answer your question, do your best to avoid the MERGE JOIN/SORT/AGGREGATION components, they really will slow down the SSIS stream, unless you just can't seem to find a way to avoid them. Your best bet (usually) when bringing data, particularly from any one data source, that needs joins in SSIS is to do them before it ever gets to you.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Okay, great, thanks for the response

  • jdpowell777 (9/19/2011)


    Okay, great, thanks for the response

    My pleasure, good luck. 🙂


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply