• Randy Knight (11/12/2012)


    rhythm.varshney (11/12/2012)


    Yes. I am importing the data into a staging table as-is, and there I am splitting a column, say name, into first name and last name on the space between the two.

    So you mean to say a Derived Column would be the better approach?

    Yes. This is going to be much easier to do in the SSIS expression language. And while it is a bit of a paradigm shift, get away from the staging table idea. We want to be doing ETL (Extract, Transform, Load) rather than ELT (Extract, Load, Transform). SSIS excels at doing these things in memory, in the pipeline. Writing to staging table(s) in the interim slows things down considerably. When things get overly complex for a single dataflow, I will often write to raw files for the interim step rather than a db table ... still much faster.

    Right up until you need to sort data for aggregations or the like, I agree with you. When you need to do that, T-SQL tends to blow the dataflow stream away. I agree with the general principle though. With a staging table you need to write at least twice (once in, once transformed), whereas if you transform before you write, it's a single dropoff.

    However, don't completely throw away the idea of a staging table. You cannot UPDATE or DELETE rows in an existing table in bulk from within the SSIS dataflow (the OLE DB Command transformation runs row by row). If you're doing a MERGE equivalent and not merely delivering new rows to the existing table, deliver to staging and then MERGE it in. That's still part of the LOAD component of ETL though, so I may just be splitting hairs with what Randy was recommending.
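
    As a rough sketch of the deliver-to-staging-then-MERGE step, assuming the dataflow has already split the name and landed the rows in a staging table: the table names (stg.Customer, dbo.Customer) and columns (CustomerID, FirstName, LastName) below are made up for illustration, so swap in your own. You would typically run something like this from an Execute SQL Task after the dataflow finishes.

    ```sql
    -- Hypothetical objects: stg.Customer is the staging table the dataflow
    -- loads; dbo.Customer is the existing target table. Both are assumed to
    -- share the key CustomerID.
    MERGE dbo.Customer AS tgt
    USING stg.Customer AS src
        ON tgt.CustomerID = src.CustomerID
    -- Existing rows: update only when something actually changed.
    -- Note: <> does not treat NULLs as a difference; add ISNULL/IS NULL
    -- checks if these columns are nullable.
    WHEN MATCHED AND (tgt.FirstName <> src.FirstName
                   OR tgt.LastName  <> src.LastName) THEN
        UPDATE SET tgt.FirstName = src.FirstName,
                   tgt.LastName  = src.LastName
    -- New rows: present in staging, missing from the target.
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (CustomerID, FirstName, LastName)
        VALUES (src.CustomerID, src.FirstName, src.LastName)
    -- Removed rows: only keep this clause if staging holds a FULL extract,
    -- otherwise it will delete everything not in the current batch.
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;

    -- Clear staging for the next run.
    TRUNCATE TABLE stg.Customer;
    ```

    The point is that the set-based UPDATE/DELETE happens once in the engine, instead of row by row in the pipeline.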


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions | Forum Netiquette
    For index/tuning help, follow these directions. | Tally Tables

    Twitter: @AnyWayDBA