Restricting Data before using OLE DB Destination

  • I just used the unpivot tool, so now I have more columns than I need. I know how to map the columns. If I need to use the SQL command, I don't know how I'd use it. I know the above is an either/or situation.

    Since my data has been unpivoted in an effort to normalize it, I need to make sure that the information I'm inputting into the destination is DISTINCT. What tool do I need to get DISTINCT data into my destination?

    Thanks!

  • The Aggregate transformation. Just group by every column.

    Be aware that it's a stream-stopper. Because the data is unsorted, it has to take in every row before it releases any, so any following constructs in the stream won't start until the Aggregate completes.

    Under most circumstances I tend to do aggregations at the db tier, as it's better able to handle the workload. If you're dropping this to a staging table immediately afterward, you'll be better off (unless the volume difference is drastic) dumping everything there and then SELECT DISTINCT'ing into your real table.

    If it's for a flat file or something similar, it's just the price of doing business; try to make sure the Aggregate sits late in the stream.
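
    For illustration, the staging approach might look roughly like this in T-SQL. It is only a sketch: dbo.Staging_Phone, dbo.Phone, and the column names are hypothetical stand-ins, not anything taken from the original package.

    ```sql
    -- Hypothetical names: the data flow loads dbo.Staging_Phone as-is
    -- (duplicates included); dbo.Phone is the real destination table.
    INSERT INTO dbo.Phone (CustomerID, PhoneType, PhoneNumber)
    SELECT DISTINCT
           CustomerID,
           PhoneType,
           PhoneNumber
    FROM   dbo.Staging_Phone;
    ```

    In a package, a statement like this would typically run from an Execute SQL Task placed after the data flow that fills the staging table.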


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.


    Twitter: @AnyWayDBA

  • Jacob Pressures (2/19/2013)


    I just used the unpivot tool, so now I have more columns than I need. I know how to map the columns. If I need to use the SQL command, I don't know how I'd use it. I know the above is an either/or situation.

    Since my data has been unpivoted in an effort to normalize it, I need to make sure that the information I'm inputting into the destination is DISTINCT. What tool do I need to get DISTINCT data into my destination?

    Thanks!

    Quick note: if your source is an RDBMS, it would be faster (probably much faster) to do all of this using SQL rather than SSIS.
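
    As a rough sketch of that idea, T-SQL can unpivot and de-duplicate in a single statement. The table dbo.CustomerWide, the destination dbo.Phone, and all column names here are hypothetical. Note that UNPIVOT requires the listed columns to share the same data type and length, and it drops rows where the value is NULL.

    ```sql
    -- Hypothetical wide source: one row per customer, one column per phone type.
    INSERT INTO dbo.Phone (CustomerID, PhoneType, PhoneNumber)
    SELECT DISTINCT
           u.CustomerID,
           u.PhoneType,      -- name of the source column the value came from
           u.PhoneNumber     -- value taken from that column
    FROM (
        SELECT CustomerID, HomePhone, WorkPhone, MobilePhone
        FROM   dbo.CustomerWide
    ) AS src
    UNPIVOT (
        PhoneNumber FOR PhoneType IN (HomePhone, WorkPhone, MobilePhone)
    ) AS u;
    ```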

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thanks guys! It sounds like doing the staging table is the easiest. This is just an exercise for my internship to help me understand how to use SSIS. I guess I'll do it both ways for practice.

    If I use a staging table, I'm assuming I'd have to end the data flow, add another data flow to the control flow, and link the two. I'd then pull the distinct data out of the staging tables and place it in the real destination tables.

    This is my understanding. Any alternatives? Once I put something into a destination table, can I take it out in the same data flow? This is why I'm thinking I'd need two data flows.

    Thanks!

  • There are almost always alternatives, and knowing the pros and cons of each is really useful. In this case, once the data is in a staging table, I would probably run a T-SQL MERGE to get the data to its destination, unless it's all INSERTs, in which case your suggested method will work fine.
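
    A minimal sketch of such a MERGE, assuming hypothetical tables dbo.Staging_Phone and dbo.Phone, and assuming (CustomerID, PhoneType) identifies a row in the destination:

    ```sql
    -- Hypothetical names throughout. Assumes the staging data holds at most
    -- one distinct PhoneNumber per (CustomerID, PhoneType); otherwise the
    -- MERGE can fail or insert conflicting rows.
    MERGE dbo.Phone AS tgt
    USING (
        SELECT DISTINCT CustomerID, PhoneType, PhoneNumber
        FROM   dbo.Staging_Phone
    ) AS src
       ON  tgt.CustomerID = src.CustomerID
       AND tgt.PhoneType  = src.PhoneType
    WHEN MATCHED AND tgt.PhoneNumber <> src.PhoneNumber THEN
        -- Existing row whose value changed: update it.
        UPDATE SET tgt.PhoneNumber = src.PhoneNumber
    WHEN NOT MATCHED BY TARGET THEN
        -- Row not yet in the destination: insert it.
        INSERT (CustomerID, PhoneType, PhoneNumber)
        VALUES (src.CustomerID, src.PhoneType, src.PhoneNumber);
    ```

    If every staged row is new, the WHEN MATCHED branch never fires and a plain INSERT ... SELECT DISTINCT does the same job.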


  • Thanks very much!
