Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Exotic Joins in T-SQL and How They Translate in SSIS Expand / Collapse
Author
Message
Posted Saturday, February 02, 2008 7:47 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Monday, March 12, 2012 8:06 AM
Points: 814, Visits: 100
Comments posted to this topic are about the item Exotic Joins in T-SQL and How They Translate in SSIS
Post #450900
Posted Monday, February 04, 2008 12:32 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, April 16, 2009 1:35 AM
Points: 6, Visits: 19
This is not a real join since only the first row from the lookup table is returned:-(
I have had the same problem and my experience is that there is not a general way to do joins in SSIS but your solution is very dependant on the problem at hand. In worst case you have to make a script task.
Post #451026
Posted Monday, February 04, 2008 3:11 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Saturday, September 27, 2008 4:21 PM
Points: 50, Visits: 35
It's only one step to realize that better solution will be move all joins to database with staging area :)
Post #451062
Posted Monday, February 04, 2008 6:27 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188
Interesting. You say that this improved 10 minutes from using a Merge Join - that doesn't surprise me. I'd be more interested in knowing if there is a performance improvement/degredation compared to simply running it as a SQL statement.

Can I also ask why you chose to implement as a data flow given that you already had it working in T-SQL?

-Jamie


Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #451120
Posted Monday, February 04, 2008 7:27 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Monday, March 12, 2012 8:06 AM
Points: 814, Visits: 100
Hi Jamie, thanks for your interest on this topic.

To answer the second part of your query first, the decision to implement as a dataflow rather than use the T-SQL in a stored procedure or in a SQL task within SSIS is mainly legacy based. Although the ability to change the way the package works is far simpler in SSIS than having to trudge through the SQL. This will also make it easier for future developers to follow on and improve on existing flows within the package.

On improvement/degradation,
In my experience and that is not the same as everyone else, the improvement on the SQL was markedly greater, although when developing our initial solution it was done mainly on a workstation. I will look into getting the server stats for you with regard to the differences.

Nick.
Post #451143
Posted Monday, February 04, 2008 7:29 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, February 04, 2008 7:27 AM
Points: 3, Visits: 9
Interesting article. I ran into something similar recently, but in our environment, the overhead of the individual select statements was too much for our production environment. I ended up implementing a script task (and started thinking about breaking it out into a new data flow component).
Post #451144
Posted Monday, February 04, 2008 7:45 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188
Nick Maroudas (2/4/2008)
Hi Jamie, thanks for your interest on this topic.

To answer the second part of your query first, the decision to implement as a dataflow rather than use the T-SQL in a stored procedure or in a SQL task within SSIS is mainly legacy based. Although the ability to change the way the package works is far simpler in SSIS than having to trudge through the SQL. This will also make it easier for future developers to follow on and improve on existing flows within the package.

Very interesting justifications. I wrote an article once comparing these 2 options which you can see here: http://blogs.conchango.com/jamiethomson/archive/2006/03/14/SSIS_3A00_-Data-flows-or-T_2D00_SQL.aspx
I'd be really grateful if you could add your comments into the comments section there.


On improvement/degradation,
In my experience and that is not the same as everyone else, the improvement on the SQL was markedly greater, although when developing our initial solution it was done mainly on a workstation. I will look into getting the server stats for you with regard to the differences.

Nick.

I would expect T-SQL to be significantly faster. You're using a row-by-row lookup which will be very slow in comparison.

-Jamie


Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #451148
Posted Monday, February 04, 2008 8:51 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Saturday, September 27, 2008 4:21 PM
Points: 50, Visits: 35

To answer the second part of your query first, the decision to implement as a dataflow rather than use the T-SQL in a stored procedure or in a SQL task within SSIS is mainly legacy based. Although the ability to change the way the package works is far simpler in SSIS than having to trudge through the SQL. This will also make it easier for future developers to follow on and improve on existing flows within the package.



Maybe change SSIS task is simple and fast but you don’t forget about software development and ETL other tracks.
- Find impact changes
- Forward changes to test environment
- Multi user development
- Test changes (fe. blackbox test input – output)
- Debug and error handling
- Versioning
- Deployments task
- Logic for concurrent loading/ incidental loading…
… etc.
Maybe it’s only my opinion but TSQL rules !

Post #451187
Posted Monday, February 04, 2008 8:59 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188
djedgar (2/4/2008)

To answer the second part of your query first, the decision to implement as a dataflow rather than use the T-SQL in a stored procedure or in a SQL task within SSIS is mainly legacy based. Although the ability to change the way the package works is far simpler in SSIS than having to trudge through the SQL. This will also make it easier for future developers to follow on and improve on existing flows within the package.



Maybe change SSIS task is simple and fast but you don’t forget about software development and ETL other tracks.
- Find impact changes
- Forward changes to test environment
- Multi user development
- Test changes (fe. blackbox test input – output)
- Debug and error handling
- Versioning
- Deployments task
- Logic for concurrent loading/ incidental loading…
… etc.
Maybe it’s only my opinion but TSQL rules !



I don't think its only your opinion :)

But I can't help but take issue with some of the things that you say there:
-Find impact changes SSIS has this. Limited, but it has it.
-Forward changes to test environment If this means its possible to promote code to different environments then SSIS has this too.
-Multi user development SSIS has this
-Debug and error handling SSIS has this
-Versioning SSIS has this (versioning of packages). And unless there's some sort of new feature that I don't know about there is no versioning of stored procs within the database engine
-Logic for concurrent loading/ incidental loading Not quite sure what this means but with SSIS its possible to load multiple tables concurrently.

And that's just my opinion :)

-Jamie


Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #451193
Posted Monday, February 04, 2008 2:22 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Saturday, September 27, 2008 4:21 PM
Points: 50, Visits: 35
I didn’t write that it’s missing and I don’t see fault in SSIS only
Let’s evaluate

-Forward changes to test environment If this means its possible to promote code to different
environments then SSIS has this too.

Promote code to different environments should freeze projects but when you want
test package you must write\reconfigure connection – potential errors !

-Multi user development SSIS has this

It’s for me freezing code for some task . We need to implement a little complex data flow task
we have two programmers how we could divide jobs for them ?

-Debug and error handling SSIS has this

Yes but I prefer try catch statements with nested functionality

-Versioning SSIS has this (versioning of packages). And unless there's some sort of new feature
that I don't know about there is no versioning of stored procs within the database engine

But if you can use not commercial concurrent versioning system it’s maybe problem delegate
code because xml include some binary data.

-Logic for concurrent loading/ incidental loading Not quite sure what this means but with SSIS
its possible to load multiple tables concurrently.

In real life ETL is very complex and has very dependent task. When something
is going wrong it’s needed quick find the errors and reprocess flow.
In some situations the best choice will be ignore error data, how can admin do it without modification
packages ? Why to the @#!#!! when you run package in agent job you can only see it as one atomic job. I want to say that SSIS package without more flexibly control loading tool (based on metadata) it’s condemned only on run in dubug mode ;)
Post #451352
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse