Talend vs. SSIS: A Simple Performance Comparison
Posted Tuesday, October 22, 2013 6:35 AM
Grasshopper

Group: General Forum Members
Last Login: Monday, November 25, 2013 11:59 AM
Points: 10, Visits: 72
We use both SSIS and Talend. Typically we use SSIS when writing to SQL Server and Talend for everything else. For your performance measurement, you might want to create stand-alone jobs (run from the command line) and compare those. Talend jobs run significantly faster on our server after being exported and scheduled to run in batch than they do inside the interactive development/debugging environment.

Thanks for the article.
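
A rough way to time stand-alone runs from the command line, sketched in Python. This is only an illustration of the idea, not the harness used in the article: the function name is made up, and the command you pass in (an exported Talend job launcher, a dtexec call, etc.) is whatever your own environment provides.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run a command several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the job exits with a non-zero code
        subprocess.run(argv, check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command: starts a Python interpreter that does nothing.
    # Replace it with the launcher for your own exported job or package.
    results = time_command([sys.executable, "-c", "pass"], runs=3)
    print("fastest run: %.3f s" % min(results))
```

The argument list could just as well be an exported job's launcher script or something like `dtexec /F package.dtsx`; those paths are placeholders. Repeating the run a few times and looking at the fastest or median timing helps smooth out caching effects.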
Post #1507093
Posted Tuesday, October 22, 2013 6:50 AM
SSC Rookie

Group: General Forum Members
Last Login: Friday, November 21, 2014 9:52 AM
Points: 49, Visits: 213
We have just started using Talend for some Oracle data quality projects and I can't even get it to run. It definitely requires some training.
Post #1507102
Posted Tuesday, October 22, 2013 7:16 AM
Forum Newbie

Group: General Forum Members
Last Login: Monday, September 15, 2014 7:07 AM
Points: 2, Visits: 26
If you mean the DQ offering itself, then that is a different beast altogether (and, when I last looked at it, not overly impressive). Anybody with basic ETL skills will pick up OpenStudio more or less immediately ...


Post #1507119
Posted Tuesday, October 22, 2013 7:29 AM
Grasshopper

Group: General Forum Members
Last Login: Friday, December 13, 2013 6:06 AM
Points: 10, Visits: 8
Just wondering why the Talend job uses a tFileOutputMSDelimited (multi-schema) component instead of a simple tFileOutputDelimited (single-schema). It most probably wouldn't change much, but it would be more transparent.
Post #1507128
Posted Tuesday, October 22, 2013 7:31 AM
Grasshopper

Group: General Forum Members
Last Login: Friday, May 16, 2014 3:09 AM
Points: 18, Visits: 92
My primary ETL background is SSIS, and DTS before that, so I'm a big fan of using it to shunt complex data sets around, including the dreaded VB.NET transformations.

I've had cases when trying to get data from DB2 and other non-MS sources in which I just couldn't get SSIS to work, so, having played with Talend, I decided to try it in anger.

It opened the DB2 system straight up, was able to reference tables and data and when told to fire them into a SQL system (to make analysis easier & faster for me), offered to create SQL compliant tables on the fly!

It's a clunky interface, but once adjusted, it's very powerful. I've recommended it for other non-MS scenarios as a powerful tool. The range of native connectors really does it for me, something that for once SSIS needs to catch up with.

SSIS really needs native DB2 and SAP application connectors.



Post #1507131
Posted Tuesday, October 22, 2013 7:33 AM
Grasshopper

Group: General Forum Members
Last Login: Friday, May 16, 2014 3:09 AM
Points: 18, Visits: 92
Re the server specs, they are referenced as...

RAM: 76GB
OS: Windows Server 2008 R2 - 64 bit

Even without the O/S specified, anything much above 3GB points to 64-bit; with 76GB of RAM it can be nothing but.
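
The arithmetic behind that can be sketched in a few lines of Python. This is a simplified illustration, not a statement about any particular Windows edition: it assumes a flat address space of 2^bits bytes and ignores the address range reserved for devices, which is why 32-bit systems often show only around 3GB as usable. The function names are made up for the example.

```python
GIB = 2 ** 30  # bytes in one GiB


def addressable_gib(address_bits: int) -> float:
    """Size of a flat address space with the given number of address bits, in GiB."""
    return 2 ** address_bits / GIB


def fits(ram_gib: float, address_bits: int) -> bool:
    """True if ram_gib of memory can be addressed with address_bits bits."""
    return ram_gib <= addressable_gib(address_bits)


if __name__ == "__main__":
    print(addressable_gib(32))  # size of a 32-bit address space in GiB
    print(fits(76, 32))         # can 76GB be addressed with 32 bits?
    print(fits(76, 64))         # and with 64 bits?
```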



Post #1507135
Posted Tuesday, October 22, 2013 8:54 AM
Forum Newbie

Group: General Forum Members
Last Login: Wednesday, November 6, 2013 10:15 AM
Points: 1, Visits: 44
To be fair, it would have been nice if he had had someone as well versed in Talend as he is in SSIS to make it a good competitive test. As simple as his test may have been, he would not know all the right buttons to push. A cross-platform test would be more telling, as would a comparison of the time and ease of setting up a test.
Post #1507200
Posted Tuesday, October 22, 2013 9:49 AM


Grasshopper

Group: General Forum Members
Last Login: Monday, November 10, 2014 7:54 AM
Points: 24, Visits: 222
Great test and article!

I did similar tests 2 or 3 years ago and was looking at SSIS, Talend and Pentaho. A lot of things have changed since then, but it looks like the same thing is still true for Talend: you need to mess with the JVM and fight out-of-memory issues. This is just not good for a professional ETL tool.

I was amazed, though, by the different possibilities and tons of features, but these errors turned me away. Pentaho was great, though, and while they also use the JVM and Eclipse, I did not have to mess with the JVM.

If you read this article because you are looking for a good ETL tool, check Gartner's Magic Quadrant report. They have some good points there about SSIS, Pentaho and Talend.

If I had to choose between these three, I would probably pick SSIS if I had to work with SQL Server and Pentaho for everything else.

SSIS 2012 is a huge improvement over 2008 R2 but still has a long way to go.

As for the test, it would be nice to see some typical ETL operations as well. Try to sort a 20MM-row file in Talend and then group by a few fields. This is when the nightmares begin :)
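
To show why sort-then-group is the memory-hungry case, here is a minimal Python sketch of the usual workaround: sort in chunks that fit in memory, spill each chunk to disk, then merge and aggregate in a single streaming pass. It is not how Talend, SSIS or Pentaho implement their sort components; the function names, the tab-separated temp-file layout and the (key, integer value) row shape are all assumptions made for the example, and keys must not contain tabs or newlines.

```python
import heapq
import itertools
import os
import tempfile


def sorted_chunks(rows, chunk_size):
    """Sort rows in memory-sized chunks and spill each sorted chunk to a temp file."""
    paths = []
    iterator = iter(rows)
    while True:
        chunk = sorted(itertools.islice(iterator, chunk_size))
        if not chunk:
            break
        fd, path = tempfile.mkstemp(suffix=".chunk")
        with os.fdopen(fd, "w") as handle:
            handle.writelines(f"{key}\t{value}\n" for key, value in chunk)
        paths.append(path)
    return paths


def read_chunk(path):
    """Stream (key, value) rows back from one sorted chunk file."""
    with open(path) as handle:
        for line in handle:
            key, value = line.rstrip("\n").split("\t")
            yield key, int(value)


def group_totals(rows, chunk_size=100_000):
    """Yield (key, sum of values) in key order, holding only one chunk in memory."""
    paths = sorted_chunks(rows, chunk_size)
    try:
        # heapq.merge lazily combines the already-sorted chunk streams
        merged = heapq.merge(*(read_chunk(path) for path in paths))
        for key, group in itertools.groupby(merged, key=lambda row: row[0]):
            yield key, sum(value for _, value in group)
    finally:
        for path in paths:
            os.remove(path)


if __name__ == "__main__":
    sample = [("b", 1), ("a", 2), ("b", 3), ("a", 4), ("c", 5)]
    for key, total in group_totals(sample, chunk_size=2):
        print(key, total)
```

A tool that tries to hold the whole 20MM-row sort in the JVM heap instead of spilling like this is exactly where the out-of-memory errors tend to come from.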
Post #1507226
Posted Tuesday, October 22, 2013 10:55 AM


Valued Member

Group: General Forum Members
Last Login: Thursday, October 24, 2013 12:12 PM
Points: 72, Visits: 326
I can understand developers picking up SSIS because it ships with SQL Server and is therefore considered 'free'. It may suffice if you have no comparison.
What I do not understand is why anyone would compare it favorably to Pentaho. PDI ships with at least twice as many features 'out-of-the-box', extremely easy-to-use debugging, logging (server / db) for each and every purpose, a great and stable GUI, a limited set of internal datatypes, etc.

I built the same pilot with both SSIS and Pentaho, and SSIS literally took 3 times as much time. The interface feels like wading through mud, is excruciatingly messy with datatypes, and is even unstable at times. The debugging features are laughable.

My background is INFA's PowerCenter (PC). Though Pentaho is not as complete as PC, for small to medium projects I rate it over PC as well.
Post #1507244
Posted Tuesday, October 22, 2013 11:36 AM


Grasshopper

Group: General Forum Members
Last Login: Monday, November 10, 2014 7:54 AM
Points: 24, Visits: 222
Keep in mind that many people who started working with SSIS have very little or no experience with dedicated ETL tools. In fact, a huge number of companies still code ETL routines manually, and for them going from T-SQL to SSIS (running T-SQL) is a more natural step than moving away entirely from their favorite SQL coding to something like Pentaho.

Now, if you have used any decent ETL tools, you will definitely miss many features in SSIS, but it is still a good tool and it comes free with SQL Server, which many people forget.

Also, the good (and at the same time bad) part about SSIS is that if you cannot do something, you can script it using VB or C#. I know it is ugly, but many people end up doing that.
Post #1507265