Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

The New ETL Paradigm Expand / Collapse
Author
Message
Posted Monday, February 7, 2005 4:39 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/jthomson/thenewetlparadigm.asp

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #160171
Posted Tuesday, February 15, 2005 4:24 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 1:29 AM
Points: 117, Visits: 155

Hi Jamie, thanks for the article. I am yet to try out 2005, but am particularly interested in SSIS. I don't have a broad range of ETL experience, but the ETL work I do largely consists of DTS packages that call Stored Procedures. In essence, the DTS is only really there to help control workflow.

In your article, you discuss the 'E' and the 'L', but what about the 'T'? From what you have seen of SSIS, how well (or otherwise) does it handle transformations? Will we still be using Stored Procedures to do the 'real' work - such as deciding when to update a dimension record instead of inserting a new one. In DTS, we can use lookups in a Data Driven Query Task to achieve this, but the row-by-row processing has put us off due to the performance overhead incurred when dealing with very large dimension/fact loads. Hence the reliance on Stored Procedures and the set-based processing that allows.

I would LOVE to be able to perform the complete end-to-end ETL processing in a standardised environment (like I am told a 'real' ETL tool does - no I haven't used any others!), and am really hoping that SSIS is the answer to this.

Thanks again,

Matt.




Post #161673
Posted Tuesday, February 15, 2005 6:07 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Saturday, May 3, 2014 1:28 PM
Points: 160, Visits: 246

Thanks for the article - can you advise if there is something in SQL2005 to convert exsting DTS packages? If so, how accurate does it appear to be (if you have tested it)?

 




Post #161689
Posted Tuesday, February 15, 2005 12:44 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

Matt,

Thanks for your reply.

Your approach to using DTS is the one that alot of people take, certainly I know I do. And my company do steadfastly.

You are right that I didn't cover the 'T' and I only really touched on 'E' & 'L', that will come later. In the meantime you may want to check out http://www.sqlservercentral.com/columnists/jthomson/transactionsinsqlserver2005integrationservices.asp or http://www.sqlis.com which has many articles already or my SSIS blog at http://blogs.conchango.com/jamiethomson which I put useful little ditties on now and again.

You still have the choice of using stored procs to do your work and indeed there is an ExecuteSQL Task in SSIS to enable you to do this just as there was in DTS. However, the rich transformation functionality in SSIS is a complete departure from the limited functionality in DTS. To give you a flavour of what transformations you have there are:

Sort - Sort data in the pipeline
Aggregate - Similar to T-SQL aggregate functions
Lookup - Similar behaviour to DTS Data Driven Queries but far far more performant
Conditional split - Split a data set into numerous smaller data sets depending on attributes of the data. Enables filtering.
Derived Column - Using the new SSIS expression language derive new values from existing ones.
Merge and Merge Join - Similar to T-SQL joins

This list basically reflects "standard" functionality is by no means exhaustive. There are many many more transformation components available for you to use in a data flow including things that have never een available before such as fuzzy lookups and fuzzy matchnig (i.e. Match inexact values based on a scoring mechanism - invaluable for deduping data)

I hope this is useful to you. Check out the above references and look out from more material from myself in the future covering some of these concepts in more detail.

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #161849
Posted Tuesday, February 15, 2005 12:47 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

Tim,

There is a migration wizard for existing packages. I have not used it myself, largely because the mapping between DTS functionality and SSIS functionality is so obscure I am fearful about what the migration wizard will actually produce. I am convinced (and other people's experiences back this up) that it would be quicker to build the packages from scratch rather than migrate existing ones and I would also expect that the result of doing it yourself would be alot closer to an optimum solution than the migration wizard could produce.

As I say though I haven't used it so on this one you're just going to have to trust me

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #161851
Posted Friday, September 23, 2005 12:08 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, February 28, 2006 4:44 PM
Points: 3, Visits: 1

Dear Jamie,
I just wanted to get some feed back from you about performance in Integration Services.  I've been using it for two months and I have created a package that should pump 2.5 million records from my source to my destination.  Now I will say that I have several steps of sorting (Using the sort component) and merge join (using the merge join component).  But my performance is horrible.  I used the methodology outlined in your Article: " New ETL Paradigm"  I saw this component from ivolva.com  which they created as a third party component called extra sort.  It says it speeds up sorting.  Do you have ANY suggestions??? 

Thank you,

Mag

Post #223212
Posted Friday, November 3, 2006 7:32 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, November 3, 2006 7:22 AM
Points: 3, Visits: 1

The article was good and it's clear SSIS addresses some glaring deficiencies in DTS, but the title "A New ETL paradigm" is a terrible misnomer.  It's a new SQL Server ETL tool, that's all. 

On it's own, DTS is not an ETL application.  It's good for moving data, but transforming and loading should be left to something else.

 

Post #320281
Posted Friday, November 3, 2006 7:45 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

In case anyone is interested, Mag and I sorted this one out offline and she was good enough to write a short case study which is well worth a read:

A case study of using LOOKUPs instead of MERGE JOINs
http://blogs.conchango.com/jamiethomson/archive/2005/10/21/2289.aspx

 

-Jamie

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #320292
Posted Friday, November 3, 2006 7:49 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, November 10, 2013 11:52 AM
Points: 877, Visits: 188

James,

Fair comment I guess. The title was meant to be fairly colloquial (I think - Its almost two years since I wrote it and was surprised to see it republished).

-Jamie

 



Jamie Thomson
http://sqlblog.com/blogs/jamie_thomson
Post #320294
Posted Friday, November 3, 2006 9:07 AM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, January 14, 2013 4:24 AM
Points: 81, Visits: 188
Thanks for the article, Jamie.
It was very useful.
Vincent
Post #320342
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse