Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

SSIS Programming

By U.K Padmaja,

Overview

Most of us have been using Data Transformation Services (DTS) in the previous versions of SQL Server in order to manipulate or move data. With the introduction of SQL Server 2005, Microsoft has introduced a completely re-written version of DTS with a lot more new features which is now known as Integration Services. This new ETL platform is nothing but a set of utilities, applications, designers, components, and services all wrapped up into one powerful software application suite.

The most attracting feature of SSIS is that the data movement and transformation is separate from the package control flow and management. There are two different engines that handle these tasks. Integration Services Data Flow engine takes care of the data movement and transformation whereas the Integration Services Run-time engine manages the package control flow.

The motivation behind this article is that the SSIS API is not well documented in MSDN though you get an overall idea of what is going on. This article assumes that the reader is aware of the fundamental components of an SSIS Package. To know more about what SSIS is all about, check the MSDN Books Online http://msdn2.microsoft.com/en-us/library/ms141026.aspx

DTS vs. SSIS Programming

There are two fundamental approaches one can have for SSIS programming. One is to use the SSIS Designer to create and run packages. The other approach is to use the API to create, configure and run packages from own applications.

Those who are familiar with the DTS programming can see that there is very little code from DTS remaining in Integration Services. Some of the similarities are listed here.

DTS

SSIS

Control flow and Data flow on the same design surface

Control flow and Data flow have been separated

Datapump (supported only one source, transform, and destination)

Data Flow Task or Pipeline (can perform extremely complex and advanced data processing within one pipeline)

Step

Taskhost (An expanded version of Step) which schedules the task execution

Precedence Constraints

Still remains with added features

Connections

Connection Managers

Getting Started

Let us get started with some actual coding in VB.NET, my favorite!

The following table lists the assemblies that are frequently used when programming Integration Services using the .NET Framework (http://msdn2.microsoft.com/en-us/library/ms403344.aspx).

Assembly

Description

Microsoft.SqlServer.ManagedDTS.dll

Contains the managed run-time engine.

Microsoft.SqlServer.RuntimeWrapper.dll

Contains the primary interop assembly (PIA), or wrapper, for the native run-time engine.

Microsoft.SqlServer.PipelineHost.dll

Contains the managed data flow engine.

Microsoft.SqlServer.PipelineWrapper.dll

Contains the primary interop assembly (PIA), or wrapper, for the native data flow engine.

Consider the example of transferring data from a flat file to a database table in SQL Server 2005. Create the destination table in SQL Server prior to running the package. This can be done using a simple T-SQL statement.

The programmatic approach would include the following steps:

  1. Create an SSIS Package

http://msdn2.microsoft.com/en-us/library/ms135946.aspx

  1. Create Control flow, which would be as simple as adding a new Data Flow task in our case.

http://msdn2.microsoft.com/en-us/library/ms135997.aspx

  1. Add two connection managers. We need a Flat file connection manager for the source and an OleDb or SQL Server connection manager for the destination.

Creating OleDb connection manager is not hard.

Dim oleDbConn As ConnectionManager = myPackage.Connections.Add("OLEDB")
oleDbConn.Name = "MyOLEDBConnection"
oleDbConn.ConnectionString = "Provider=SQLOLEDB.1; Integrated Security=SSPI;"
& "Initial Catalog=<db>; Data Source=" & SS_INSTANCE & ";"

The following code shows creating a flat file connection manager and setting up the formatting information for the same.

cnFile = myPackage.Connections.Add("FLATFILE")
cnFile.Properties("Name").SetValue(cnFile, "MyFlatFileConnection")
cnFile.Properties("ConnectionString").SetValue(cnFile, <srcFilePath>)
cnFile.Properties("Format").SetValue(cnFile, "Delimited")
cnFile.Properties("Unicode").SetValue(cnFile,False)

cnFile.Properties("ColumnNamesInFirstDataRow").SetValue(cnFile, True)
cnFile.Properties("DataRowsToSkip").SetValue(cnFile,0)
cnFile.Properties("RowDelimiter").SetValue(cnFile,vbCrLf)
cnFile.Properties("TextQualifier").SetValue(cnFile,"""")
cnFile.Properties("HeaderRowsToSkip").SetValue(cnFile, 0)
cnFile.Properties("HeaderRowDelimiter").SetValue(cnFile,vbCrLf)
cnFile.Properties("CodePage").SetValue(cnFile,1252)

Dim MyFlatFilecn As SSISRuntime.IDTSConnectionManagerFlatFile90 = Nothing

MyFlatFilecn = TryCast(cnFile.InnerObject,SSISRuntime.IDTSConnectionManagerFlatFile90)
 DtsConvert.ToConnectionManager90(cnFile)

 Dim name As SSISRuntime.IDTSName90
 Dim col As SSISRuntime.IDTSConnectionManagerFlatFileColumn90
 cnFile.AcquireConnection(Nothing)

 ' Add columns to the FlatFileConnectionManager

 Dim headers As String()
      'Get the header row from flat file. This can be stored in a database table as well.
      'Assuming that "headers" Array contains the headers
 Dim i As Integer = 0
 While i < headers.Length
 col = MyFlatFilecn.Columns.Add()
 If i = headers.Length - 1 Then
 col.ColumnDelimiter = vbCrLf
 Else
 col.ColumnDelimiter = ","
 End If
 col.ColumnType = "Delimited"
 name = TryCast(col, SSISRuntime.IDTSName90)
 name.Name = headers(i)

 'get the converSion information from the destination table
          'this should assign the appropriate datatype, precision, scale and length to col
 getConversionInfo(col, name.Name) 'implement this
 i = i + 1
 End While
 cnFile.ReleaseConnection(Nothing)
  1. Create Data flow which does the actual job of copying the data from flat file to table. This would involve creating a Flat file source and an OleDb destination. This is fairly standard. We give it for the sake of completeness.

Note: One can add data transformation components here if required.

      DFSource = DTP.ComponentMetaDataCollection.New()
 DFSource.ComponentClassID = "DTSAdapter.FlatFileSource"
 DFSource.Name = "FlatFileSource"

 ' Get the design time instance of the component.
 SourceInst = DFSource.Instantiate()
 ' Initialize the component.
 SourceInst.ProvideComponentProperties()
 ' Specify the connection manager.
 If DFSource.RuntimeConnectionCollection.Count > 0 Then
         DFSource.RuntimeConnectionCollection(0).ConnectionManagerID = 
		   myPackage.Connections("MyFlatFileConnection").ID
         DFSource.RuntimeConnectionCollection(0).ConnectionManager = 
           DtsConvert.ToConnectionManager90(myPackage.Connections("MyFlatFileConnection"))
 End If
 ' Reinitialize the metadata.
 SourceInst.AcquireConnections(Nothing)
 SourceInst.ReinitializeMetaData()
      Dim exOutColumn As IDTSExternalMetadataColumn90
 For Each outColumn As IDTSOutputColumn90 In DFSource.OutputCollection(0).OutputColumnCollection
 exOutColumn = DFSource.OutputCollection(0).ExternalMetadataColumnCollection(outColumn.Name)
 SourceInst.MapOutputColumn(DFSource.OutputCollection(0).ID, outColumn.ID, exOutColumn.ID, True)
 Next
SourceInst.ReleaseConnections()

Similarly add a destination.

  1. Connect Data flow components

http://msdn2.microsoft.com/en-us/library/ms136086.aspx

 

  1. Map source and destination columns
 Dim input As IDTSInput90 = DFDestination.InputCollection(0)
 Dim destinationInputID As Integer = CInt(input.ID)
 Dim vInput As IDTSVirtualInput90 = input.GetVirtualInput()

 For Each vColumn As IDTSVirtualInputColumn90 In vInput.VirtualInputColumnCollection
 ' This will create an input column on the component.
 DestInst.SetUsageType(destinationInputID, vInput, vColumn.LineageID, DTSUsageType.UT_READONLY)
 ' Get input column.
 Dim inputColumn As IDTSInputColumn90 = input.InputColumnCollection.GetInputColumnByLineageID(vColumn.LineageID)
 ' Getting the corresponding external column.
 ' Ex : We will use the column name as the basis for matching data flow columns to external columns.

 Dim externalColumn As IDTSExternalMetadataColumn90 = input.ExternalMetadataColumnCollection(vColumn.Name)
 ' Tell the component how to map.
 DestInst.MapInputColumn(destinationInputID, inputColumn.ID, externalColumn.ID)
Next
  1. Validate and execute the package
  Dim pkgStatus As DTSExecResult = myPackage.Validate(Nothing, Nothing, Nothing, Nothing)

  If pkgStatus = DTSExecResult.Success Then
     Dim pkgResult As DTSExecResult = myPackage.Execute()
  End If

Notes

1: For debugging purposes it is a good idea to save the SSIS package created programmatically which can then be exported to the Designer. This would allow us to check whether the properties of the connection managers etc are set properly. See the link for different options.

http://msdn2.microsoft.com/en-us/library/ms403347.aspx

2: To get a list of data types that SSIS uses, check

http://msdn2.microsoft.com/en-us/library/ms141036.aspx

3: In case a subset of columns from the flat file needs to be transferred to the destination table, then one can identify such columns in Step c, and then delete them from Output Collection of the source in Step d before creating the external columns

Download the code

Conclusion

SQL Server Integration Services, the new ETL platform in SQL Server 2005, is the successor to DTS, which was there in previous versions of SQL Server. SSIS with all its new features is a huge topic in itself and we tried to look at some of the programming aspects of it. We have looked at how we can programmatically create a package to transfer data from a Flat file source to a database table. Interested readers can look at the MSDN for more information on SSIS. A couple of books that I have come across are also worth reading

  1. Microsoft SQL Server 2005 Integration Services by Kirk Haselden
  2. Professional SQL Server 2005 Integration Services by Brian Knight et al
  3. Microsoft SQL Server 2005: A Developers Guide by Michael Otey et al.
Total article views: 16627 | Views in the last 30 days: 32
 
Related Articles
FORUM

Using computed Columns to Create another Computed Column

Using computed Columns to Create another Computed Column

FORUM

constraints on columns while creating the table

constraints on columns while creating the table

FORUM

Creating Dynamic Columns in SSRS 2008

Creating Dynamic Columns in SSRS 2008

FORUM

Create Function to Group Column and display as single column

Create Function to Group Column and display as single colum

FORUM

Create a trigger on a column

need help creating a trigger on a column

 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones