SQLServerCentral Article

Error Handling in SSIS

,

Anyone who has used SSIS error handling has been frustrated that it returns an Error Code with no description. I decided that I needed a solution. Googling the problem led me to this blog by Jamie Thomson which had a link to a Script Component that gets the Error Description. Excellent! While this was I good start, I also wanted to see the data that caused the error. I started by including the data and outputting to a text file, but, being a database pro, this did not sit well with me. I started looking for a solution that would allow me to store the data in a table and be useable in each dataflow task, and, once again, the internet provided a solution, this article by jwelch on agilebi.com which I highly recommend reading.

My solution is uses the Input0_ProcessInputRow method and reflection, as shown by jwelch’s article, to loop through the columns, building an XML string (name-value pairs) which I insert into a table. I can then query the table to see the errors and the data. I can then verify that I fixed the errors from earlier loads.

Here is the solution:

First, I built this table in my SQL Server DB:

CREATE TABLE [data_xfer].[load_errors]
(
            [load_error_id] [int] IDENTITY(1,1) NOT NULL,
            [package_run_date] [smalldatetime] NULL CONSTRAINT [DF_load_errors_error_date]  DEFAULT (getdate()),
            [error_task] [varchar](100) NULL,
            [error_step] [varchar](100) NULL,
            [error_code] [int] NULL,
            [error_desc] [varchar](100) NULL,
            [error_details] [xml] NULL,
 CONSTRAINT [PK_load_errors] PRIMARY KEY NONCLUSTERED
(
            [load_error_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

Then, whenever configuring error output, add a script component following these steps:

1. On the Input screen add the columns that could be the source of the error

Properties

2. On the Inputs and Outputs screen add the following columns to the Output:

Properties

a. ErrorStep string(100) – this is going to be the Name of the Script Component which I name using the step which is the source of the error

b. ErrorDataDetails Unicode String(4000) – this is going to be an XML string that will have the data from the Input Columns and will be inserted into an XML Type column in SQL Server

c. ErrorDesc string(100) – The error description gotten from the ErrorCode

d. ErrorTask string(100) – The SSIS Task that the Step is in. In my case it is a DataFlow Task. This is retrieved using the TaskName package variable that is created by SSIS.

e. PackageTime Date – The start time of the package. This is retrieved using the StartTime package variable.

3. On the Script Screen add the following to the ReadOnlyVariables:

Properties

a. TaskName – a default SSIS variable that has the task name

b. StartTime – a default SSIS variable that is the start time of the package

This completes the setup so we can now Design the script in Visual Studio for Applications. Here is the script I am using:

ImportsSystem
ImportsSystem.Data
ImportsSystem.Math
ImportsMicrosoft.SqlServer.Dts.Pipeline.Wrapper
ImportsMicrosoft.SqlServer.Dts.Runtime.Wrapper
' added this import
ImportsSystem.Reflection
PublicClass ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Dim column As IDTSInputColumn90
        Dim rowType As Type = Row.GetType()
        Dim columnValue As PropertyInfo
        Dim strErrorDetails As String = "<fields><field "
        Dim strAttributeName As String
        Dim strValue AsString
        ' Get the error description
        Row.ErrorDesc = ComponentMetaData.GetErrorDescription(Row.ErrorCode).Trim()
        ' This actually gets the script component name
        Row.ErrorSource = ComponentMetaData.Name.Trim()
        ' get the task name and the package start time
        Row.ErrorTask = Variables.TaskName.Trim()
        Row.PackageTime = Variables.StartTime
        Try
            ' loop through the columns in the input to create a generic xml doc
            For Each column In ComponentMetaData.InputCollection(0).InputColumnCollection
               ' remove illegal xml characters in the name
               If column.Name.IndexOf(".") > -1 Then
                    strAttributeName = column.Name.Replace(".", "")
               Else
                    strAttributeName = column.Name
               End If
               If strAttributeName.IndexOf("_") > -1 Then
                    strAttributeName = strAttributeName.Replace("_", "")
               End If
               If strAttributeName.IndexOf(" ") > -1 Then
                    strAttributeName = strAttributeName.Replace(" ", "")
               End If
            ' Skip the Error Columns
IfNot (strAttributeName.Contains("ErrorCode") Or strAttributeName.Contains("ErrorColumn")) Then
                   ' get the value for the column
                    columnValue = rowType.GetProperty(strAttributeName)
                   Dim objColumn As Object = columnValue.GetValue(Row, Nothing)
                   ' if the value is null set to empty string
                   If IsNothing(objColumn) Then
                        strValue = String.Empty
                   Else
                        strValue = objColumn.ToString()
                   End If
                   ' append the node to the xml string
                    strErrorDetails = strErrorDetails & "name=" & _
ControlChars.Quote column.Name.Trim & ControlChars.Quote & _
 " value=" & ControlChars.Quote & strValue & ControlChars.Quote & _
 " /><field "
               End If
            Next
            'finish the xml string
strErrorDetails = strErrorDetails.Substring(0, strErrorDetails.Length - 7) & _ 
"</fields>"
        Catch ex As Exception
            ' if an error occurs log it and keep processing instead of killing the load
            strErrorDetails = "<fields><field name=" & ControlChars.Quote & "error" & _
                                    ControlChars.Quote & " value=" & ControlChars.Quote & ex.ToString.Trim & _
" " & strAttributeName.Trim & ControlChars.Quote & "/></fields>"
        End Try
        ' add the xml output to the output row
        Row.ErrorDetails = strErrorDetails
    End Sub
EndClass

One way I have used this data is to verify I did successfully load one of a group of duplicate records. For example, in my source system I may have a duplicate record for Joe Smith. The records would consist of:

Person No.First NameMiddle NameLast NameBirth Date
000001JoeDavidSmith1/17/1967
100001JoeDavidSmith1/17/1967

As part of my process I have created a table with all duplicates and assign one of the person no.’s as the primary person no. for the transfer. When loading the addresses for this person I will likely get at least 1 for each record meaning my second address insert will fail, violation of data integrity rules. I can use my load_errors table to find that yes I did get each address for Joe Smith loaded once.

Here is an example of how I query the error data:

 Select
            L.error_details,
            L.package_run_date,
            L.error_task,
            L.error_step,
            L.error_code,
            L.error_desc,
            P.person.value('@value','char(6)') as error_person_id
 from
            data_xfer.load_errors L Cross Apply
            error_details.nodes('.//field[@name="personno"]') as P(person)

I do need to write specific queries based on the task that created the error as while the XML generated is a generic name – value document the names do vary.

It is a simple solution that logs errors and data. The meat of the code is not original, but found from the resources noted.

Here are some areas where it can be improved:

  1. If there is binary data in the row the XML output has “System.Byte[]” as the value.
  2. Stripping illegal characters from the XML. I am sure this can be handled better.

I welcome any comments or suggestions on how to improve the process.

References:

SSIS Junkie – SSIS Nugget: Get Error Descriptions - http://blogs.conchango.com/jamiethomson/archive/2005/08/08/1969.aspx

BI Thoughts and Theories – Address Columns Generically in a Script Component - http://agilebi.com/cs/blogs/jwelch/archive/2007/10/21/address-columns-generically-in-a-script-task.aspx

Jack Corbett

www.ntm.org/jack_corbett

Rate

4.63 (27)

You rated this post out of 5. Change rating

Share

Share

Rate

4.63 (27)

You rated this post out of 5. Change rating