Anyone who has used SSIS error handling has been frustrated that it returns an Error Code with no description. I decided that I needed a solution. Googling the problem led me to this blog by Jamie Thomson which had a link to a Script Component that gets the Error Description. Excellent! While this was I good start, I also wanted to see the data that caused the error. I started by including the data and outputting to a text file, but, being a database pro, this did not sit well with me. I started looking for a solution that would allow me to store the data in a table and be useable in each dataflow task, and, once again, the internet provided a solution, this article by jwelch on agilebi.com which I highly recommend reading.
My solution is uses the Input0_ProcessInputRow method and reflection, as shown by jwelch’s article, to loop through the columns, building an XML string (name-value pairs) which I insert into a table. I can then query the table to see the errors and the data. I can then verify that I fixed the errors from earlier loads.
Here is the solution:
First, I built this table in my SQL Server DB:
CREATE TABLE [data_xfer].[load_errors]
(
[load_error_id] [int] IDENTITY(1,1) NOT NULL,
[package_run_date] [smalldatetime] NULL CONSTRAINT [DF_load_errors_error_date] DEFAULT (getdate()),
[error_task] [varchar](100) NULL,
[error_step] [varchar](100) NULL,
[error_code] [int] NULL,
[error_desc] [varchar](100) NULL,
[error_details] [xml] NULL,
CONSTRAINT [PK_load_errors] PRIMARY KEY NONCLUSTERED
(
[load_error_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Then, whenever configuring error output, add a script component following these steps:
1. On the Input screen add the columns that could be the source of the error

2. On the Inputs and Outputs screen add the following columns to the Output:

a. ErrorStep string(100) – this is going to be the Name of the Script Component which I name using the step which is the source of the error
b. ErrorDataDetails Unicode String(4000) – this is going to be an XML string that will have the data from the Input Columns and will be inserted into an XML Type column in SQL Server
c. ErrorDesc string(100) – The error description gotten from the ErrorCode
d. ErrorTask string(100) – The SSIS Task that the Step is in. In my case it is a DataFlow Task. This is retrieved using the TaskName package variable that is created by SSIS.
e. PackageTime Date – The start time of the package. This is retrieved using the StartTime package variable.
3. On the Script Screen add the following to the ReadOnlyVariables:

a. TaskName – a default SSIS variable that has the task name
b. StartTime – a default SSIS variable that is the start time of the package
This completes the setup so we can now Design the script in Visual Studio for Applications. Here is the script I am using:
ImportsSystem
ImportsSystem.Data
ImportsSystem.Math
ImportsMicrosoft.SqlServer.Dts.Pipeline.Wrapper
ImportsMicrosoft.SqlServer.Dts.Runtime.Wrapper
' added this import
ImportsSystem.Reflection
PublicClass ScriptMain
Inherits UserComponent
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim column As IDTSInputColumn90
Dim rowType As Type = Row.GetType()
Dim columnValue As PropertyInfo
Dim strErrorDetails As String = "<fields><field "
Dim strAttributeName As String
Dim strValue AsString
' Get the error description
Row.ErrorDesc = ComponentMetaData.GetErrorDescription(Row.ErrorCode).Trim()
' This actually gets the script component name
Row.ErrorSource = ComponentMetaData.Name.Trim()
' get the task name and the package start time
Row.ErrorTask = Variables.TaskName.Trim()
Row.PackageTime = Variables.StartTime
Try
' loop through the columns in the input to create a generic xml doc
For Each column In ComponentMetaData.InputCollection(0).InputColumnCollection
' remove illegal xml characters in the name
If column.Name.IndexOf(".") > -1 Then
strAttributeName = column.Name.Replace(".", "")
Else
strAttributeName = column.Name
End If
If strAttributeName.IndexOf("_") > -1 Then
strAttributeName = strAttributeName.Replace("_", "")
End If
If strAttributeName.IndexOf(" ") > -1 Then
strAttributeName = strAttributeName.Replace(" ", "")
End If
' Skip the Error Columns
IfNot (strAttributeName.Contains("ErrorCode") Or strAttributeName.Contains("ErrorColumn")) Then
' get the value for the column
columnValue = rowType.GetProperty(strAttributeName)
Dim objColumn As Object = columnValue.GetValue(Row, Nothing)
' if the value is null set to empty string
If IsNothing(objColumn) Then
strValue = String.Empty
Else
strValue = objColumn.ToString()
End If
' append the node to the xml string
strErrorDetails = strErrorDetails & "name=" & _
ControlChars.Quote column.Name.Trim & ControlChars.Quote & _
" value=" & ControlChars.Quote & strValue & ControlChars.Quote & _
" /><field "
End If
Next
'finish the xml string
strErrorDetails = strErrorDetails.Substring(0, strErrorDetails.Length - 7) & _
"</fields>"
Catch ex As Exception
' if an error occurs log it and keep processing instead of killing the load
strErrorDetails = "<fields><field name=" & ControlChars.Quote & "error" & _
ControlChars.Quote & " value=" & ControlChars.Quote & ex.ToString.Trim & _
" " & strAttributeName.Trim & ControlChars.Quote & "/></fields>"
End Try
' add the xml output to the output row
Row.ErrorDetails = strErrorDetails
End Sub
EndClass
One way I have used this data is to verify I did successfully load one of a group of duplicate records. For example, in my source system I may have a duplicate record for Joe Smith. The records would consist of:
| Person No. | First Name | Middle Name | Last Name | Birth Date |
| 000001 | Joe | David | Smith | 1/17/1967 |
| 100001 | Joe | David | Smith | 1/17/1967 |
As part of my process I have created a table with all duplicates and assign one of the person no.’s as the primary person no. for the transfer. When loading the addresses for this person I will likely get at least 1 for each record meaning my second address insert will fail, violation of data integrity rules. I can use my load_errors table to find that yes I did get each address for Joe Smith loaded once.
Here is an example of how I query the error data:
Select
L.error_details,
L.package_run_date,
L.error_task,
L.error_step,
L.error_code,
L.error_desc,
P.person.value('@value','char(6)') as error_person_id
from
data_xfer.load_errors L Cross Apply
error_details.nodes('.//field[@name="personno"]') as P(person)
I do need to write specific queries based on the task that created the error as while the XML generated is a generic name – value document the names do vary.
It is a simple solution that logs errors and data. The meat of the code is not original, but found from the resources noted.
Here are some areas where it can be improved:
- If there is binary data in the row the XML output has “System.Byte[]” as the value.
- Stripping illegal characters from the XML. I am sure this can be handled better.
I welcome any comments or suggestions on how to improve the process.
References:
SSIS Junkie – SSIS Nugget: Get Error Descriptions - http://blogs.conchango.com/jamiethomson/archive/2005/08/08/1969.aspx
BI Thoughts and Theories – Address Columns Generically in a Script Component - http://agilebi.com/cs/blogs/jwelch/archive/2007/10/21/address-columns-generically-in-a-script-task.aspx
Jack Corbett

