The Multi Phase Data Pump

By Dinesh Asanka

Multiphase Data Pump

Introduction

When transferring data with DTS, the multiphase data pump is a very flexible task, but it is not just one process; there are several phases attached to this task. If your developers are allowed to write code for each of the available phases, they can add a great deal of flexibility to the DTS process. Among the functionality this provides are the capabilities listed below.

  • The ability to restart the data pump without having to reload a large number of rows that were already processed. You can add functions to save processed row data, batches, or partial batches, writing that data back to the source or to a status table for later use.

  • Individual handling of different types of insert or transformation errors. For example, you could add special error handlers to customize the handling of problems such as NULL data or constraint violations.

  • Customizing data pump initialization or termination steps. For example, on data pump initialization you could write out a schema header to a file prior to writing XML data to the file, as sketched after this list.
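
As an illustration of that initialization step, here is a minimal sketch of a Pre Source phase function that writes a header before any rows are transformed. The file path E:\CustomerExport.xml and the Customers root element are assumptions for the example, not part of the package described in this article.

Function PreSourceMain()
    'Create the export file and write a header before any rows arrive
    Dim fso, outFile
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set outFile = fso.CreateTextFile("E:\CustomerExport.xml")   'assumed path
    outFile.WriteLine "<?xml version=""1.0""?>"
    outFile.WriteLine "<Customers>"                             'assumed root element
    outFile.Close
    PreSourceMain = DTSTransformStat_OK
End Function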

This option can be enabled from Enterprise Manager. Right-click the Data Transformation Services folder and select Properties, then enable the Show multi-phase pump in DTS designer option.

Data Pump Phases

The following phases are available to the data pump process:

1. Pre Source Phase. This is activated before the whole data pump and therefore only once for this task.

2. Row Transform Phase. This is the actual moving of the data. It's what you would normally see in a data pump.

3. Post Row Transform Phase (On Transform Failure). Caused by errors such as data type conversion errors.

4. Post Row Transform Phase (On Insert Failure). Caused by errors such as Primary key Violations.

5. Post Row Transform Phase (On Insert Success). This fires when the row is valid and moves to the destination.

6. Batch Complete Phase. This will fire for every batch insert you do, based on the Insert Batch Size.

7. Post Source Data Phase. Executes after the last row of data has been transformed (or not). This is also executed only once for the task.

8. Pump Complete Phase. As the name suggests, this fires after the data pump completes. This phase also executes only once for the entire task.

Source: SQL Server 2000 Books Online

How to Configure the Data Pump

Let's start with a simple example. We will use a simple DTS package to transfer data from Access to SQL Server. The package contains two tasks, as shown below.

Right-click the Transform Data Task and select Properties, then select the Transformations tab. You will see a dialog like the one below.

Remove all the existing predefined transformations that DTS creates for you when the task is built. Then press the New button, select ActiveX Script, and press OK. You will get a dialog with a tab called "Phases", which lists all the phases of the data pump as shown below.

You can now select the phases that you need. Then select the General tab of the same dialog and press the Properties button. In that dialog, select the Phases tab. You will see code stubs for each of the phases that are enabled.

In the ActiveX Script Transformation Properties dialog you can add whatever functions and code you need for each phase. For example, you can log errors to a text file, insert them into a table, or display them in a message box. If you are a developer, you know how helpful those error messages are. The code to write these messages to a file is below.

'**********************************************************************
' Visual Basic Transformation Script
'**********************************************************************

' Script-level setup: create the log file and initialize the row counter
strLogFile = "E:\DTSLOG.txt"
Set fso = CreateObject("Scripting.FileSystemObject")
Set logfile = fso.CreateTextFile(strLogFile)
DTSGlobalVariables("CurrentRow").Value = 0

' Row Transform phase: copy each source column to the destination column
Function Main()
    DTSDestination("Fname") = DTSSource("Fname")
    DTSDestination("LName") = DTSSource("LName")
    'Check for NULLs and mark the row as an error so that the
    'Transform Failure phase gets called.
    If IsNull(DTSSource("Fname")) Or IsNull(DTSSource("LName")) Then
        Main = DTSTransformStat_Error
    Else
        Main = DTSTransformStat_OK
    End If
End Function

' Pre Source phase: executes once before the first row is fetched
Function PreSourceMain()
    'Initialize global variables
    DTSGlobalVariables("BatchesComplete").Value = 0
    'Write a header line to the log file
    logfile.WriteLine "Package execution started on: " & Now & "."
    PreSourceMain = DTSTransformStat_OK
End Function

' Post Source Data phase: executes once after the last row is transformed
Function PostSourceMain()
    PostSourceMain = DTSTransformStat_OK
End Function

' On Transform Failure: called when Main returns DTSTransformStat_Error
Function TransFailureMain()
    DTSGlobalVariables("CurrentRow").Value = DTSGlobalVariables("CurrentRow").Value + 1
    'write status to log file
    logfile.WriteLine "Could not insert source data at row " & DTSGlobalVariables("CurrentRow").Value
    'do not insert the row at the destination
    TransFailureMain = DTSTransformStat_SkipInsert
End Function

' On Insert Success: count each row that reaches the destination
Function InsertSuccessMain()
    DTSGlobalVariables("CurrentRow").Value = DTSGlobalVariables("CurrentRow").Value + 1
    InsertSuccessMain = DTSTransformStat_OK
End Function

' On Insert Failure: no special handling in this example
Function InsertFailureMain()
    InsertFailureMain = DTSTransformStat_OK
End Function

' Batch Complete phase: fires once per batch, based on the insert batch size
Function BatchCompleteMain()
    DTSGlobalVariables("BatchesComplete").Value = DTSGlobalVariables("BatchesComplete").Value + 1
    BatchCompleteMain = DTSTransformStat_OK
End Function

' Pump Complete phase: executes once after all rows have been processed
Function PumpCompleteMain()
    logfile.WriteLine "Package execution completed on: " & Now & "."
    logfile.WriteLine DTSGlobalVariables("BatchesComplete").Value & " batches processed."
    'cleanup code
    logfile.Close
    Set fso = Nothing
    PumpCompleteMain = DTSTransformStat_OK
End Function

If you ran this for a package, the text file output would be something like this:

Package execution started On: 2005-03-09 3:31:51 PM.
Could not insert source data at row 2
Could not insert source data at row 3
Package execution completed on: 2005-03-09 3:31:51 PM.
1 batches processed.

From the above output we can verify that errors occurred in rows 2 and 3.

Below I have listed some of the constants that will be helpful when programming this task in DTS.

  • DTSTransformStat_AbortPump: Processing is terminated with the current row.
  • DTSTransformStat_Error: Indicates the transformation encountered an error.
  • DTSTransformStat_ErrorSkipRow: Terminate further processing of this row, for nonerror reasons.
  • DTSTransformStat_ExceptionRow: Terminate further processing of this row as an exception, call the error sink, and write this row to the exception file.
  • DTSTransformStat_NoMoreRows: The current row is the last to be processed. The current row is processed as specified by other transformation status values. This value differs from DTSTransformStat_AbortPump in that no error is raised.
  • DTSTransformStat_OK: Default conversions (if any) succeeded. Write the row to the destination if specified, without calling any error handlers.
  • DTSTransformStat_SkipFetch: Do not fetch the next row; reexecute all transforms against the current source and destination rows.
  • DTSTransformStat_SkipInsert: Do not write the current row to the destination.

Source: SQL Server 2000 Books Online
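
To illustrate how these status constants are used, here is a minimal sketch of a row transform that routes bad rows to the exception file instead of raising an error. The Fname and LName columns come from the example above; it assumes an exception file has been configured for the task on its Options tab.

Function Main()
    DTSDestination("Fname") = DTSSource("Fname")
    DTSDestination("LName") = DTSSource("LName")
    'Send rows with a NULL LName to the exception file and skip the insert
    If IsNull(DTSSource("LName")) Then
        Main = DTSTransformStat_ExceptionRow
    Else
        Main = DTSTransformStat_OK
    End If
End Function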

Insert Failure

The insert failure function is triggered when a row fails to insert at the destination. A typical example of this kind of failure is trying to insert a NULL value into a non-nullable column.
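
A minimal sketch of handling this phase, reusing the logfile object and the CurrentRow global variable from the script above, would log the rejected row rather than silently ignoring it:

Function InsertFailureMain()
    DTSGlobalVariables("CurrentRow").Value = DTSGlobalVariables("CurrentRow").Value + 1
    'record which row the destination rejected, e.g. a NULL in a non-nullable column
    logfile.WriteLine "Insert failed at row " & DTSGlobalVariables("CurrentRow").Value
    InsertFailureMain = DTSTransformStat_OK
End Function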

Transform Failure

A simple example of this kind of failure is supplying data of a different type than the column allows. For example, if DTS tries to transfer data from a char field into an int field, this event fires.
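
As a minimal sketch (the Age column name is an assumption for illustration), the row transform below validates the source value before converting it, so non-numeric data is handed to the Transform Failure phase instead of causing a raw conversion error:

Function Main()
    If IsNumeric(DTSSource("Age")) Then
        DTSDestination("Age") = CInt(DTSSource("Age"))
        Main = DTSTransformStat_OK
    Else
        'hand the row over to TransFailureMain
        Main = DTSTransformStat_Error
    End If
End Function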

Conclusion

DTS often gives us headaches when errors occur. As with other development tools, it is much easier if we can debug the package to find these errors. By using the multiphase data pump mechanism, developers will be able to tackle more complicated DTS packages while making their job much easier.
