SQLServerCentral Article

Batch ETL of Multiple Data Files Using an SSIS Foreach Loop Container

,

An SSIS ETL package created with the Import/Export Wizard will process one data file at a time. There may be situations, however, when multiple data files will be arriving continuously for extraction to the same table. A base SSIS package generated using the Import/Export Wizard can be modified to process these files in batch mode.

In this tutorial we will take an SSIS package generated by the Import/Export wizard and modify it to process multiple files and move those files to a backup directory.

1: Download the tutorial files

The zip file attached to this tutorial contains a default SSIS package (BaseBatchETL.dtsx) generated by the Import/Export Wizard, the same SSIS package modified to process multiple files in batch mode (BatchETL.dtsx), a SQL script (CreateCustomersTable.sql) for generating the ETL destination table, and six data files (NewCustomersXX.txt) for extraction into the destination table.

2: Create the destination table with the CreateCustomersTable.sql script

Open the CreateCustomersTable.sql script in SQL Server Management Studio.

USE [TestDB]
GO
/****** Object:  Table [dbo].[customers]    Script Date: 05/10/2013 15:46:30 ******/SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[customers](
    [FirstName] [nvarchar](100) NULL,
    [LastName] [nvarchar](100) NULL,
    [Company] [nvarchar](100) NULL,
    [Address] [nvarchar](100) NULL,
    [City] [nvarchar](100) NULL,
    [County] [nvarchar](100) NULL,
    [State] [nvarchar](100) NULL,
    [ZIP] [nvarchar](100) NULL,
    [Phone] [nvarchar](100) NULL,
    [Fax] [nvarchar](100) NULL,
    [Email] [nvarchar](100) NULL,
    [Web] [nvarchar](100) NULL
) ON [PRIMARY]
GO

Execute the script to create the destination table.

Browse to the new table to confirm that it was created.

3: Create source and backup folders and place the data files in the source folder

Create the folders C:\BatchETL and C:\BatchETL\backup and copy the six data files to the C:\BatchETL folder.

4: Open, configure and execute BaseBatchETL.dtsx in Business Intelligence Development Studio

Open the BaseBatchETL.dtsx SSIS package in BIDS to show that it consists of a Data Flow Task generated by the Import/Export Wizard.

Double-click the DestinationConnectionOLEDB connection manager to bring up the Connection Manager configuration panel, change the server and database names to the server and database where you created the customers table and click the OK button.

Double-click on the SourceConnectionFlatFile connection manager to show that the connection manager is pointing to the NewCustomers01 data file in the C:\BatchETL\ folder.

Execute the BaseBatchETL package.

Confirm that the 5000 records in the NewCustomers01.txt file have been added to the customers table.

Truncate the customers table to prepare for execution of the BatchETL.dtsx package we will create from the BaseBatchETL.dtsx package.

5. Add a Foreach Loop Container to BaseBatchETL.dtsx

Drag-and-drop a Foreach Loop Container from the Toolbox into the Control Flow panel.

Drag the Data Flow Task into the Foreach Loop Container.

6. Configure the Foreach Loop Container

Click on the Variables tab and ad the global variable FilePath.

Double-click on the Foreach Loop Container to bring up the Foreach Loop Editor, then click on the Collection node.

Change the contents of the Folder textbox to C:\BatchETL\ and the Files textbox to NewCustomers*.txt, then click on the Variable Mappings node.

Select the Variable User::FilePath and set the Index to zero, then click the OK button to close the Foreach Loop Editor.

7. Configure the SourceConnectionFlatFile connection manager

Click on the SourceConnectionFlatFile connection manager to bring it up in the properties window. Click on the button next to the Expressions parameter...

...to bring up the property expressions editor. Set the Connection string property to the FilePath global variable we created earlier, then click the OK button to return to the main editing window.

At this point the package will process multiple files. We are now going to add a script task to move each data file to backup as it is extracted.