Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Dynamically create and populate staging tables from CSV files

By Stan Kulp,

The following SSIS package dynamically creates staging tables from comma-separated value (CSV) files in a subdirectory using the field names contained in the first line of each file, then populates the newly created tables with the data contained in the remaining lines of each file. Each staging table is given the same name as the corresponding CSV file.

Begin by creating an Integration Services project named 'DynamicStagingTables.'

Drag and drop a Script Task component from the Toolbox to the Control Flow panel of the default SSIS package.

Execute the following SQL statements in SQL Server Management Studio to create a database named 'DynamicStagingTables.'

USE master
GO

BEGIN TRY
DROP DATABASE DynamicStagingTables
END TRY
BEGIN CATCH
END CATCH

CREATE DATABASE DynamicStagingTables

'Command(s) completed successfully' will be displayed if the database has been created.

Right-click the 'ConnectionManagers' panel and select 'New ADO.NET Connection...' from the menu list.

Click the 'New' button of the 'Configure ADO.NET Connection Manager' screen.

Configure the appropriate server name and database entries, then click the 'OK' button.

Select the newly created 'DynamicStagingTables' connection manager and click the 'OK' button.

The 'DynamicStagingTables' connection manager is now available in the "Connection Managers' panel.

Double-click on the 'Script Task' component to bring up the 'Script Task Editor.'

Select 'Microsoft Visual Basic 2008' from the 'ScriptLanguage' menu list and click the 'Edit Script' button.

The empty VB.NET script template looks like this.

Add this code to the 'Imports' section of the script.

        Imports System.IO

Replace the commented lines around the text 'Add your code here with the following VB.NET code.

        Dim source_directory As String = "C:\Temp\"
        Dim source_file As String = "*.csv"

        Dim di As New System.IO.DirectoryInfo(source_directory)
        Dim aryFi As IO.FileInfo() = di.GetFiles(source_file)
        Dim fi As IO.FileInfo

        Dim message As String = "The following files matching the pattern '" & source_file & "' were found in '" & source_directory & ".'" & Chr(13) & Chr(10) & Chr(13) & Chr(10)

        For Each fi In aryFi
            message += di.ToString & fi.ToString & Chr(13) & Chr(10)
        Next

        Dim button1 As DialogResult = MessageBox.Show(message, "Matching Files in Designated Subdirectory", MessageBoxButtons.OK)

        For Each fi In aryFi

            Dim sLine As String
            Dim arrText As New ArrayList()
            Dim objReader As New StreamReader(di.ToString & fi.ToString)

            Do
                sLine = objReader.ReadLine()
                If Not sLine Is Nothing Then
                    arrText.Add(sLine)
                End If
            Loop Until sLine Is Nothing

            objReader.Close()

            Dim FirstLine As String = ""

            For Each sLine In arrText
                If FirstLine = "" Then
                    FirstLine = sLine
                End If
            Next

            Dim FieldNames() As String = Split(FirstLine, ",")
            Dim FieldList As String = "These are the parsed fields in the first line of the '" & fi.ToString & "' file." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
            Dim count As Integer = Split(FirstLine, ",").Length
            Dim mySqlStatement As String = "BEGIN TRY DROP TABLE " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " END TRY BEGIN CATCH END CATCH CREATE TABLE " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " ("
            Dim comma As String = ""

            For index As Integer = 0 To count - 1
                mySqlStatement += comma & FieldNames(index) & " VARCHAR(20)"
                comma = ","
                FieldList += FieldNames(index) & Chr(13) & Chr(10)
            Next

            mySqlStatement += ")"

            Dim button2 As DialogResult = MessageBox.Show(FieldList, "Field Names", MessageBoxButtons.OK)

            Dim msg As String = "This SQL statement will be used to create the '" & fi.ToString.Substring(0, fi.ToString.Length - 4) & "' table." & Chr(13) & Chr(10) & Chr(13) & Chr(10)

            Dim button3 As DialogResult = MessageBox.Show(msg & mySqlStatement, "DDL Statement", MessageBoxButtons.OK)

            Dim myADONETConnection As SqlClient.SqlConnection = DirectCast(Dts.Connections("XW4100-9.DynamicStagingTables").AcquireConnection(Dts.Transaction), SqlClient.SqlConnection)
            Dim myCommand As New SqlClient.SqlCommand(mySqlStatement, myADONETConnection)
            myCommand.ExecuteNonQuery()
            myADONETConnection.Close()

            FirstLine = ""
            Dim line_count As Integer = 1

            For Each sLine In arrText

                If FirstLine = "" Then
                    FirstLine = sLine
                Else
                    line_count += 1

                    Dim fields() As String = Split(sLine, ",")
                    Dim ValueList As String = "These are the parsed values in line " & line_count & " of the '" & fi.ToString & "' file." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
                    Dim FieldCount As Integer = Split(sLine, ",").Length

                    mySqlStatement = "INSERT INTO " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " VALUES ("

                    comma = ""

                    For FieldIndex As Integer = 0 To FieldCount - 1
                        mySqlStatement += comma & "'" & fields(FieldIndex) & "'"
                        ValueList += fields(FieldIndex) & Chr(13) & Chr(10)
                        comma = ","
                    Next

                    mySqlStatement += ")"

                    Dim button4 As DialogResult = MessageBox.Show(ValueList, "Field Values", MessageBoxButtons.OK)

                    msg = "This SQL statement will be used to insert record " & line_count - 1 & " into the '" & fi.ToString.Substring(0, fi.ToString.Length - 4) & "' table." & Chr(13) & Chr(10) & Chr(13) & Chr(10)

                    Dim button5 As DialogResult = MessageBox.Show(msg & mySqlStatement, "DML Statement", MessageBoxButtons.OK)

                    myADONETConnection = DirectCast(Dts.Connections("XW4100-9.DynamicStagingTables").AcquireConnection(Dts.Transaction), SqlClient.SqlConnection)
                    myCommand = New SqlClient.SqlCommand(mySqlStatement, myADONETConnection)
                    myCommand.ExecuteNonQuery()
                    myADONETConnection.Close()

                End If

            Next

            Dim button6 As DialogResult = MessageBox.Show("The file '" & di.ToString & fi.ToString & "' will be moved to '" & di.ToString & "Backup\" & fi.ToString & ".'", "'Move Data File' Command", MessageBoxButtons.OK)
            Dim file As New System.IO.FileInfo(di.ToString & fi.ToString)

            Try

                file.MoveTo(di.ToString & "Backup\" & fi.ToString)

            Catch ex As Exception

                Dim temp As New System.IO.FileInfo(di.ToString & "Backup\" & fi.ToString)
                temp.Delete()
                file.MoveTo(di.ToString & "Backup\" & fi.ToString)

            End Try

        Next

Create a subdirectory named 'C:\Temp\' and copy the three text files from the attached 'csv_files.zip' file into it, then create the folder 'Backup' inside 'C:\Temp\.'

Each CSV file contains four comma-separated value lines. The first line contains the field names, the remaining three lines contain the values to be mapped to those fields in each line.

This is the first code executed when you run the SSIS package.

        Dim source_directory As String = "C:\Temp\"
        Dim source_file As String = "*.csv"

        Dim di As New System.IO.DirectoryInfo(source_directory)
        Dim aryFi As IO.FileInfo() = di.GetFiles(source_file)
        Dim fi As IO.FileInfo

        Dim message As String = "The following files matching the pattern '" & source_file & "' were found in '" & source_directory & ".'" & Chr(13) & Chr(10) & Chr(13) & Chr(10)

        For Each fi In aryFi
            message += di.ToString & fi.ToString & Chr(13) & Chr(10)
        Next

        Dim button1 As DialogResult = MessageBox.Show(message, "Matching Files in Designated Subdirectory", MessageBoxButtons.OK)

It reads the names of the all the files with a '.csv' extension in the 'C:\Temp\' subdirectory into an array list and displays them in a popup.

When you click the 'OK' button, this code is executed.

        For Each fi In aryFi

            Dim sLine As String
            Dim arrText As New ArrayList()
            Dim objReader As New StreamReader(di.ToString & fi.ToString)

            Do
                sLine = objReader.ReadLine()
                If Not sLine Is Nothing Then
                    arrText.Add(sLine)
                End If
            Loop Until sLine Is Nothing

            objReader.Close()

            Dim FirstLine As String = ""

            For Each sLine In arrText
                If FirstLine = "" Then
                    FirstLine = sLine
                End If
            Next

            Dim FieldNames() As String = Split(FirstLine, ",")
            Dim FieldList As String = "These are the parsed fields in the first line of the '" & fi.ToString & "' file." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
            Dim count As Integer = Split(FirstLine, ",").Length
            Dim mySqlStatement As String = "BEGIN TRY DROP TABLE " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " END TRY BEGIN CATCH END CATCH CREATE TABLE " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " ("
            Dim comma As String = ""

            For index As Integer = 0 To count - 1
                mySqlStatement += comma & FieldNames(index) & " VARCHAR(20)"
                comma = ","
                FieldList += FieldNames(index) & Chr(13) & Chr(10)
            Next

            mySqlStatement += ")"

            Dim button2 As DialogResult = MessageBox.Show(FieldList, "Field Names", MessageBoxButtons.OK)

It loops through the array list containing the CSV file names and reads all the lines of the first CSV file into another array list. It then loops through the new array list to read the first line. It parses this line into field names and displays them in a popup.

When you click the 'OK' button the following cod is executed...

            Dim msg As String = "This SQL statement will be used to create the '" & fi.ToString.Substring(0, fi.ToString.Length - 4) & "' table." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
            Dim button3 As DialogResult = MessageBox.Show(msg & mySqlStatement, "DDL Statement", MessageBoxButtons.OK)

...which displays the SQL statement that will be used to create the first table from the fields in the first line of the first CSV file.

When you click the 'OK' button of the popup, the DDL statement is executed from the following code, creating the 'people' table.

    Dim myADONETConnection As SqlClient.SqlConnection = DirectCast(Dts.Connections("XW4100-9.DynamicStagingTables").AcquireConnection(Dts.Transaction), SqlClient.SqlConnection)
    Dim myCommand As New SqlClient.SqlCommand(mySqlStatement, myADONETConnection)
    myCommand.ExecuteNonQuery()
    myADONETConnection.Close()

At this point the 'people' table has been created and is ready to accept records.

The following code loops through the csv file array list once more, skipping the first line and parsing the second line for data to map to the table fields.

            FirstLine = ""
            Dim line_count As Integer = 1

            For Each sLine In arrText

                If FirstLine = "" Then
                    FirstLine = sLine
                Else
                    line_count += 1

                    Dim fields() As String = Split(sLine, ",")
                    Dim ValueList As String = "These are the parsed values in line " & line_count & " of the '" & fi.ToString & "' file." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
                    Dim FieldCount As Integer = Split(sLine, ",").Length

                    mySqlStatement = "INSERT INTO " & fi.ToString.Substring(0, fi.ToString.Length - 4) & " VALUES ("

                    comma = ""

                    For FieldIndex As Integer = 0 To FieldCount - 1
                        mySqlStatement += comma & "'" & fields(FieldIndex) & "'"
                        ValueList += fields(FieldIndex) & Chr(13) & Chr(10)
                        comma = ","
                    Next

                    mySqlStatement += ")"

                    Dim button4 As DialogResult = MessageBox.Show(ValueList, "Field Values", MessageBoxButtons.OK)            

The 'button4' popup displays the field values contained by the second line of the first CSV file.

When you click the 'OK' button the following code is executed...

                    msg = "This SQL statement will be used to insert record " & line_count - 1 & " into the '" & fi.ToString.Substring(0, fi.ToString.Length - 4) & "' table." & Chr(13) & Chr(10) & Chr(13) & Chr(10)
                    Dim button5 As DialogResult = MessageBox.Show(msg & mySqlStatement, "DML Statement", MessageBoxButtons.OK)

...which displays the SQL statement that will insert the first record into the new staging table.

When you click the 'OK' button, the following code is executed to insert the record into the table.

                    myADONETConnection = DirectCast(Dts.Connections("XW4100-9.DynamicStagingTables").AcquireConnection(Dts.Transaction), SqlClient.SqlConnection)
                    myCommand = New SqlClient.SqlCommand(mySqlStatement, myADONETConnection)
                    myCommand.ExecuteNonQuery()
                    myADONETConnection.Close()

                End If

After all three records have been inserted into the first staging table, the following code is executed...

            Dim button6 As DialogResult = MessageBox.Show("The file '" & di.ToString & fi.ToString & "' will be moved to '" & di.ToString & "Backup\" & fi.ToString & ".'", "'Move Data File' Command", MessageBoxButtons.OK)
            Dim file As New System.IO.FileInfo(di.ToString & fi.ToString)

            Try

                file.MoveTo(di.ToString & "Backup\" & fi.ToString)

            Catch ex As Exception

                Dim temp As New System.IO.FileInfo(di.ToString & "Backup\" & fi.ToString)
                temp.Delete()
                file.MoveTo(di.ToString & "Backup\" & fi.ToString)

            End Try

...which displays the following popup before moving the CSV file to the 'Backup' subdirectory.

Keep clicking 'OK' buttons until all three tables contain three records, then execute the following query in SQL Server Management Studio.

USE DynamicStagingTables
GO

SELECT * FROM [DynamicStagingTables].[dbo].[people]
SELECT * FROM [DynamicStagingTables].[dbo].[products]
SELECT * FROM [DynamicStagingTables].[dbo].[states]

This confirms that staging tables have been created and populated from the original CSV files.


Conclusion

The purpose of a staging table is to get data from a file into a table with as little fuss as possible, where it can be processed further using SQL. SSIS Data Flow Tasks are orders-of-magnitude faster than any script task written in sequential code. The method described isn't practical for large data files, but could work satisfactorily for smaller (under 10,000 lines) data files.

Resources:

csv_files.zip | DynamicStagingTables2008.dtsx | DynamicStagingTables2005.dtsx
Total article views: 10277 | Views in the last 30 days: 22
 
Related Articles
ARTICLE

TSQL Challenge 78 - Parse single field into multiple fields

The database field contains long string with data for Quantity and Parts. The challenge is to parse ...

FORUM

Manipulate String

String Output

SCRIPT

Delimited String Parser

Parses delimited string into a table of up to 9 varchar fields.

FORUM

string Comparison.

'string' = 'string___' ???

FORUM

Deconstructing and Reconstructing a character string field.

Using Numbers table and FOR XML to re-build a text string.

 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones