Blog Post

Eliminating Empty Output Files in SSIS

,

So you’ve got some packages that regularly extract data to one or more text files, but you know that from time to time some of the queries will not return any data.  However, you find in SSIS that, in a flat file export package, the output file is created regardless of whether any rows are written to the file, and in cases where there are no rows returned from the source, you’ll end up with empty data files in your output directory.  Although not a critical failure, having empty output files can be a nuisance, and unfortunately, there aren’t any native settings in the Flat File Destination or the Flat File Connection Manager objects that will clean up empty files.  Fortunately, SSIS does provide a combination of other tools that you can use to emulate this behavior.

In our example, I’m going to create a package to extract data from a table using a query for which I know that no rows would be returned.  When I connect this data source to a flat file destination and execute the package, I’ll see an empty data file in my export directory.  Next, to demonstrate the intended behavior, I’ll add a Row Count transformation to store the number of affected rows in a variable, and create a File System Task object to delete the output file.  Finally, I’ll use an expression constraint to only delete the file if the variable attached to the row count is zero.

First, let’s create a test table to query:

USE testdb
GO

CREATE TABLE invoices (
  invoiceid     INT    IDENTITY ( 1 , 1 ),
  vendorid      INT,
  invoicedate   DATETIME,
  invoiceamount DECIMAL(10,2))

INSERT invoices
VALUES(12, '8/3/2009', 4125.50),
      (53, '8/13/2009', 1095.25),
      (46, '8/15/2009', 729.50),
      (33, '8/23/2009', 3421.50)


Now, I’ll create a basic package that will export to text the invoices for the past 30 days, a reasonable business requirement.  Since we don’t currently have any invoices matching that criteria, we’ll end up with an empty output file. The original package is shown below:

df_f1

 

Now for the new-and-improved version, I’m going to drop a Row Count transformation into the data flow to save the number of affected rows to a variable:

 

df_f2

 

Finally, I’ll create a File System Task and configure it to delete the output file.  To insure that a valid data file is not deleted, I’ll create a precedence constraint using an expression to only execute the delete if the row count variable is equal to 0:

 

df_f3


When you execute this package, you’ll see that the File System Task object is executed because there are no rows matching our query.  You can test the package by inserting another row into the database that will be returned by the query, and you’ll see that the data file is exported but not deleted.

The code samples showing the "before" and "after" versions of the packages can be found here.  Enjoy!

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating