So you’ve got some packages that regularly extract data to one or more text files, but you know that from time to time some of the queries will not return any data. However, you find in SSIS that, in a flat file export package, the output file is created regardless of whether any rows are written to the file, and in cases where there are no rows returned from the source, you’ll end up with empty data files in your output directory. Although not a critical failure, having empty output files can be a nuisance, and unfortunately, there aren’t any native settings in the Flat File Destination or the Flat File Connection Manager objects that will clean up empty files. Fortunately, SSIS does provide a combination of other tools that you can use to emulate this behavior.
In our example, I’m going to create a package to extract data from a table using a query for which I know that no rows would be returned. When I connect this data source to a flat file destination and execute the package, I’ll see an empty data file in my export directory. Next, to demonstrate the intended behavior, I’ll add a Row Count transformation to store the number of affected rows in a variable, and create a File System Task object to delete the output file. Finally, I’ll use an expression constraint to only delete the file if the variable attached to the row count is zero.
First, let’s create a test table to query:
USE testdb GO CREATE TABLE invoices ( invoiceid INT IDENTITY ( 1 , 1 ), vendorid INT, invoicedate DATETIME, invoiceamount DECIMAL(10,2)) INSERT invoices VALUES(12, '8/3/2009', 4125.50), (53, '8/13/2009', 1095.25), (46, '8/15/2009', 729.50), (33, '8/23/2009', 3421.50)
Now, I’ll create a basic package that will export to text the invoices for the past 30 days, a reasonable business requirement. Since we don’t currently have any invoices matching that criteria, we’ll end up with an empty output file. The original package is shown below:
Now for the new-and-improved version, I’m going to drop a Row Count transformation into the data flow to save the number of affected rows to a variable:
Finally, I’ll create a File System Task and configure it to delete the output file. To insure that a valid data file is not deleted, I’ll create a precedence constraint using an expression to only execute the delete if the row count variable is equal to 0:
When you execute this package, you’ll see that the File System Task object is executed because there are no rows matching our query. You can test the package by inserting another row into the database that will be returned by the query, and you’ll see that the data file is exported but not deleted.
I’ve attached the before and after SSIS packages if you’d like to take it for a test drive. Enjoy!
I cannot find the attached SSIS packages
Where can I doenload them
kind regrds
Rene van Noorloos
I do have this problem of empty files. I am fairly new to SSIS and would appreciate being able to learn from your SSIS packages described above. Where can I go to downlod them?
Thank You.
Rick Hardy
You could also use an execute sql task to capture the count of rows value and then set an expression on the disable property of the data flow. So when @[User::RowCount]==0 is true, the Disable property would trigger, and the file wouldn't even be generated if there were no rows.
I like juliesmith's approach since there is no file IO operation involved since we are determining file creation before hand.
Thanks to all.
Amol
I too, am an SSIS newbie and am running into problems with this exercise. If you could point us to your solution, it would be greatly appreciated.
For those who requested the sample files, I'm not sure why they don't show up on this post - I have included them on the attachments. I'll get with Steve to find out how to display them. In the meantime, shoot me an email at tdmitch [at] gmail and I'll email you the packages. Thanks!
Hello Juliesmith,
I am a newbie in this area.
Can you be more detail how to Disable property would trigger, and the file wouldn't even be generated if there were no rows.
Thanks in advance.
Krishna.
Here is a step by step.
1. Create a variable (I always use the whole package as my scope, so that I don't accidentally name two variables the same thing) data type is Int32.
Make sure the value of the variable is hardcoded in the beginning to something > 0 (or the data flow will NEVER) work.
2. Add an Execute SQL task to the package previous to the execution of the data flow task which moves the data from the table to the output file.
The sql is something like:
SELECT
COUNT(invoiceid)
AS CountOf
FROM dbo.invoices
3. Set the result set property on the general tab of the execute sql task to single row.
4. Click the result tab of the execute sql task.
5. Click add to add a row to resultset.
6. Change the name of the result to 0 (zero--this is important and NOT intuitive)
7. Map the variable name to your variable beside the ResultSet Name from the drop down of variables.
You are done configuring the execute sql task.
1. Now, select your data flow which does the work of moving the records to the output file.
2. Go to the properties window of the data flow task.
3. Click on the expressions field of the properties window
4. Click the ellipses
Clicking the ellipses brings up the Property Expressions Editor
5. Select Disable from the drop down under property.
6. Click the ellipses beside the expression box for it.
This brings up the expression builder.
7. Drag your variable down to the expression window from the variables in upper left pane.
8. Follow it with == 0 (should look like this @[User::RowCount]==0 )
9. Click ok to close the expression builder and then click ok to close the property expression editor .
WOOTNESS!!! This worked perfectly - thank you!!!!
Set 'DelayValiation' property of the data flow task to true.