My Visual Studio crashed and I didn't know why!


Recently I had to develop a simple ETL task with the help of SSIS to transfer an Excel data file into a SQL Server table for reporting. It was nothing special, simple file and only a fraction of it needed to be transfer to I had use a SQL command to make reference to a required Excel range. However a later change request to add an extra column for the dataset being transferred made me repeating a mistake that I've already made before a few years ago, but I totally forgot about that experience. And in a addition to that my Visual Studio 2008 crashed with no error messages.

I tried to replicate this test scenario using VS 2012 with a little bit more luck but still with almost no error messages to investigate after the crash.

So here is a sample Excel file:

And let's say, initially I needed to load only Column1 and Column2 data. So my data source in the data flow SSIS task with the SQL command

SELECT Column1, Column2 FROM [SheetData$B5:C10]

looked like this:

This worked well really well, until a customer requested to add a reference to the 3rd column of the file into the scope of the data transfer and I easily made this adjustment in the SQL command:

SELECT Column1, Column2, Column3 FROM [SheetData$B5:C10]

And right after pressing OK button in the dialog window the whole Visual Studio crashed with no warning message. So that's why I didn't know why it was happening .

After reopening the SSIS package, turning it to the [Offline Work] mode, changing the SQL command with the reference for the Column3 and then executing the package I was able to get more details for the original error:
in VS2008 it was "The external metadata column collection is out of synchronization with the data source"
in VS2012 it was "Complete metadata information for the data source is not available"

Refreshing metadata on the datasource didn't help, my VS kept crashing on this simple step of adding Column3 into the SQL command of the datasource task.

And a bit later I realized that the whole problem was in the SQL command itself, particularly in the FROM clause of it. If you remember the Excel file table and the original range reference was made for 2 columns using this statement: [SheetData$B5:C10]. However Column3 was out of scope of this range and my final SQL command should have been written this way:

SELECT Column1, Column2, Column3 FROM [SheetData$B5:D10]

Then instantly I realized that I had already made similar mistake of incorrect Excel range specification in the FROM clause; and I had to make it again in order to learn it this time. So I hope by creating this first post of my blog I will remember this mistake even better 🙂

Enjoy your day and happy data adventures!