Importing Excel File - Strings as Numbers

  • I am trying to set up an SSIS package to import an Excel file.  One of the columns contains text that sometimes looks like numbers.  (For example: 115626, G11584, 117897, N78956)  While I can get SSIS to get the data into a varchar field in a SQL table, it ignores the fields with the characters in front of them and brings them across as null.  I believe I have run into this before and 'tricked' SSIS into thinking the field is a string by putting an obvious string as the first row but this is not working at the moment.  Plus, it would require an intervening step in the process which is not cool!

    I was able to change the External and Output columns of the source to strings but it is the Source Error Output column which will not come in line and causes a discrepancy between the External Column and the Error output column.  Therefore SSIS generates an error and will not run.

    Any suggestions?

    TIA,

    Beth

  • This was removed by the editor as SPAM

  • Just did some importing of excel.  I found that importing to a VarChar was a problem.  But convert the fields to NVarChar solved a lot of problems.

    chuck

  • Thanks for the suggestion Chuck. 

    Unfortunately, the problem is not getting the data into the SQL table, it is getting it out of Excel.  SSIS is very flexible and allows me to change the data types for the External and Output Columns of the source; however it will not let me change the data type of the Error Output columns.

    When I change the External Column to string, the Error Output column will error out because of the mismatch (it remains float) and will not allow me to change it.  Because I can not change the External Column to string, the only values it will read on numeric ones.  It will bring a null for cases that are actually strings.

    This seems very goofy to me, there has to be an easy way to change the way SSIS views an Excel column.

    Beth

  • The problem you are experiencing is fairly common when importing from Excel. The problem occurs because the OLE/DB provider for Excel scans the first few rows of a spreadsheet to determine the datatype for each column you are using. Based on the values it finds in these rows, it guesses what the datatype should be and uses that for that column. So, if you have numerics in the first few rows, chances are you will get a numeric datatype and when a non-numeric value is encountered the OLE/DB providers says "OK, that is not a number so I will send nulls instead".

    There are ways to adjust this behaviour. Have a look at the properties of your Excel connection. It should look something like :

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Temp\MySpreadsheet.xls;Extended Properties="Excel 8.0;HDR=YES";

    Try changing the Extended Properties to

    Properties="Excel 8.0;HDR=YES;IMEX=1;";

    Another trick is to change the number of rows that Excel checks when determining that datatype. From memory, this defaults to 8. This value is stored in the registry so the usual warnings about editting the registry apply. And these changes will need to be made wherever the package is being run. Have a look at http://support.microsoft.com/kb/281517 - this has all the details on this.

    Cheers

    Stephen

  • Thanks Stephen - this worked like a charm!  I changed the extended property of the connection to convert the mixed to text.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply