Excel import - force column to be string (SSIS 2005)

  • I know this topic has been covered in a few places - but it reoccurs regularly and I have NOT got a resolution for it.

    The problem is similar to this post:

    http://www.sqlservercentral.com/Forums/Topic1052632-148-1.aspx#bm1060473

    It seems absolutely stunningly rediculous that I cannot import from Excel into SQL 2005 using an "Excel Source" and specify that one (or many) of the columns I am importing are to be (forced) treated as strings. The data I have is a column that is almost completely numbers... but the occasionally either an alphanumeric will legitmately appear or or even a very long set of digits (up to 20 characters).

    From reading around the subject the issue is the JET driver and the way it samples the data before it 'decides' what data type to use.

    Proposed solutions that I have tried are:

    1) use IMEX=1 in connection does not work as the import is many rows and the strings can legitamatly appear in the last row

    2) Change the TYPEGUESSROWS is not an option (and nor should it be in my opinion) - the dev PC is part of a BIG organisation and I do not hae admin rights to change the Registry.

    Methods that I have tried:

    3) using a 'dummy' spreadsheet to set up the connection and so make SSIS think that the column is full of strings. i.e. make a spreadsheet with string data - this works in design, but when I replace the real spreadsheet and run the datatypes are re-set!!

    4) Using CSTR(field) as part of the Command SQL that selects from excel (i. "Select CSTR(fieldname) from spreadsheet). This seems to set the data types correctly but when running SSIS throws an error.

    Am i missing something mind-numbingly obvious?

    I do not have control over the import file, I could open it programmatically and use a Script task to process the data, but surly this is inefficent and unneccessary? I could perhaps import into a text/csv and then manipulate the data from there - again surly not an 'efficient' solution?

    Any pointers much appreciated!

  • you could change the datatype at the destination table and then either impliciitly or explicitly convert the data in the data flow.

  • The data type at the destination is varchar but the assumed data type of "double" nullifies any data that doesn't fit. So the alpha numeric data and the 'numeric' looking data that is (for example) 20 digits long are set to NULL during the import - so i do not have the data to convert.

    Or do you mean something different to this?

  • About the only way to do that with the JET engine is to have some alpha characters in that field in the first few records. I agree, this is a major issue.

    I don't know if the new ACE drivers are any better - have you tried them out yet? The ACE drivers are available here. If you try them out, please get back with us on how they work out!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Thanks Wayne - no I haven't and not sure I'll be able to in the 'risk free' environment I have to enhabit!

    On this issue you're Avatar says it all I think 🙂 !

    It's crazy - I'm going to try a few more things and also see if I can get the file changed to csv.... not a pretty solution really.

  • A quick update... temporary solution is to actually turn off the "use headers" option in the connection so that the first row is text (i.e. import the headers then delete the header row later). Rubbish solution as I then need to convert the non-text fields back to whatever they are meant to be.

  • The issue is this: The way Excel is setup and interacts with SSIS, the first 8 rows of the column are critical (as set by the registry default). If you have mixed data types in the first 8 rows, ie numerics in some and alpha characters inothers, all will be imported fine. However if for example you have only numerics in the first 8 rows, then the column is "determined" to be an Integer and any alpha characters are dropped when the data is imported. The two solutions I have found so far are:

    1. Make sure if there is to be alpha data in brought in from the column then some is in the first 8 rows.

    2. Hack the registry on the machine where the job will be running to look further than 8 rows.

    I would list the "IMEX=1" "fix" except when the data is numeric for each of the first 8 rows in the column then this is overridden. So for my purposes (over 15,000 lines of data and no alpha data until about 7,000 lines in) this is a non-solution.

    I am currently exploring alternate solutions but I may have to go back to the source and require alpha data in the first 8 rows.

  • tbanaski,

    Are you saying, that for each column in an Excel file that I want to import, I need to have an alpha Numeric value in one of the first 8 rows if that column has a mix of numeric and alpha data?

    Thanks

    GF

  • Yes, as has been explored there are very few options with the JET driver (don't know about the ACE driver). The number of rows read to determine the data type can be changed, or else you need to have the alpha character in the first few rows for string columns.

    If you have a header row you can include it as data rather than a header and set all rows as text in this way, in the data flow you ignore the first row then manually convert any numbers to the appropriate type.

    Or else if you can have 2 header rows, the first being your regular header row and the second being a datatype setting row, then you can set the first row as headers as normal, and the second row will be included in the data to set the data types but you filter it out later.

  • The way I solved this problem for my group was to simply use QUOTENAME(columnname, '"') and make sure that I used '"' as the text qualifier.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It would be nice to know where you do the QUOTENAME(columnname, '"')

  • Jeff Moden (10/16/2011)


    The way I solved this problem for my group was to simply use QUOTENAME(columnname, '"') and make sure that I used '"' as the text qualifier.

    Can anyone explain more about this??

    This kind of statement is like "I know how, but I won't tell." kinda statement.

    Best Regards.

  • Donny Z (8/19/2013)


    Jeff Moden (10/16/2011)


    The way I solved this problem for my group was to simply use QUOTENAME(columnname, '"') and make sure that I used '"' as the text qualifier.

    Can anyone explain more about this??

    This kind of statement is like "I know how, but I won't tell." kinda statement.

    Take a look at: http://technet.microsoft.com/en-us/library/ms176114(v=sql.105).aspx

    A simple search brought that back. :rolleyes:

    HTH,

    Rob

  • Right click on the Excel source transform, choose Show Advanced Editor, and modify the column data type properties in the Input and Output properties tab

  • I am stuck at a similar dilemma with Excel source and Jet provider.

    Cannot edit registry key.

    I was thinking about this.. but I am not sure if it is possible.. Can I add a dummy row with more than 255 chars each time data flow task runs by using a script task and then take the dummy row out using an identifier?

    Script Task (add dummy row, move all rows one level below dummy)

    Excel Source

    Some other Task (which you would normally go to and take dummy row out)

    I have near zero scripting skills and am looking for help to see if this is a possibility

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply