OPENROWSET - how to read everything as text

  • INSERT INTO [NewTable]

    SELECT * FROM OPENROWSET

    (

    'MSDASQL', 'Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\SomeFolder\;'

    , 'SELECT * from [SomeFile.csv]'

    );

    The problem is that apparently the driver tries to guess the datatype for each field, and where the cast fails it simply reads in a null. For example, lets say I have the following

    SomeCode SomeName

    100 A

    299 B

    22 C

    123 D

    ABC E

    900 F

    It seems to figure that "SomeCode" is an integer, and it will read "ABC" as NULL. Is there any way I can stop this from happening. All I want is for the data to be handled as varchars all the way through

  • In the olden days we did this kind of thing with BCP. This still works ok for ad hoc imports, but it is not the easiest tool in the world to use. SQL used to come with a tool called DTS that was really easy to do ad hoc imports with. They replaced it with SSIS.

    If you are doing more than an ad hoc import you probably want to look at something like SSIS or an actual program to do the import. Lack of error trapping logging of failed rows will make this type of solution non-viable for production use.

  • If do try and use a *.csv (Comma Separated Values) driver it will probably work better on data like this:

    IntColumn1 CharColumn2 CharColumn3

    1,"Row1Col2","Row1Col3"

    2,"Row2Col2","Row2Col3"

  • If you can initiate the process from outside the database engine I would recommend SSIS or BCP. If using only T-SQL is a requirement I would recommend using the BULK INSERT command. All three tools that I mentioned leverage the same underlying technology, namely the SQL Server Bulk-copy API, and all three support the same functional options but in three different implementation forms. Hopefully one of them will fit your needs.

    In any case I would stay away from the Microsoft Text Driver. There are too many other better tools for this kind of work.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Once you get the syntax of the format file down, the BULK INSERT command is a rock-solid approach. You get to control the data type and maximum length of each and every column, the delimiters and everything else. You can write the whole thing in a stored procedure and the format file controls how the data is brought you into the staging table. I've used it for years and it's very reliable.

    The format file is covered at https://msdn.microsoft.com/en-us/library/ms190393.aspx.

  • Ed Wagner (1/29/2016)


    Once you get the syntax of the format file down, the BULK INSERT command is a rock-solid approach. You get to control the data type and maximum length of each and every column, the delimiters and everything else. You can write the whole thing in a stored procedure and the format file controls how the data is brought you into the staging table. I've used it for years and it's very reliable.

    The format file is covered at https://msdn.microsoft.com/en-us/library/ms190393.aspx.

    I'll also add that it will also allow you to sequester bad rows in a file for troubleshooting without it blowing the whole batch off, if you need something like that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply