OPENROWSET BULK misses last row of file

  • I have a problem whereby some of our import files are missing the row terminator (carriage return) in the last row of the text file. When I use OPENROWSET BULK to import the files, the last row of the file is missed when it excludes the row terminator ie. just EOF. This would be OK if OPENROWSET returned an error (MAXERRORS=0), but instead it just ignores the last row. Because we have so many files to import, I am faced with having to write a CLR component to verify that the last row terminator exists before the file is imported. From my searches online, I don't see that anyone else has raised this issue, so I am wondering whether I have missed the obvious.

    Here's a simplified clip of my script (the dynamic SQL IS required because of runtime variations in file paths and table/field names):

    SET @OpenRowSetSQL = 'INSERT INTO MyInsertTable (Field1, Field2 Field3, Field4) SELECT Field1, Field2, Field3,@SomeOtherVariable FROM OPENROWSET(BULK ''c:\MyImportFile.txt'', FORMATFILE=''MyFormatFile.xml'', MAXERRORS=0) txt'

    EXEC sp_ExecuteSQL @OpenRowSetSQL

    The format file looks like this:

    <?xml version="1.0"?>

    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <RECORD>

    <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>

    <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>

    <FIELD ID="3" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>

    <FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r" MAX_LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>

    </RECORD>

    <ROW>

    <COLUMN SOURCE="1" NAME="Username" xsi:type="SQLNVARCHAR"/>

    <COLUMN SOURCE="2" NAME="ActionRef" xsi:type="SQLNVARCHAR"/>

    <COLUMN SOURCE="3" NAME="ReadOnly" xsi:type="SQLBIT"/>

    <COLUMN SOURCE="4" NAME="ScopeID" xsi:type="SQLNVARCHAR"/>

    </ROW>

    </BCPFORMAT>

    [actually the row terminator for field ID 4 is "backslash r backslash n", but when I post to this forum the backslash n isn't shown, presumably because its a control character sequence]

    And as I mentioned, if the row terminator exists on the last row, everything is fine. But if the row terminator is missing from the last row, all but that last row is imported - but without returning an error or indication of skipped bytes. I don't even get an "unexpected EOF" message.

    I'd appreciate any pointers with either trapping the error, or forcing import of that last row, even if the EOF is encountered before the row terminator.

    Thanks!

  • I didn't receive a response to this question and found no solution through Google. Just for reader information, I worked around the problem by implementing a CLR component function 'ASCLastChar(@FilePath, @ASCVal)' that examines the file and returns TRUE if the ASCII value of the last character of the file matches @ASCVal, False otherwise. The function jumps straight to the last character of the file without reading it all. This means my OPENROWSET BULK script can run quickly and safely without the risk of missing the last row in any of the files. I just reject files in which the last row terminator is missing. Hope this helps anyone encountering the same problem.

  • I have the exactly the same problem as you and are glad that you were able to.find a solution within the SQL environmnent. Could you pelase share the code for the CLR function?

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply