Bulk Insert format file but input file not always fixed

Question

Bulk Insert format file but input file not always fixed

jdixon-586320

SSC Enthusiast

Points: 184
More actions
June 20, 2013 at 12:18 pm

#295592

I am trying to do a bulk insert using a non-xml file format. When my input file exactly matches my format file the bulk insert works fine. I have data that is older and was not written in the exact same manner as the newer input data. In the case of this older data, the last field may either not be there at all or be shorter than the allotted characters. This causes an unexpected end of file error. Here is an example of what I mean:
Format file:
8.0
3
1 sqlchar 0 5 "" 1 recid ""
2 sqlchar 0 5 "" 2 testfield ""
3 sqlchar 0 0 "/r/n" 0 skipfield ""
The data that works is
1 test1
2 test2
The data that does not work is
1 test
2 tes
another one that does not work (the first field is 5 characters with nothing for the second field)
1
2
Without creating a format file for each case, is there a way to have the bulk insert work with the variable length in the last field?

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1003873 More actions · Answer 1

jdixon-586320 (6/20/2013)
I am trying to do a bulk insert using a non-xml file format. When my input file exactly matches my format file the bulk insert works fine. I have data that is older and was not written in the exact same manner as the newer input data. In the case of this older data, the last field may either not be there at all or be shorter than the allotted characters. This causes an unexpected end of file error. Here is an example of what I mean:
Format file:
8.0
3
1 sqlchar 0 5 "" 1 recid ""
2 sqlchar 0 5 "" 2 testfield ""
3 sqlchar 0 0 "/r/n" 0 skipfield ""
The data that works is
1 test1
2 test2
The data that does not work is
1 test
2 tes
another one that does not work (the first field is 5 characters with nothing for the second field)
1
2
Without creating a format file for each case, is there a way to have the bulk insert work with the variable length in the last field?

Unfortunately, BCP was written to work with very well formed and consistent data formats. Even the "skip rows" part of BCP requires that the rows being skipped be properly formatted with the same number and types of delimiters. As you're finding out, it REALLY doesn't like ragged right formats especially when fixed field lengths are being used.

The way I get around such a thing is to import the lines/rows as blobs (BULK INSERT works very well here but so does BCP without a format file) and then use substring/CAST/CONVERT to do the parsing. Using your test data, something like the following will handle the ragged right fixed-field problem quite well.

SELECT RecID = CAST(SUBSTRING(d.String,1,5) AS INT)

, TestField = RTRIM(SUBSTRING(d.String,6,5))

FROM

(--==== This would be your staging table instead of test data

SELECT '1 test1' UNION ALL

SELECT '2 test2' UNION ALL

SELECT '3 test' UNION ALL

SELECT '4 tes' UNION ALL

SELECT '5' UNION ALL

SELECT '6'

)d(String)

;

Here's the output:

RecID TestField

----------- ---------

1 test1

2 test2

3 test

4 tes

5

6

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)