RE: Bulk insert with format file - handling quotation mark text qualifiers in header row

SSC Guru

Points: 1003863

October 9, 2013 at 9:21 pm

caspersql (10/9/2013)
Okay after hammering away for a while I figured this out. Since my header row contains text qualifiers, I had assumed that I needed to include those in my XML format file 'Field ID' and 'Column Source' attributes. However that wasn't required. All I needed to do was include the text qualifiers in the 'terminator' attribute.
This format file works for my sample text file:
<?xml version="1.0"?><BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <RECORD> <FIELD ID="Column1Name" xsi:type="CharTerm" TERMINATOR='\",\"' MAX_LENGTH="4000"/> <FIELD ID="Column2Name" xsi:type="CharTerm" TERMINATOR='\",\"' MAX_LENGTH="4000"/> <FIELD ID="Column3Name" xsi:type="CharTerm" TERMINATOR='\r' MAX_LENGTH="4000"/> </RECORD> <ROW> <COLUMN SOURCE="Column1Name" NAME="Column1Name" xsi:type="SQLVARYCHAR"/> <COLUMN SOURCE="Column2Name" NAME="Column2Name" xsi:type="SQLVARYCHAR"/> <COLUMN SOURCE="Column3Name" NAME="Column3Name" xsi:type="SQLVARYCHAR"/> </ROW></BCPFORMAT>
This still leaves the text qualifier as the first character in column1 and the last character of the last column in my imported data. However I can just use T-SQL string manipulation after the import to remove those per this thread: http://www.sqlservercentral.com/Forums/FindPost87417.aspx

I hate the XML format files. Requires two different line types for each column, is tag bloated, generally looks like crap, and etc. 🙂

Try using a standard format file. Much easier to grasp IMHO. I haven't tested this particular one but should auto-magically get rid of the first and last quote...

10.0

4

1 SQLCHAR 0 1 "" 0 FirstQuote ""

2 SQLCHAR 0 8000 "\",\"" 1 Column1Name ""

3 SQLCHAR 0 8000 "\",\"" 2 Column2Name ""

4 SQLCHAR 0 8000 "\"\r\ n" 3 Column3Name ""

NOTE: REMOVE THE SPACE FROM BETWEEN THE \ and the n in row 4. This forum "eats" that particular combination of characters.

It also makes the column widths wider than the columns in the table so that it auto-magically checks for oversize data. The import will fail if the data is wider than the columns in the table. You could sequester the failed rows by using the error file feature of either BCP or BULK INSERT.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)