Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Bulk Insert - suddenly getting issues with EOL chracters? Expand / Collapse
Author
Message
Posted Friday, September 28, 2012 10:02 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, March 1, 2013 5:49 AM
Points: 9, Visits: 69
Hi all

Apologies if this subject has been done to death - I'm going round in circles reading threads on here and in other forums, but nothing is really helping me get to the bottom of the problem.

OK, bit of background. I have an existing routine running on SQL2005 SP2. It uses dynamic SQL (please don't lecture!) to perform a bulk insert operation. It's been running fine for months, and in the production environment is still doing so.

I'm doing some dev work and wanting to extend this routine to perform an additional bulk insert, using a new external text file. The dev server is also 2005 SP2. But when I tried to execute the following code:

bulk insert <tablename>
from '<sourcefile>
with (datafiletype = 'char', fieldterminator = ',',
ROWTERMINATOR = '')

I get the following error:

Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 106 (<columnname>).

The source text file is, as far as I can tell, a perfectly standard, ANSI-coded text file (produced from an SAP-broadcast report, as it happens). I have opened it up in Notepad++ and checked the row delimiter and it's just LF.

I've been through all sorts of permutations, trying to change either the bulk insert or the text file, and am still getting problems. To try and get to the bottom of it, I've begun again with a simple little test. I created a text file using Notepad++ that looks like this:

a,b,c,d,e,f,g,h,i,j,1,2,3,4,5,6,7,8,9,0,1.1,1.2,10

I've deliberately avoided hitting Enter at the end of the line, so there are NO carriage returns or line feeds. I created a test table in SQL with the following columns:

col_a varchar(50)
.
.
col_j varchar(50)
col_1 int
.
.
col_10 int
col_11 decimal(5,2)
col_12 decimal(5,2)
col_13 int

When I run this command:

bulk insert test
from '<testfile>'
with (datafiletype = 'char', fieldterminator = ',',
rowterminator = '\r')

it works perfectly. However, as soon as I try to add a second line of data, it fails, with the same error as before. If I just hit Enter at the end of the first row (which adds CR/LF), WITHOUT typing anything on the next row, the bulk load also works!!

It's as though anything I type AFTER an end-of-line is being counted as part of the first row, i.e. the EOL characters are being ignored. I am going mad here, trying to get to the bottom of it, and I just can't work out what is going wrong. I've read so much about row terminators, file formats, etc that I've confused myself.

Could anyone help explain what's going on here? I shall be eternally grateful....

Julian

Edit: Should have said, the server collation is Latin1_General_CI_AS
Post #1365986
Posted Tuesday, October 2, 2012 8:06 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Yesterday @ 1:49 AM
Points: 3,033, Visits: 3,233
I have had something like this before - but the data came from an Excel created csv - so might not apply to your scenario but might be worth a look.
Check the line before the one giving the error for any odd characters double or single quotes where you don't expect them to be.
In instances where I have seen this it was due to a comma in a text field which caused the field to be double quoted on its export from Excel. This in turn caused the problem in bulk insert.


-------------------------------
Posting Data Etiquette - Jeff Moden
Smart way to ask a question

There are naive questions, tedious questions, ill-phrased questions, questions put after inadequate self-criticism. But every question is a cry to understand (the world). There is no such thing as a dumb question. ― Carl Sagan
I would never join a club that would allow me as a member - Groucho Marx
Post #1367062
Posted Tuesday, October 2, 2012 10:07 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, March 1, 2013 5:49 AM
Points: 9, Visits: 69
Stuart

Many thanks for your reply. I have been doing investigation into this, and I'm getting some inconsistent findings.

If I open the file in Excel or Notepad++ then it looks fine, each data row is represented as I would expect to see it and there is apparently a LF character at the end of each row. If, however, I open the file in normal Notepad, I get a very different result. If I turn off word wrap, I get what looks like a continuous string of data without any segregation of data rows. All that I get is a wrap at the 1024-character limit that Notepad seems to inflict on any long string. I presume this continuous-string behaviour is because Notepad doesn't recognse whatever character is being used as an end-of-line character?

If I turn on word wrap, two issues manifest themselves. Firstly, some data fields containing text are split across two lines, even though Notepad hasn't reached it's natural 1024-character wrapping limit. An example would be:

Field1,Field2,Field 3 which is a character description that should be
displayed on one row but is split across two,Field4,Field5,..............

The second issue is that not every new data row actually starts at the beginning of a line in the file - some of them just continue exactly after the end of the previous data row, something like this:

Here,is,the,first,2000,character,long,data,row
and,this,is,a,word,wrap
Here,is,the,second,2000,character,long,data,row
and,this,is,a,word,wrap
Here,is,the,third,2000,character,long,data,row
and,this,is,a,word,wrapAnd,this,is,actually,the,fourth,data,row,but
it,continues,where,the,last,one,left,off

They're both inconsistent issues as well - neither occur every nth row, and they don't always occur together. There can be any number of "good" data rows and then one issue, followed by two good rows and two bad ones.

I'm just putting this down to some weird formatting coming out of the source system rom where the file is being generated. I have found out that it's not coming directly from SAP as I had thought, there's another piece of software generating this output, so I presume it's doing something odd when it generates the file. I'm going to try and find another way to create the file in the first place, to avoid the issue. I'd be interested though to hear any thoughts you might have anyway.

Thanks for your help
Post #1367152
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse