June 5, 2025 at 8:51 am
Hi,
Does anyone know how to handle a raw data file where some rows contain carriage return characters?
The problem is that certain rows, which should be treated as a single line, are being split into multiple lines due to embedded carriage returns in the file. The file was exported using isql from Sybase, not via the native BCP format. As a result, SSIS treats the split lines as separate rows instead of a single continuous row. What’s the recommended way to handle this in SSIS?
Thanks in advance
June 5, 2025 at 11:02 am
if the file is a proper CSV file with the each column value enclosed in double quotes (and inside double quotes proper escaped) then you just need to define the file in SSIS as delimited and double quoted.
if not a proper CSV file then you don't have any way to deal with it
June 5, 2025 at 3:27 pm
You could also import into a staging table with a VARCHAR(MAX) column. If I recall correctly an identity column will preserve the order of the rows from file to table. Then you could use tsql to combine rows. Not ideal and not an SSIS solution but for your consideration.
June 7, 2025 at 12:43 pm
the source file came from .txt format only, the instruction is no manipulation will be done from the source file , if somehow need to manipulate the problem with the rows from the source file is should be under SSIS method way. i can give sample
file.txt(tab delimited)
column1 column2 column3
value1 value1 long text,
continuation of long text <-- went to another line , this one should be treated in SSIS as part of the single row.
does this make sense? Thank you advance.
June 9, 2025 at 2:01 pm
If I had to do this, the first thing I'd try to figure out is how to detect the issue, regardless of the final implementation. How can you possibly know if a CR is a legitimate newline or the continuation from previous line? Given your sample value, that's going to be difficult. Can the provider of the data fix the datafile? Maybe add delimiters/quotes or something?
If you can reliably identify where this occurs, then you could "preprocess" the source file into a proper delimited CSV file and then import it. Could all be done in SSIS, sure, but the "preprocess" might take a script task or some difficult SSIS finagling. You're concerned about "manipulation", but this is not "manipulation", it's "reformatting for import". No change to the data, but rather just formatting it properly so SSIS can then import it.
But yeah, regardless of what tool you use to import this, you're going to have to figure out how to identify the occurrences so that you can handle them.
June 9, 2025 at 2:43 pm
I'm going to repeat what I said above.
using SSIS (or any other tool that deals with correctly built CSV files) having a cr, lf or field delimiter within the contents of a column is NOT an issue.
for this particular case if the field quote is set, and field delimiter is set, and IF the file is indeed a correctly built CSV file, then it will load the content correctly and assume that that CR is part of the content of the field, not a row delimiter.
June 9, 2025 at 4:37 pm
Not sure why you're repeating what you said. Looks to me like the example OP posted is definitely NOT a correctly built CSV file. You're certainly correct that this would be a slam dunk if the file was a properly formatted CSV file, but it's not. If OP cannot get their provider to fix this (and they really should), what are they supposed to do?
June 9, 2025 at 6:14 pm
well. taking in consideration the OP mentioned that this is a tab delimited file, and is complaining about CR on middle of a text, then what I said applies - as only reason for SSIS not to treat the file correctly is if one of the fields ALSO has a tab on it AND IF the file does not have the proper Quotes around field contents.
also the OP did not post a file, only gave an example of how it looks like to him.
Viewing 8 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply