SSIS losing rows on text file import

  • Hi. I only have a small amount of experience with SSIS and I have something very odd happening. I have a very simple package that imports data from a tab-delimited text file into a target table. My package just has a flat file source and an OLE DB destination.

    I had been getting truncation errors on my last column which is char(1), so I increased the column to varchar(1000) just for giggles. This stopped the errors from occurring. I suspect it was some end of row funny character or something. This column is a char(1) field but I can live with this workaround.

    So now that my package runs without errors I've realized that a bunch of rows are missing from my target table. My text file has 633,059 rows but my target table only has 480,212 rows, so I am missing 152,847 rows. WTF ?!?! SSIS runs this package without any errors or warnings, so POOF my data just disappears.

    I do know from querying the target table that the missing rows are poor quality data. The missing rows only have 2 fields populated out of the 87 fields in the flat file.

    I have googled this but cannot find any information on what might be causing it. I do know that another guy in my office was able to get past this by importing the file into Access first, then into SQL Server. For whatever reason that worked. But I am developing an automated process and pulling the data into Access first is not an option for me.

    Has anybody else come across this type of funkiness or have any ideas about what might be happening? I'm pretty stumped, so any suggestions would be great.

    Thanks

    Bill, Charlotte NC

  • It sounds like you might have inconsistent end-of-line characters, which means SSIS can't tell where one line ends and the next begins. I think the char(1) error was a symptom of that. If you can pick out a line that has this problem, look at it to see exactly which characters are being used. I think you will find that they are not consistent.

    CEWII
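If you'd rather not eyeball the raw bytes, a quick script can tally which row terminators a file actually uses. This is just a sketch (not part of any SSIS tooling); the sample bytes are made up for illustration:

```python
# Sketch: count which line-ending styles (\r\n, \n, \r) appear in a file's
# raw bytes, to spot inconsistent row terminators before SSIS sees them.
from collections import Counter

def count_line_endings(data: bytes) -> Counter:
    """Count occurrences of each line-ending style in raw bytes."""
    endings = Counter()
    endings["CRLF"] = data.count(b"\r\n")
    # Lone \n and \r are the totals minus those inside \r\n pairs.
    endings["LF"] = data.count(b"\n") - endings["CRLF"]
    endings["CR"] = data.count(b"\r") - endings["CRLF"]
    return endings

# Example with deliberately mixed endings:
sample = b"row1\r\nrow2\nrow3\r\nrow4\r"
print(count_line_endings(sample))
```

Read the file with `open(path, "rb")` so nothing normalizes the endings first; if more than one counter is nonzero, the terminators are mixed.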

  • Ok, I think I figured this out. Was a pain in the neck but here goes.

    The last large chunk of the file has only 2 of the 87 fields populated, so each row has many tab characters in series. I opened the file in Notepad++ to display hidden characters, then I counted the tabs, and the records that were failing were missing two tabs between the last populated field and the end of the row.

    So the records in question were not formatted properly. Once I removed these rows the import worked correctly. It was not obvious at all because of all the non-visible whitespace. I *knew* there was something amiss at the end of these rows but it just took a while to figure out exactly what.

    Luckily I do not need these rows and can safely remove them from the flat file, but we definitely have to notify our data supplier that their data formatting is not consistent.

  • Thank you for posting this! I converted an Excel file into a Text file and ran into the same issue. I didn't realize you can leave one "extra space" in a cell in Excel which caused my issue. I found the row with the extra hidden character in the column that threw off the tabbing in that row. Of course I Googled and looked everywhere and found nothing on why my data was "disappearing". Sorry you had to go through that many rows. Fortunately mine was a tiny 480 rows. I was able to figure it out in my file thanks to your post. 🙂

