How do I find the record where "column delimiter not found"
Posted Monday, March 21, 2005 5:45 AM
Forum Newbie

I have a pipe-delimited ("|") flat file with 100 columns. I'm getting the "column delimiter not found" message when I run the DTS package. The file has over 5 million records in it, and I don't have enough virtual memory on the server to open it in TextPad. Can anyone suggest how to track down this problem? I tried this from the command line:

    findstr /V ".*|.*|.*,.*|.*" c:\YourCSVfile.txt

extended out to 99 delimiters, but I got a message that my search string was too long.

Also, is there a way I can tell whether I have Service Packs 1-3 loaded? I think that could help solve the problem. I'm new to this job, so please forgive me for being 'green'!



Post #168911
Posted Monday, March 21, 2005 6:01 AM


SSCommitted


1.  The problem could be the 5 million records: you may be running into a buffer problem.

Have you tried importing the file directly into a "staging" table that has one column and then parsing from there?

2.  Run SELECT @@VERSION. You should get an 8.00.xx build number, and you can look up online which service packs that build corresponds to.

 





Good Hunting!

AJ Ahrens


webmaster@kritter.net
Post #168913
Posted Monday, March 21, 2005 6:40 AM
SSC-Addicted


What I would try: in the Options tab of the properties of the task doing the import, set the "max error count" to a certain number, e.g. 10.

Then, in the properties of the DTS package, on the Logging tab, enter a name for an error file.
When a row fails, that row will be written to the error file.
To check whether any rows come through at all, set the max error count to a number higher than the number of rows in the file. I don't know the upper limit for this parameter, though.

 

good luck,

 

fred

Post #168923
Posted Tuesday, March 22, 2005 8:44 AM
SSC Eights!


Following up on what AJ wrote, you could do this:

1. Create the following UDF ( dbo.fOccurs() )
2. Run the script that follows the UDF, adapted for your situation.  The script will load the data file, and then list any lines that do not contain the correct number of delimiters. Of course, this method won't give you the line number that is causing the problem.
 
Also, asking SQL Server to load five million rows will probably result in blocks of data being inserted in parallel. The physical order of the table data may not match the order of the data in the source file.
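If line numbers matter, an alternative to the in-database check is to stream the file outside SQL Server and count delimiters per line. A minimal sketch (shown in Python purely for illustration; the path and the 99-delimiter count mirror the original question, and the encoding is an assumption):

```python
# Stream a large pipe-delimited file and report the line numbers of
# any lines whose delimiter count differs from the expected 99
# (i.e. 100 fields). The file is never loaded into memory whole.
EXPECTED_DELIMITERS = 99

def find_bad_lines(path, expected=EXPECTED_DELIMITERS, delimiter="|"):
    bad = []
    with open(path, encoding="latin-1") as f:  # encoding is an assumption
        for lineno, line in enumerate(f, start=1):
            if line.rstrip("\r\n").count(delimiter) != expected:
                bad.append(lineno)
    return bad

# Hypothetical usage against the file from the original post:
# for n in find_bad_lines(r"c:\YourCSVfile.txt")[:10]:
#     print(n)
```

Unlike the table-based query, this reports the exact line number in the source file, and it works even when the file is too large to open in an editor.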

---

Another option is to download a trial version of Multi-Edit at

    http://www.multieditsoftware.com/demo.php

I've used Multi-Edit for years. It can load files of any size and, among all of its other uses, is great for examining files. If you go this route, you can use it to break the .csv file into several smaller files, which may make identifying the problem rows easier.

------------------------------------------------------------------
-- UDF
------------------------------------------------------------------

CREATE FUNCTION dbo.fOccurs
(
  @searchIn  varchar(8000),  -- The string to be searched.
  @searchFor varchar(8000)   -- The string to be counted.
)
RETURNS int
AS
BEGIN
  DECLARE @len_diff int

  -- Returns number of times a string occurs in another string.

  -- The first length calculation is the original size minus the length
  -- after all occurrences of the search string are removed.
  -- We divide that result by the length of the search string to get
  -- the number of occurrences.

  IF Len(@searchFor) = 0
    SET @len_diff = 0
  ELSE
  BEGIN
    SET @len_diff = Len(@searchIn) - Len(Replace(@searchIn, @searchFor, ''))
    SET @len_diff = @len_diff / Len(@searchFor)
  END

  RETURN @len_diff
END

GO

------------------------------------------------------------------
-- Script
------------------------------------------------------------------

IF OBJECT_ID('dbo.ImportData') IS NOT NULL
  DROP TABLE ImportData
GO

CREATE TABLE ImportData ( line varchar(8000) )

BULK INSERT ImportData
     FROM 'c:\YourCSVfile.txt'
     WITH ( CODEPAGE = 'RAW'
          , DATAFILETYPE = 'char'
          , ROWTERMINATOR = '\n'  -- one file line per table row
          , FIRSTROW = 1
          )
-- No FIELDTERMINATOR here: the default ('\t') is assumed not to occur
-- in the data, so each whole line, pipes and all, lands in the single
-- varchar column.

SELECT line
  FROM ImportData
 WHERE dbo.fOccurs(line, '|') <> 99 -- or however many you expect




Post #169210
Posted Tuesday, March 22, 2005 4:07 PM


SSC Eights!

I have had occasion to check large data files for common errors (for example, data where the user has included an EOL character in the middle of a field). I have found that some basic preprocessing through a text editor can be very helpful in these situations.

My text editor of choice is TextPad, rather than Multi-Edit. However, any text editor that supports find-and-replace based on regular expressions should be able to pull off the same trick, or something similar.

What I did in my case was build a regular expression to mark lines with exactly the right number of fields (actually, exactly the right number of delimiters). My editor includes an option to "bookmark" lines, and to flip all bookmarks (so that what were the bookmarked lines are now not bookmarked, and what were the non-bookmarked lines are bookmarked). I toggle the bookmarks, and look at all lines that are now bookmarked (which should be all lines that had too many or too few delimiters).
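For reference, that "exactly the right number of delimiters" pattern can be written as a single anchored regular expression. A sketch (shown in Python's re syntax, assuming the 100-field / 99-pipe layout from the original question; TextPad's own regex dialect differs slightly):

```python
import re

# Matches a line with exactly 99 pipe delimiters (100 fields):
# 99 repetitions of "pipe-free text then a pipe", followed by one
# final pipe-free field, anchored at both ends of the line.
GOOD_LINE = re.compile(r"^([^|]*\|){99}[^|]*$")

sample = "|".join(str(i) for i in range(100))  # 100 fields, 99 pipes
print(bool(GOOD_LINE.match(sample)))   # prints True
print(bool(GOOD_LINE.match("a|b|c")))  # prints False (only 2 pipes)
```

Lines that fail to match are exactly the ones with too many or too few delimiters, which is what the bookmark-and-flip step isolates.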

This is an excellent bet in situations where you are expected to get as much of the data added to the system successfully as you can, or where errors that cannot be trivially resolved by you (once you find them) are rare. In a situation where your organization is content with rejecting bad data records and shipping them back to the sender, creating an exception file in DTS might be the best way to go.



R David Francis
Post #169371
Posted Wednesday, March 23, 2005 1:09 PM
Forum Newbie

These are all very helpful. I sincerely appreciate you all sharing your experience with me. I'm currently loading all the records I can get, and then I'll go back and investigate using the suggestions provided. I'd love to browse this file with TextPad, but it won't open - I suppose I'm blowing out the virtual memory.

Thanks again,
Dave



Post #169664
Posted Wednesday, March 23, 2005 1:25 PM


SSC Eights!

Oops - sorry, I missed the TextPad note in your original post.

Unix systems have the 'head' and 'tail' commands that would let you grab chunks of the file to investigate. If you're comfortable with an appropriate tool (I'd do it in Perl, even on Windows; odds are someone has created a tool to do this anyway), you can break the file into chunks yourself.
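On Windows, that head/tail-style chunking can be approximated with a few lines of script. A sketch (Python rather than Perl, purely for illustration; the chunk size, part-file naming, and encoding are all assumptions):

```python
# Split a large text file into numbered chunk files of CHUNK lines
# each, so every piece is small enough to open in an ordinary editor.
CHUNK = 500_000

def split_file(path, chunk=CHUNK):
    part, out, written = 0, None, 0
    with open(path, encoding="latin-1") as f:  # encoding is an assumption
        for line in f:
            if out is None or written >= chunk:
                if out:
                    out.close()
                part += 1
                # Hypothetical naming scheme: big.txt.part001, .part002, ...
                out = open(f"{path}.part{part:03d}", "w", encoding="latin-1")
                written = 0
            out.write(line)
            written += 1
    if out:
        out.close()
    return part  # number of chunk files created
```

Because it streams line by line, it never needs enough memory to hold the whole file, which sidesteps the virtual-memory problem from the original post.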

Heck, you could always slurp the whole thing in w/ a DTS package, and write it back out a chunk at a time.

Just a few thoughts for future reference (or for lurkers interested in the subject), as you seem to be cranking along at this point....



R David Francis
Post #169671
Posted Thursday, August 20, 2009 1:58 AM
Forum Newbie

I had an 11 million row CSV file that had a small number of rows where the delimiters were wrongly set for some reason. I got the error:
[Tx File [1]] Error: The column delimiter for column [X] was not found.

I couldn't get SSIS to send these rows elsewhere during the load because (I'm guessing) the missing delimiter meant that it couldn't skip on to the next row.

In the end I imported the file into an old instance of SQL 2000, which is a lot more forgiving, then transferred it across and began investigating and cleaning from there.

regards, Matt
Post #774054
Posted Friday, April 23, 2010 11:13 AM
Valued Member

I had this exact error, and the cause was an errant double quote, even though the text qualifier was a double quote. Hope this helps anyone else looking for potential causes.
Post #909688