Following up on what AJ wrote, you could do the this:
1. Create the following UDF ( dbo.fOccurs() )
2. Run the script that follows the UDF, adapted for your situation. The script will load the data file, and then list any lines that do not contain the correct number of delimiters. Of course, this method won't give you the line number that is causing the problem.
Also, asking SQL Server to load five million rows will probably result in blocks of data being inserted in parallel. The physical order of the table data may not match the order of the data in the source file.
---
Another option is to download a trial version of Multi-Edit at
http://www.multieditsoftware.com/demo.php
I've used Multi-Edit for years. It can load files of any size, and among all of its other uses, is great for examining files. If you go this route, you can use it to break of the .csv file into several smaller files that may make identifying
------------------------------------------------------------------
-- UDF
------------------------------------------------------------------
CREATE FUNCTION dbo.fOccurs
(
@searchIn varchar(8000), -- The string to be searched.
@searchFor varchar(8000) -- The string to be counted.
)
RETURNS int
BEGIN
DECLARE @len_diff int
-- Returns number of times a string occurs in another string.
-- The first length calculation is the original size minus the length
-- after all occurrences of the search string are removed.
-- We divide that result by the length of the search string to get
-- the number of occurrences.
IF Len(@searchFor) = 0
SET @len_diff = 0
ELSE
BEGIN
SET @len_diff = Len(@searchIn) - Len(Replace(@searchIn, @searchFor, ''))
SET @len_diff = @len_diff / Len(@searchFor)
END
RETURN @len_diff
END
GO
------------------------------------------------------------------
-- Script
------------------------------------------------------------------
DROP TABLE ImportData
GO
CREATE TABLE ImportData ( line varchar(8000) )
BULK INSERT ImportData
FROM 'c:\YourCSVfile.txt'
WITH ( CODEPAGE = 'RAW'
, DATAFILETYPE = 'char'
, FIELDTERMINATOR = '|'
, ROWTERMINATOR = '\n'
, FIRSTROW = 1
)
SELECT line
FROM ImportData
WHERE dbo.fOccurs(line, '|') <> 99 -- or however many you expect