SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Problems importing CSV files using SSIS.


Problems importing CSV files using SSIS.

Author
Message
Orlando Colamatteo
Orlando Colamatteo
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14905 Visits: 14396
asso.al-hamawandi (4/18/2011)
I had the same problem in the SSIS 2008 data flow package . I had a CSV (comma delimited) source data with one of the columns data quoted with a double quote( " ). The import worked fine using the SQL 2000 DTS import as it recognized the Text Qualifier value as a Double Quote {"}, but this was not the case when I started to use the SSIS 2008. The Text Qualifier shows <None> on the Flat file connection manager object (Text Qualifier) property. I tried to change the <None> to what I have in the DTS 2000 Double Quote {"}, but it didn't fix the problem. I kept researching on this until I found the simple answer to fix this is to put only (") in the Text Qualifier Property and this worked fine and I got rid of the double quotes in the data and the data imported successfully as it was in the source even with the data which contained comma in between.

You find below the link where I found my answer from:

http://www.mssqltips.com/tip.asp?tip=1316

You're not serious right? Did you even read this thread? Try processing the example row in the original post using the method outlined in the article and you'll soon realize the underlying issue.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
namakakiweyho
namakakiweyho
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 108
I just spent a whole weekend on this problem. My CSV file contains text data contained within double quotes, eg "blah",1,1/10/2011,"blah" and numeric and date data. All fields are delimited by a comma as dictated by the CSV file format. If your field contains a comma but is enclosed by double quotes, then you must specify (") in the Text Qualifier box on the Flat Form Connection Manager Editor form....BUT, if you have already experienced problems with the file import, then you must delete the flat file connection and then create a new instance, otherwise no matter what you do, SSIS will not recognise the text qualifier. It works!!
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85615 Visits: 41082
John Kotuby (4/13/2011)
So much for the rant.


I'd post a rant of my own but you did a pretty good job of it. :-P

I'm trying to find it but there used to be a "standard" written a long time ago about true Comma Separate Value files including when and how text identifiers where used. It all boiled down to what we all know and what Excel can do. Things with embedded quotes or delimiters must be encapsulated in quotes and the embedded quotes must be doubled up.

It's a real shame they did away with the "Jet Drivers" when MS went to 64 bit. I believe you can recover some of the lost functionality by downloading the "Ace" drivers. I don't have a URL for the MS download for those... should take someone only a minute or two of Googling to find it.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Randy Doub
Randy Doub
Old Hand
Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)Old Hand (386 reputation)

Group: General Forum Members
Points: 386 Visits: 778
BUT, if you have already experienced problems with the file import, then you must delete the flat file connection and then create a new instance, otherwise no matter what you do, SSIS will not recognise the text qualifier. It works!!


Makes me wonder how many times I've furiously deleted and started over something I "know" should work... and then it does! Angry
jawaharrajan
jawaharrajan
Forum Newbie
Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 16
Thanks for pointing out the need to only use " rather than "" - saved me many hours of frustration
Robert Brown-419172
Robert Brown-419172
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 15
Thanks. This solved my problem!
himshamisra
himshamisra
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 1
namakakiweyho (10/22/2011)
I just spent a whole weekend on this problem. My CSV file contains text data contained within double quotes, eg "blah",1,1/10/2011,"blah" and numeric and date data. All fields are delimited by a comma as dictated by the CSV file format. If your field contains a comma but is enclosed by double quotes, then you must specify (") in the Text Qualifier box on the Flat Form Connection Manager Editor form....BUT, if you have already experienced problems with the file import, then you must delete the flat file connection and then create a new instance, otherwise no matter what you do, SSIS will not recognise the text qualifier. It works!!


Awesome dude thanks for posting this!
Bryan Lundberg
Bryan Lundberg
Grasshopper
Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)

Group: General Forum Members
Points: 10 Visits: 140
Okay, I know this is an old post, but I recently had the same issue and this was one of the first threads I found that described the issue. Now that my import is running, here is what I did.

The original issue on this thread being that SSIS has a hard time importing data that is sometimes text qualified. This seems to be a pretty common way for Excel or MS Access to export data.

In my case, I had a comma-delimited file with a field of numbers. Numbers less than 1000 were not text qualified. Numbers >= 1000 were written containing commas and text qualified with double quotes.

So a sample set of records looked like this:

Color,Date,Purchases
Red,"Sep 1, 2014",500
Blue,"Sep 2, 2014",750
Green,"Sep 3 2014","1,000"

If you tell SSIS that the Purchases field is text qualified, it looks for the qualifier in the first two rows and fails to parse the data correctly. If you tell it the data is not text qualified, it finds the "extra" comma in the third row, and fails to parse correctly.

The solution that got me most of the way there was posted here:
http://geekswithblogs.net/sathya/articles/how-to-import-and-export-csv-files-directly-in-ssis.aspx

Steps:
Create a new connection Manager
Select New Oledb Connection
Select the Provider as Microsoft Jet 4.0 Oledb Provider
After choosing that on the same window at the left you will be having two tabs to switch over. one would be All (Which mentions the properties of the connection) and other would be the connection info
Choose All. You would see a property called ExtendedProperties.
In that paste this without quotes : "text;HDR=Yes;FMT=Delimited" --I actually used CSVDelimited
There will be another property called Mode, which will mention 16 (readonly) and if you want to make that Read/Write change it to 19.
In the Server or file name text box under the connection tab, give the path of the csv file. Give it only till the folder name and not the csv file itself. (Ex: If your file is c:\sathya\contact.csv give it as c:\sathya)
And you can use this Oledb Connection in Oledb Source Task, Lookup task and Oledb Destination task also if you have changed the mode to 19 i.e. read/write.
While writing the query to fetch from csv, since you have given only up to the folder as the datasource you need to write query like "select * from contact.csv"


At this point, my import ran. However, my Purchases data still came in as NULL values where the source contained commas. Progress.

Then, based on this:
http://kb.tableau.com/articles/knowledgebase/jet-incorrect-data-type-issues

I went into the registry under HKLM\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text
and changed the string value for ImportMixedTypes to Text.

For good measure, I added the rows key set to 0 (scan all rows for type) as well.

Data now imports as a string (with the commas).

In my case, I used the DerivedColumn task to strip out commas on the way in so I can treat it as a number too.

Anyway, my 2 cents. Good luck.
Rob de Vos
Rob de Vos
Old Hand
Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)Old Hand (331 reputation)

Group: General Forum Members
Points: 331 Visits: 214
I have added a Powershell step that removes the extra " chars.
I use the | symbol as field seperator which is a regex special char, so the regex expression has a \ in front of the | char to escape it.
Replace both when you're character is not an special char for regex.

$ExportFileLocation = 'C:\MyCsvFilesLocation\Exports'
Get-ChildItem $ExportFileLocation -Filter *.csv | `
Foreach-Object{
#write-host $_.Fullname
(get-content $_.Fullname -ReadCount 0) -replace '(?<!\|)"(?!\||$)',''| set-content $_.Fullname
}

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85615 Visits: 41082
Rob de Vos (5/29/2015)
I have added a Powershell step that removes the extra " chars.
I use the | symbol as field seperator which is a regex special char, so the regex expression has a \ in front of the | char to escape it.
Replace both when you're character is not an special char for regex.

$ExportFileLocation = 'C:\MyCsvFilesLocation\Exports'
Get-ChildItem $ExportFileLocation -Filter *.csv | `
Foreach-Object{
#write-host $_.Fullname
(get-content $_.Fullname -ReadCount 0) -replace '(?<!\|)"(?!\||$)',''| set-content $_.Fullname
}


So what happens if the full name is supposed to have double quotes in it?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search