Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

SSIS and dealing with embedded double quotes Expand / Collapse
Author
Message
Posted Wednesday, March 9, 2011 3:37 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Thursday, December 18, 2014 12:52 PM
Points: 13,636, Visits: 11,509
Phil Parkin (3/9/2011)
That's a message I always try to get across. The fact that the files are usually called 'Comma Separated' is not helping the cause one bit and I think we have an uphill struggle.

Good work with the testing by the way. Not got much work to do today?


Nope. I'm actually between projects for the moment, so I'm enhancing my SSIS skills by performing such little tests.




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1075415
Posted Wednesday, March 9, 2011 3:43 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 2:15 AM
Points: 5,317, Visits: 12,354
Impressive commitment! When I am 'between projects' I tend to work on improving my guitar skills


Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.

It is better to keep your mouth shut and appear stupid than to open it and remove all doubt. (Mark Twain)
Post #1075417
Posted Wednesday, March 9, 2011 7:01 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, December 10, 2014 12:03 PM
Points: 103, Visits: 488
Well a. i'm a new employee and the file we get are text qualified with "" and comma delimited. so a row looks like this
"some text","some text2","some name "BookTitle" to buy ", "end"


when the package sees a " it thinks that its the end of the field, and complains that it can't find the , after the "

I talked my supervisor about saying it isn't text qualified, but that would mean the "" would show up in our rows.

I guess based on all the responses we can ask for a different delimiter or for them not to include the "" in the title names, but i need some solution to load these things because a change is probably not going to happen super fast.

Right now all i can do is open the file and manually remove the lines or fix them.
Post #1075520
Posted Wednesday, March 9, 2011 7:09 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 2:15 AM
Points: 5,317, Visits: 12,354
natarshia.peck (3/9/2011)
Well a. i'm a new employee and the file we get are text qualified with "" and comma delimited. so a row looks like this
"some text","some text2","some name "BookTitle" to buy ", "end"


when the package sees a " it thinks that its the end of the field, and complains that it can't find the , after the "

I talked my supervisor about saying it isn't text qualified, but that would mean the "" would show up in our rows.

I guess based on all the responses we can ask for a different delimiter or for them not to include the "" in the title names, but i need some solution to load these things because a change is probably not going to happen super fast.

Right now all i can do is open the file and manually remove the lines or fix them.


Your idea about importing as not text qualified has some merit. You could bring the data in 'as is' and then run some UPDATE commands afterwards that would tidy most of it up (by removing the first and last character of the offending fields, where field length > 2, perhaps).



Help us to help you. For better, quicker and more-focused answers to your questions, consider following the advice in this link.

When you ask a question (and please do ask a question: "My T-SQL does not work" just doesn't cut it), please provide enough information for us to understand its context.

It is better to keep your mouth shut and appear stupid than to open it and remove all doubt. (Mark Twain)
Post #1075526
Posted Wednesday, March 9, 2011 7:37 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, December 10, 2014 12:03 PM
Points: 103, Visits: 488
Thanks. I have to move on this fast because I inherited this package and I have other packages to develop/test. So I need to report up what the deal is with the pack failing and some 'resolution'.

The thing that perplexes me most, and I just had another dev look at this, is why when the parse fails, the redirect on error doesn't just skip the row.

I guess like the post says, there is nothing in ssis that will do this.
Post #1075550
Posted Wednesday, March 9, 2011 9:00 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Tuesday, November 12, 2013 8:44 AM
Points: 1,073, Visits: 6,334
One way to speed up the preprocessing of your CSV files is to use Log Parser to convert them to TSV files. A TSV should be easier for SSIS to make use of; Log Parser 2.2

For example, if input.csv contained this:
Author, Title,Quote
"Salinger","The Catcher in the Rye","""People always think something's all true."""
"Trumbo","Johnny Got His Gun", """S.O.S. Help me."""

This would change the delimiters to tabs and remove the double quotes.
LOGPARSER -i:CSV -o:TSV "SELECT Author, Title,Quote INTO output.tsv FROM input.csv"

Post #1075630
Posted Wednesday, March 9, 2011 9:26 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, December 10, 2014 12:03 PM
Points: 103, Visits: 488
thanks i am going to test that today
Post #1075656
Posted Thursday, March 10, 2011 8:46 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, September 29, 2014 10:25 PM
Points: 254, Visits: 755
Building on the other suggestions and comments earlier in the thread, and if you're able to make file change requests, I'd recommend removing the text qualifier (so you don't get the possibility of the qualifier showing up in the middle of the text, bombing out the import like you've seen), but instead of just changing the field delimiter from a comma to some other single character (pipe, etc.), consider a double-character field delimiter.

I implemented this as a fix for some vendor data a few years back after your same issue came up with them, switched to no text qualifiers and a double-character field delimiter (we're using a tilda-carat: ~^ as a delimiter), and haven't had any issues since. It'll make the files look ugly, but it works.

Removing text qualifiers and only switching the field delimiter to another character still leaves open the chance that bad data with that new delimiter character can show up and mess with your import (I never believe the response, "That character will never be sent in our data." ... it'll happen eventually). A double-character delimiter reduces the chance that a data field will ever contain that combo to almost nothing.

Just a thought ... but bad data is still bad data if it's not sent to you in your established/spec format.
Post #1076309
Posted Thursday, March 10, 2011 12:14 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Thursday, December 18, 2014 12:52 PM
Points: 13,636, Visits: 11,509
dg227 (3/10/2011)

Just a thought ... but bad data is still bad data if it's not sent to you in your established/spec format.


And you know what they say:
crap in is crap out




How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at LessThanDot.

MCSA SQL Server 2012 - MCSE Business Intelligence
Post #1076504
Posted Thursday, March 10, 2011 12:17 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, September 29, 2014 10:25 PM
Points: 254, Visits: 755
Koen Verbeeck (3/10/2011)
dg227 (3/10/2011)

Just a thought ... but bad data is still bad data if it's not sent to you in your established/spec format.


And you know what they say:
crap in is crap out


So true ...
Post #1076507
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse