Expanding on my prior post... (Sorry this is so long. I know that a lot of people already know these details, but I think some people don't)
Programs that 'export' data into a delimited file need to deal with the fact that field-data itself might contain the field-delimiter character, and they need to do it in such a way that a downstream 'import' process can **unambiguously** parse out all fields in each record correctly.
For CSV (Comma-Separated Values) files, the delimiter is the comma character, and the way to deal with field-data that *contains* a comma is to surround the field-data with quotes [chr(34)]. The 'import' process then regards all commas found *within* quotes as part of the current field-data, and not as a delimiter that starts a new field.
But now you also have to deal with the fact that the field-data itself might *contain* one or more quotes [chr(34)]. You don't want the 'import' process to regard those embedded quotes as a signal to terminate the current field-data; rather, it should pull those quotes in as part of the field-data. The way this is done is for the 'export' process to replace *every* chr(34) found in each field with chr(34)chr(34) (i.e., two contiguous quotes), and for the 'import' process to regard pairs of consecutive chr(34) as *data* to pull into the field, not terminating the field-data until a solitary chr(34) is found.
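To make that concrete, here's a minimal sketch in Python of the export-side escaping (the function names are mine, just for illustration):

```python
# Minimal sketch of proper CSV escaping on the 'export' side.
# escape_field / export_record are illustrative names, not any standard API.

def escape_field(data: str) -> str:
    # Double every embedded quote [chr(34) -> chr(34)chr(34)],
    # then wrap the whole field in quotes.
    return '"' + data.replace('"', '""') + '"'

def export_record(fields: list[str]) -> str:
    # Join the escaped fields with the comma delimiter.
    return ",".join(escape_field(f) for f in fields)

# escape_field('He said "hi", then left')
# -> "He said ""hi"", then left"
```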
Properly designed CSV 'export' and 'import' processes do all of the above, and there is *never* any ambiguity and everything works. Perfectly. All the time.
While you might be using a properly designed 'import' program, you sometimes have to deal with files that were not created by a properly designed 'export' program. Some programmers write their own 'export' programs and fail to deal with these issues in the correct way. Their algorithm might do something unsophisticated like this: for each field, just put chr(34) on both ends of the source field-data (verbatim), and then just separate the fields with commas.
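In code, that unsophisticated approach amounts to something like this (again a Python sketch, deliberately broken, to illustrate the failure mode; the names are mine):

```python
# Sketch of the BROKEN 'export' approach described above:
# the field data goes between the quotes verbatim, with no escaping.

def naive_escape_field(data: str) -> str:
    return '"' + data + '"'   # embedded quotes are NOT doubled

def naive_export_record(fields: list[str]) -> str:
    return ",".join(naive_escape_field(f) for f in fields)

# naive_export_record(['He said "hi"', 'Austin'])
# -> "He said "hi"","Austin"    <- embedded quotes slip through unescaped
```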
So, a source-record that looks like this:
SOURCE: [Name: Kenneth Smith] [Address: 555 Main St] [City: Austin]
CSV: "Kenneth Smith","555 Main St", "Austin"
So far, so good, no ambiguities or problems there.
But if the source record looks like this (notice the embedded quotes in the Name and the embedded comma in the Address):
SOURCE: [Name: Kenneth "Kenny" Smith] [Address: 555 Main St, Apt 3C] [City: Austin]
A *properly* formatted CSV record should be:
GOOD CSV: "Kenneth ""Kenny"" Smith","555 Main St, Apt 3C","Austin"
And the 'import' program unambiguously 'knows' that each pair of quotes surrounding Kenny is part of the data and does not signal the end of the field, and it also knows that the solitary quote after Smith unambiguously signals the end of the field. The paired quotes [chr(34)chr(34)] are restored back to just [chr(34)] on the receiving end, and all is good.
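For the curious, here's a minimal sketch of an import-side parser that applies those rules (hand-rolled purely for illustration; a real program would use a battle-tested CSV library, and edge cases like trailing empty fields are omitted for brevity):

```python
def parse_csv_record(line: str) -> list[str]:
    # Minimal sketch of import-side parsing of one CSV record.
    fields, i, n = [], 0, len(line)
    while i < n:
        if line[i] == '"':                       # quoted field begins
            i += 1
            buf = []
            while i < n:
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')          # chr(34)chr(34) is data: keep one quote
                        i += 2
                    else:
                        i += 1                   # solitary chr(34): field ends here
                        break
                else:
                    buf.append(line[i])          # commas in here are data, not delimiters
                    i += 1
            fields.append("".join(buf))
            if i < n and line[i] == ",":
                i += 1                           # step past the delimiter
        else:                                    # unquoted field: read up to next comma
            j = line.find(",", i)
            if j == -1:
                fields.append(line[i:]); break
            fields.append(line[i:j]); i = j + 1
    return fields

# parse_csv_record('"Kenneth ""Kenny"" Smith","555 Main St, Apt 3C","Austin"')
# -> ['Kenneth "Kenny" Smith', '555 Main St, Apt 3C', 'Austin']
```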
But a poorly-written export process might produce this (field data inserted *verbatim* between the quotes):
BAD CSV: "Kenneth "Kenny" Smith","555 Main St, Apt 3C","Austin"
The import program sees the solitary quote just before Kenny, assumes that's the end of the field-data, and then finds that the next character is not the expected field delimiter (comma), so an error occurs. Well, you might say that the *import* program should simply regard any solitary quote that is *not-immediately-followed-by* a comma as part of the field-data and move on. YES, THAT WOULD WORK IN THIS EXAMPLE.
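That heuristic would look something like this sketch (names are mine; it assumes every field is quoted, which the naive exporter guarantees), and it does recover this particular record:

```python
def heuristic_parse(line: str) -> list[str]:
    # The proposed patch-up heuristic: inside a quoted field, a quote only
    # ends the field if a comma (or end of line) comes right after it;
    # any other quote is kept as field-data.
    fields, i, n = [], 0, len(line)
    while i < n:
        assert line[i] == '"'        # naive exporter quotes every field
        i += 1
        buf = []
        while i < n:
            if line[i] == '"' and (i + 1 == n or line[i + 1] == ","):
                i += 2               # step past the quote and the delimiter
                break
            buf.append(line[i])
            i += 1
        fields.append("".join(buf))
    return fields

# heuristic_parse('"Kenneth "Kenny" Smith","555 Main St, Apt 3C","Austin"')
# -> ['Kenneth "Kenny" Smith', '555 Main St, Apt 3C', 'Austin']   (works here)
```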
*BUT* what if the source field data *does* have a quote followed by a comma in it? Or worse yet, a quote-comma-quote in it?
SOURCE: [Name: Kenneth "," Smith] [Address: 555 Main St, Apt 3C] [City: Austin]
GOOD CSV: "Kenneth "","" Smith","555 Main St, Apt 3C","Austin"
BAD CSV: "Kenneth "," Smith","555 Main St, Apt 3C","Austin"
In this BAD CSV case, it is utterly impossible for a general-use CSV 'import' process (or any general-use pre-processor, for that matter) to reliably reconstruct the source record, because every place you see quote-comma-quote in this example is (to the importer) unambiguously a signal to end the current field and start a new one.
With the GOOD CSV, the import process can unambiguously handle everything perfectly. It sees the contiguous pairs of quotes as part of the field-data and not as field terminators, and those pairs [chr(34)chr(34)] get restored back to [chr(34)] on the receiving end. All is good.
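You can watch both behaviors with a few lines of Python and its standard csv module (which follows the doubled-quote convention by default):

```python
import csv

good = '"Kenneth "","" Smith","555 Main St, Apt 3C","Austin"'
bad  = '"Kenneth "," Smith","555 Main St, Apt 3C","Austin"'

print(next(csv.reader([good])))
# ['Kenneth "," Smith', '555 Main St, Apt 3C', 'Austin']   <- 3 fields, correct

print(next(csv.reader([bad])))
# ['Kenneth ', ' Smith', '555 Main St, Apt 3C', 'Austin']  <- 4 fields, silently wrong
```

Note that the BAD line doesn't even raise an error here; it just quietly parses into the wrong fields, which is arguably worse than failing outright.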