Removing noisi commas in CSV file (SSIS)

  • I am quite new on using SSIS dtsx on SQL server 2005. I have CSV files that have been cleansed removing extra commas (in text fields) opening the CSV file in excel and using a replace command. The problem is that I cannot do the same thing in SSIS as the CSV file cannot be opened with the Excel Connection Manager.

    Moreover, I also cannot use a transformation option as the error occurs directly in the Flat File Source.

    How Can I handle this problem?

    Thanks in advance for your help, Damien

  • Hi Raunak,

    First of All, thanks for your reply. I already tried with a Flat File Connection; but the error occurred directly on the Flat File Source.

    The solution I am working one now is to use a script task to open the file in excel and use a macro to remove the noisy commas. But using SSIS 2005 I can't find Microsoft.Office.Interop.Excel.dll to use Interop within VB.net ...

    Regards,

    Damien

  • Can you post the error message here? How are you configuring the Flat File Connection Manager and why does the component throws error? Can you post a sample data with first 10 rows?

    Raunak J

  • damien.gulluni (10/7/2013)


    Hi Raunak,

    First of All, thanks for your reply. I already tried with a Flat File Connection; but the error occurred directly on the Flat File Source.

    The solution I am working one now is to use a script task to open the file in excel and use a macro to remove the noisy commas. But using SSIS 2005 I can't find Microsoft.Office.Interop.Excel.dll to use Interop within VB.net ...

    Regards,

    Damien

    Is there a text qualifier defined?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes sure.

    You will find here all the component I use:

    - Flat File Connection Manager (format:delimited; text qualifier: "; header row delimiter: {CR}{LF};

    - Flat File Source;

    - Data Conversion Transformation (used to transform all the string to Unicode String)

    - Ole DB Destination Editor (used to insert the data in database)

    The error following error appears directly in the Flat File Source.

    Error: 0xC0202055 at DFT_LoadingExcelSheets, Flat File Source [1183]: The column delimiter for column "Billing End Date" was not found.

    Error: 0xC0202092 at DFT_LoadingExcelSheets, Flat File Source [1183]: An error occurred while processing file "\\cfsdata01\vmnc\RAID\Data Upload\Sources\Eboat\file.csv" on data row 1441.

    Error: 0xC0047038 at DFT_LoadingExcelSheets: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Flat File Connection" (1183) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.

    Error: 0xC0047021 at DFT_LoadingExcelSheets: SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited with error code 0xC0047038. There may be error messages posted before this with more information on why the thread has exited.

    Error: 0xC0047039 at DFT_LoadingExcelSheets: SSIS Error Code DTS_E_THREADCANCELLED. Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown. There may be error messages posted before this with more information on why the thread was cancelled.

    Error: 0xC0047021 at DFT_LoadingExcelSheets: SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0xC0047039. There may be error messages posted before this with more information on why the thread has exited.

    Information: 0x40043008 at DFT_LoadingExcelSheets, DTS.Pipeline: Post Execute phase is beginning.

    The row content is:

    "toto","tata","tata""toto"", OJSC","tata","tata","CB ""tata"", tat","tt","tata","tata","59595","0","tata","0","","tata","tata","1","1200","tata","1200","tata","20.50","N","","2010/05/26 00:00:00",""

    I changed the text values with toto tata due to some confidential information. But the problem is that some noisy comma are part of text.

    Hope this helps,

    Regards,

    Damien

  • It seems to me your text qualifiers are the problem, not the commas.

    Anyway, all of these issues can easily be avoided by choosing a decent delimiter.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • I fully agree when you have the control of the file produced .. The file comes from Business Object world and I have only choice between tab, semicolon, and comma which are all delimiter that can be used in the file I receive.

    But for me the main problem is that the excel engine can handle that case while SSIS cannot.

    Regards,

    Damien

  • As a good practice, you should use the Preview command while configuing the connection manager and tasks? What is the issue with Billing Date Column? It says missing column delimiter?

    Raunak J

  • Yes preview is cool, but I can only see the first 100 rows when the problem appears latter on the file.

    Regards,

    Damien

  • damien.gulluni (10/7/2013)


    I fully agree when you have the control of the file produced .. The file comes from Business Object world and I have only choice between tab, semicolon, and comma which are all delimiter that can be used in the file I receive.

    But for me the main problem is that the excel engine can handle that case while SSIS cannot.

    Regards,

    Damien

    It's not SSIS' fault that your csv is malformed.

    As I see it, the text qualifiers are not correctly handled. (as I see multiple text qualifiers in one column)

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Anyway this is not the concern of my request to say if SSIS is responsible or not ...

    What I am looking for, and thank you again for your time, is to find a solution with all the constraint I have.

    Actually the process is fully manual and I am trying to automate it with SSIS.

    In order to open the file with excel in SSIS I am trying to use a script task. But now the issue I am facing is that I cannot find Excel Interop component to be used in vb.net script (SSIS 2005 version).

    Regards,

    Damien

  • Can you see all the columns in the file along with the header names?

    Raunak J

  • Yes Raunak.

    Damien

  • damien.gulluni (10/7/2013)


    Anyway this is not the concern of my request to say if SSIS is responsible or not ...

    What I am looking for, and thank you again for your time, is to find a solution with all the constraint I have.

    Actually the process is fully manual and I am trying to automate it with SSIS.

    In order to open the file with excel in SSIS I am trying to use a script task. But now the issue I am facing is that I cannot find Excel Interop component to be used in vb.net script (SSIS 2005 version).

    Regards,

    Damien

    It's a .csv file, not an Excel file. You can use the StreamReader class to read the file.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply