Manual tasks in SSIS design

  • Hi,

I am getting CSV files with hundreds of columns, many of which have irregular characters in the column names, such as |, " ", -, and /. I created a table by manually importing one of the files, and the import renamed those columns nicely, removing the characters or replacing some of them with underscores. But when I try to build a data flow to load the file into that existing table, I have to map all of those hundreds of columns manually, because the names no longer match exactly. If I create a new table instead, the junk symbols end up in the column names. Is there a way to tell the SSIS designer to map the columns based on their order?
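For what it's worth, the wizard-style renaming can be reproduced outside SSIS by rewriting the header row before the import, so the file's column names match the cleaned names in the existing table. This is only a sketch in Python: the character list and collapsing rules in `clean_name` are guesses at what the import wizard did, so adjust them until the output matches your table exactly.

```python
import csv
import re

def clean_name(col):
    # Replace the junk characters (| " - / and spaces) with underscores,
    # then collapse runs of underscores and trim the ends - roughly the
    # renaming the import wizard applied. Tweak the class to suit.
    name = re.sub(r'[|"\-/ ]', "_", col.strip())
    return re.sub(r"_+", "_", name).strip("_")

def rewrite_header(src, dst):
    """Copy a CSV, sanitizing only the first (header) row, so the file's
    column names match the cleaned names in the existing table."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader, writer = csv.reader(fin), csv.writer(fout)
        writer.writerow([clean_name(c) for c in next(reader)])
        writer.writerows(reader)
```

With the header rewritten to match, the designer's map-by-name behaviour effectively becomes map-by-order.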

Also, my table has nvarchar columns, but the connection manager treats the source files as varchar, so I have to change every column in the connection manager to Unicode string. Is there a way to do this for all columns at once?

Thanks.

  • Unfortunately, the short answer is 'no and no', to the best of my knowledge.

    Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!
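If you do go the find/replace route, it is worth scripting it so the backup step is never forgotten. A minimal sketch in Python; the default `old`/`new` strings below are placeholders, not necessarily the exact text your version of SSIS writes, so open the .dtsx in a text editor first and copy the real DT_STR/DT_WSTR markup:

```python
import shutil
from pathlib import Path

def convert_dtsx_types(path, old='DataType="str"', new='DataType="wstr"'):
    """Back up the package, then replace every occurrence of the source
    data-type text. The old/new strings are placeholders: inspect your
    .dtsx first and use the exact markup your SSIS version produces."""
    pkg = Path(path)
    shutil.copy2(pkg, pkg.with_suffix(".dtsx.bak"))  # keep a backup first
    text = pkg.read_text(encoding="utf-8")
    count = text.count(old)                          # columns to be changed
    pkg.write_text(text.replace(old, new), encoding="utf-8")
    return count
```

Checking the returned count against the number of columns you expected to change is a cheap sanity test before opening the package in the designer.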

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Phil Parkin wrote:

    Unfortunately, the short answer is 'no and no', to the best of my knowledge.

    Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!

This is what I ended up doing (opening the package in Notepad and doing a find and replace). A bit risky, so I wanted to see whether there is a different, more legitimate way to do this.

I did not quite understand your suggestion about the Data Conversion component. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row of the CSV file, which contains the column names, to bring them in line with the table's column names?

  • Jeffrey Williams wrote:

    You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

I think manually creating the table and entering data into it would take even more time. The lengths of the columns are not the problem for me; it is varchar vs. nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.

  • Roust_m wrote:

I did not quite understand your suggestion about the Data Conversion component. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row of the CSV file, which contains the column names, to bring them in line with the table's column names?

    No, I was referring to this: https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/data-conversion-transformation?view=sql-server-ver15

    I don't think it will particularly speed things up for you in this case, but I mentioned it in passing just in case it's of interest.


  • Regarding your question about column names, have you considered creating a text file which contains (only) your desired column names as headings (which match the names in your target table) and then appending the source file to this file before importing?

    This would be worthwhile only if the source files have consistent formats (in terms of column order and number of columns).
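That pre-processing step could be scripted along these lines (a sketch; the file names are hypothetical, and it assumes the source file's own header row should be skipped so it is not imported as data):

```python
from pathlib import Path

def prepend_header(header_file, source_file, out_file):
    """Write the file SSIS will actually import: your clean header line
    (matching the target table's column names), followed by the source
    file's data rows with its own header row skipped."""
    header = Path(header_file).read_text().rstrip("\n")
    with open(source_file) as src, open(out_file, "w") as out:
        out.write(header + "\n")
        next(src)  # skip the source's original (junk) header row
        out.writelines(src)
```

If you would rather keep the original header and skip it in SSIS instead, drop the `next(src)` line and set "Header rows to skip" accordingly in the flat file connection manager.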


  • Roust_m wrote:

    Jeffrey Williams wrote:

    You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

I think manually creating the table and entering data into it would take even more time. The lengths of the columns are not the problem for me; it is varchar vs. nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.

    That is not what that article shows - the basics are simple.

    1. Create a table with the expected format of the file to be imported
    2. Use that table definition to define your file connection
3. Use the table as a source and create a destination using the file connection.  This creates the definition based on the table's definition
    4. Now you can use that same file connection as a source

    The other way around this kind of issue is to use a staging table - where the columns are defined as varchar.  Then a simple truncate/insert procedure that moves the data from the staging table into the final table - where SQL Server will then implicitly convert the data to your nvarchar columns.
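The batch such a procedure might run can be sketched as follows (Python string-building here just to show the shape of the T-SQL; the table and column names are made up, and a full reload of the final table is assumed):

```python
def staging_move_sql(staging, final, columns):
    """Build the truncate/insert batch for the staging-table pattern.
    All names here are hypothetical examples. The INSERT...SELECT is
    where SQL Server implicitly converts the staging table's varchar
    columns to the final table's nvarchar columns."""
    col_list = ", ".join(columns)
    return (
        f"TRUNCATE TABLE {final};\n"           # full reload assumed
        f"INSERT INTO {final} ({col_list})\n"
        f"SELECT {col_list} FROM {staging};"
    )

print(staging_move_sql("dbo.StgCustomers", "dbo.Customers", ["Name", "City"]))
```

In practice you would wrap that batch in a stored procedure and call it from an Execute SQL Task after the data flow loads the staging table.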

    With all that said - if the data coming into the system from these files is varchar, then why do you need them defined as nvarchar?

