Manual tasks in SSIS design

  • Hi,

I am getting CSV files with hundreds of columns, many of which have irregular characters in the column names, such as |, " ", -, and /. I created a table by manually importing one of the files, and the import renamed those columns nicely, removing the characters or replacing some of them with underscores. But when I try to build a data flow to load the file into that existing table, I have to map all of those hundreds of columns manually, because the names no longer match exactly. If I create a new table instead, the junk symbols end up in the column names. Is there a way to tell the SSIS designer to map the columns based on their order?
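For what it's worth, the wizard-style renaming can be reproduced outside SSIS by rewriting the header row before the import, so the file's column names match the cleaned names in the existing table. This is only a sketch in Python: the character list and collapsing rules in `clean_name` are guesses at what the import wizard did, so adjust them until the output matches your table exactly.

```python
import csv
import re

def clean_name(col):
    # Replace the junk characters (| " - / and spaces) with underscores,
    # then collapse runs of underscores and trim the ends - roughly the
    # renaming the import wizard applied. Tweak the class to suit.
    name = re.sub(r'[|"\-/ ]', "_", col.strip())
    return re.sub(r"_+", "_", name).strip("_")

def rewrite_header(src, dst):
    """Copy a CSV, sanitizing only the first (header) row, so the file's
    column names match the cleaned names in the existing table."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader, writer = csv.reader(fin), csv.writer(fout)
        writer.writerow([clean_name(c) for c in next(reader)])
        writer.writerows(reader)
```

With the header rewritten to match, the designer's map-by-name behaviour effectively becomes map-by-order.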

Also, my table has nvarchar columns, but the connection manager treats the source files as varchar, so I have to change every column in the connection manager to Unicode string. Is there a way to do this for all columns at once?

Thanks.

  • Unfortunately, the short answer is 'no and no', to the best of my knowledge.

    Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!
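If you do go the find/replace route, it is worth scripting it so the backup step is never forgotten. A minimal sketch in Python; the default `old`/`new` strings below are placeholders, not necessarily the exact text your version of SSIS writes, so open the .dtsx in a text editor first and copy the real DT_STR/DT_WSTR markup:

```python
import shutil
from pathlib import Path

def convert_dtsx_types(path, old='DataType="str"', new='DataType="wstr"'):
    """Back up the package, then replace every occurrence of the source
    data-type text. The old/new strings are placeholders: inspect your
    .dtsx first and use the exact markup your SSIS version produces."""
    pkg = Path(path)
    shutil.copy2(pkg, pkg.with_suffix(".dtsx.bak"))  # keep a backup first
    text = pkg.read_text(encoding="utf-8")
    count = text.count(old)                          # columns to be changed
    pkg.write_text(text.replace(old, new), encoding="utf-8")
    return count
```

Checking the returned count against the number of columns you expected to change is a cheap sanity test before opening the package in the designer.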

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Phil Parkin wrote:

    Unfortunately, the short answer is 'no and no', to the best of my knowledge.

    Regarding your second point, the 'pure' SSIS way of handling this is to send all of the columns through a Data Conversion component. Alternatively, you can in theory edit the DTSX file in a text editor such as Notepad++ and do a FIND/REPLACE to modify the source data types quickly. Make sure you have a backup copy of the package before doing this!

This is what I ended up doing (opening the package in Notepad and doing a find and replace). A bit risky, so I wanted to see whether there is a different, more legitimate way to do this.

I did not quite understand your suggestion about the Data Conversion component. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row of the CSV file, which contains the column names, to bring them in line with the table's column names?

  • Jeffrey Williams wrote:

    You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

I think manually creating the table and entering data into it would take even more time. The lengths of the columns are not the problem for me; it is varchar vs. nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.

  • Roust_m wrote:

I did not quite understand your suggestion about the Data Conversion component. I am getting CSV files with hundreds of columns. Are you suggesting that I somehow update the first row of the CSV file, which contains the column names, to bring them in line with the table's column names?

    No, I was referring to this: https://docs.microsoft.com/en-us/sql/integration-services/data-flow/transformations/data-conversion-transformation?view=sql-server-ver15

    I don't think it will particularly speed things up for you in this case, but I mentioned it in passing just in case it's of interest.


  • Regarding your question about column names, have you considered creating a text file which contains (only) your desired column names as headings (which match the names in your target table) and then appending the source file to this file before importing?

    This would be worthwhile only if the source files have consistent formats (in terms of column order and number of columns).
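That pre-processing step could be scripted along these lines (a sketch; the file names are hypothetical, and it assumes the source file's own header row should be skipped so it is not imported as data):

```python
from pathlib import Path

def prepend_header(header_file, source_file, out_file):
    """Write the file SSIS will actually import: your clean header line
    (matching the target table's column names), followed by the source
    file's data rows with its own header row skipped."""
    header = Path(header_file).read_text().rstrip("\n")
    with open(source_file) as src, open(out_file, "w") as out:
        out.write(header + "\n")
        next(src)  # skip the source's original (junk) header row
        out.writelines(src)
```

If you would rather keep the original header and skip it in SSIS instead, drop the `next(src)` line and set "Header rows to skip" accordingly in the flat file connection manager.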


  • Roust_m wrote:

    Jeffrey Williams wrote:

    You might want to review this article: https://www.sqlservercentral.com/articles/auto-assign-column-data-types-for-flat-file-source-import

It outlines a method of using the destination table to define a destination, then 'swapping' that destination for a source while retaining the connection's metadata. I would be interested to see whether this works for you.

I think manually creating the table and entering data into it would take even more time. The lengths of the columns are not the problem for me; it is varchar vs. nvarchar. My table uses nvarchar, but the import tool treats the CSV data as varchar, so I had to open the package file in Notepad and do a find and replace.

    That is not what that article shows - the basics are simple.

    1. Create a table with the expected format of the file to be imported
    2. Use that table definition to define your file connection
3. Use the table as a source and create a destination using the file connection.  This creates the definition based on the table's definition
    4. Now you can use that same file connection as a source

    The other way around this kind of issue is to use a staging table - where the columns are defined as varchar.  Then a simple truncate/insert procedure that moves the data from the staging table into the final table - where SQL Server will then implicitly convert the data to your nvarchar columns.
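The batch such a procedure might run can be sketched as follows (Python string-building here just to show the shape of the T-SQL; the table and column names are made up, and a full reload of the final table is assumed):

```python
def staging_move_sql(staging, final, columns):
    """Build the truncate/insert batch for the staging-table pattern.
    All names here are hypothetical examples. The INSERT...SELECT is
    where SQL Server implicitly converts the staging table's varchar
    columns to the final table's nvarchar columns."""
    col_list = ", ".join(columns)
    return (
        f"TRUNCATE TABLE {final};\n"           # full reload assumed
        f"INSERT INTO {final} ({col_list})\n"
        f"SELECT {col_list} FROM {staging};"
    )

print(staging_move_sql("dbo.StgCustomers", "dbo.Customers", ["Name", "City"]))
```

In practice you would wrap that batch in a stored procedure and call it from an Execute SQL Task after the data flow loads the staging table.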

    With all that said - if the data coming into the system from these files is varchar, then why do you need them defined as nvarchar?

