January 10, 2010 at 12:00 am
I need to create a package that imports multiple flat files (18) with different file structures into a single table, seven nights a week.
I can think of some possible approaches, but suggestions on the best approach, along with alternatives and their pros and cons, would be greatly appreciated. 🙂
January 10, 2010 at 6:29 am
Would you mind sharing some more details regarding the file structures, and whether those files are located in the same folder or follow a "patterned" file name?
Also, would you share the approaches you have in mind?
January 10, 2010 at 12:53 pm
The files contain the same data, but the columns are in different positions, and I got the impression that some manipulation of individual columns may be required to standardize them (possibly via a Script task).
All of the files are in the same directory.
The first 5 characters of the file name identify the source and the last 8 characters identify the date, in the format YYYYMMDD.
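For illustration, here is a quick T-SQL sketch of pulling those two pieces out of a file name; the sample name is made up, and it assumes a fixed '.txt' extension:
DECLARE @FileName nvarchar(260)
SET @FileName = 'SRC0120090107.txt' -- hypothetical name: source + YYYYMMDD + extension
SELECT LEFT(@FileName, 5) AS SourceCode, -- first 5 characters identify the source
SUBSTRING(@FileName, LEN(@FileName) - 11, 8) AS FileDate -- the YYYYMMDD before '.txt'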
I'm not sure, but there may be a requirement to load each file in date order, and if for some reason the package failed, I could end up with a file from each source for more than one day.
I do not have the requirements; it was just a quick "Can you do something like this?" :hehe: It's nice when someone asks if you can do something and how long it would take but does not give you the requirements, but there is nothing I can do about it.
I'm just trying to get as prepared as I can.
I could load each file into a separate staging table and then consolidate them.
I could use a Script task to manipulate the format and insert each record into the table.
There is probably a better way of doing this that I'm not aware of.
Perhaps I should not have posted this question, but I will not be able to get the requirements until sometime next week, and they will want it yesterday.
So any information or partial ideas would be appreciated.
Thank you.
January 10, 2010 at 1:06 pm
I found this link, but I'm not sure whether I could use some of these approaches to satisfy part(s) of the (vague) requirements that I was given.
I asked for more details, but the nontechnical Business Owner just asked whether I could do it and how long it would take.
:laugh:
January 10, 2010 at 1:56 pm
Another approach I could take would be to load each file with the same format into an array, move the file into a temp directory, use the File System Object to open the original file and write it out to another file, and save that to a directory from which a Foreach Loop Container eventually imports each file.
January 10, 2010 at 8:59 pm
My apologies...
This is an example of the table that I want to populate from multiple files with different formats:
CREATE TABLE [dbo].[MemberFormat](
[FirstName] [nvarchar](11) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[LastName] [nvarchar](15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[EMailAddress] [nvarchar](55) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
This is a sample input File:
LastName20090107.txt
LastName|FirstName|EMailAddress
Miller|Andy|Andy.Miller@adventure-works.com
Miller|Andy|Andy.Miller@adventure-works.com
Miller|Andy|Andy.Miller@adventure-works.com
This is another sample input File with a different format:
FirstName|SurName|EMailAddress
John|Mahmood|John.Mahmood@adventure-works.com
John|Mahmood|John.Mahmood@adventure-works.com
John|Mahmood|John.Mahmood@adventure-works.com
There may be more than one LastName and FirstName file, for different dates.
I hope this helps.
Did I leave out any pertinent information?
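To illustrate the staging-table idea from my first post, here is a rough T-SQL sketch based on the two sample formats above; the staging table names are placeholders:
-- Hypothetical staging tables, one per file layout, each loaded by its own import step.
CREATE TABLE [dbo].[Stage_LastNameFirst](
[LastName] [nvarchar](15) NULL,
[FirstName] [nvarchar](11) NULL,
[EMailAddress] [nvarchar](55) NULL
)
CREATE TABLE [dbo].[Stage_FirstNameFirst](
[FirstName] [nvarchar](11) NULL,
[LastName] [nvarchar](15) NULL, -- "SurName" in the source file
[EMailAddress] [nvarchar](55) NULL
)
-- Consolidate both layouts into the target table, listing columns by name
-- so the differing physical order does not matter.
INSERT INTO [dbo].[MemberFormat] ([FirstName], [LastName], [EMailAddress])
SELECT [FirstName], [LastName], [EMailAddress] FROM [dbo].[Stage_LastNameFirst]
UNION ALL
SELECT [FirstName], [LastName], [EMailAddress] FROM [dbo].[Stage_FirstNameFirst]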
January 11, 2010 at 1:53 am
So all of the source files have the same number of columns and could be accommodated by string datatypes?
That is, you could read in all the data as col1, col2, col3.
If you treat line 1 as data (not a heading), the script component could do the dynamic mapping for you (it would need some logic to analyse line 1 and work out the correct mapping to apply for the file).
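The same header-driven mapping could also be done in T-SQL instead of a script component. A rough sketch, assuming each file's rows, header line included, are loaded into a hypothetical generic staging table:
-- Generic three-column staging table; every file is loaded into it as-is.
CREATE TABLE [dbo].[Stage_Raw](
[Col1] [nvarchar](55) NULL,
[Col2] [nvarchar](55) NULL,
[Col3] [nvarchar](55) NULL
)
-- The header row reveals which layout the current file uses.
IF EXISTS (SELECT 1 FROM [dbo].[Stage_Raw]
WHERE [Col1] = 'FirstName' AND [Col3] = 'EMailAddress')
-- FirstName|SurName|EMailAddress layout
INSERT INTO [dbo].[MemberFormat] ([FirstName], [LastName], [EMailAddress])
SELECT [Col1], [Col2], [Col3]
FROM [dbo].[Stage_Raw]
WHERE [Col3] <> 'EMailAddress' -- skip the header row
ELSE
-- LastName|FirstName|EMailAddress layout: swap the first two columns
INSERT INTO [dbo].[MemberFormat] ([FirstName], [LastName], [EMailAddress])
SELECT [Col2], [Col1], [Col3]
FROM [dbo].[Stage_Raw]
WHERE [Col3] <> 'EMailAddress'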
January 11, 2010 at 9:25 am
Sir,
I can't confirm that because the BO is out of town.
I have always been able to set standards for naming conventions and file formats, but unfortunately I have no control over that at this point.
If I can get to the point where I can filter the files that have the same number of columns with the same data, just with the columns in a different order, that would be a big help.
Dumb question, but can I use a Conditional Split (or an appropriate task) to branch off to a different mapping when the columns are in a different order, identifying the file format from the file's naming convention and storing that in a variable?
There are some files that require manipulation (derived columns), but to keep it simple I thought I could filter those records and process them separately.
Am I totally out to lunch on my thought processes?
Thank you for your help!
January 11, 2010 at 1:11 pm
Welsh Corgi (1/11/2010)
That is not a dumb question.
I am imagining a Foreach loop which will process all of the files.
An initial task within the Foreach loop would be a Script task to evaluate what type of file you are dealing with. It would set a variable accordingly ('Type1', 'Type2', etc.).
After that, have multiple dataflows which accommodate the different file types.
Connect the Script task to the data flows using precedence constraints (evaluation operation: Expression and Constraint). Use the expression part of each constraint to add in your "If variable = 'Type2'" logic.
Phil
January 13, 2010 at 12:02 am
Taking into consideration that the Report Server is currently on SQL Server 2000, the Production Servers are using 2005 and 2008, and there may be a funding issue, perhaps using BULK INSERT in T-SQL would be a better approach, since all I have to work with is evaluation copies of 2005 and 2008 at this time.
It appears that it would involve less coding than SSIS if I'm unsuccessful in getting the trading partners that provide the data to supply consistent file formats and a naming convention to simplify the coding.
The following is an example:
http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file
Does this sound like a practical approach given the circumstances?
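Something along these lines, where the file path is a placeholder and the staging table is the generic one sketched earlier in the thread:
-- Load one pipe-delimited file into the generic staging table. Leaving the
-- header row in place lets the header-driven mapping above pick the column order.
BULK INSERT [dbo].[Stage_Raw]
FROM 'C:\Import\LastName20090107.txt'
WITH (FIELDTERMINATOR = '|', ROWTERMINATOR = '\n')
-- BULK INSERT will not take a variable for the file path, so looping over all
-- 18 files means building each statement dynamically:
DECLARE @File nvarchar(260), @Sql nvarchar(4000)
SET @File = 'C:\Import\LastName20090107.txt' -- would come from a file list
SET @Sql = N'BULK INSERT [dbo].[Stage_Raw] FROM ''' + @File
+ N''' WITH (FIELDTERMINATOR = ''|'', ROWTERMINATOR = ''\n'')'
EXEC (@Sql)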
January 13, 2010 at 1:42 am
Yes it does. Give it a try and see how you get on.