SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Loading a Series of Flat Files


Loading a Series of Flat Files

Author
Message
Andy Jones, DBA
Andy Jones, DBA
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1337 Visits: 544
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/ajones/load.asp

.
philcart
philcart
SSCertifiable
SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)

Group: General Forum Members
Points: 5325 Visits: 1441
Good article. Always like articles with code that I can play with

One comment I would make though is that I'd eliminate the GOTO's by using positive logic. eg:

IF @rtn = 0
begin
do something
if @rtn = 0
begin
do something more
end
else
begin
flag error
end
end
else
begin
flag error
end


Also, is it my browser settings or your formatting that double spaces all the lines?


Edited by - phillcart on 10/31/2002 02:39:37 AM

Hope this helps
Phill Carter
--------------------
Colt 45 - the original point and click interface

Australian SQL Server User Groups - My profile
Phills Philosophies
Murrumbeena Cricket Club
johnhind
johnhind
Forum Newbie
Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 1
Good article - I agree that it's good to see the code as well. Perhaps it ought to be downloadable, and have a version with your error checking as this is very important.

It would be interesting to write a VB app, using ADO to load the data, to see which would be fastest.

Personally I would use VB, but I'm a little biased !

thanks
John, UK



Andy Jones, DBA
Andy Jones, DBA
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1337 Visits: 544
Thanks for your comments - some answers: -

I used goto labels because my original solution required the logging of a specific error message and then a general one e.g. 'Load of <file> failed at <Datatime>' so I used the goto label to keep all logging in the same place.
I think the double space formatting is my fault!
With regard to the speed of load, I have found the data pump task to be extremely efficient when loading files with a large number of records, although I havn't got any comparison figures with other methods.


Regards,
Andy Jones

.
jloesch
jloesch
SSC-Addicted
SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)SSC-Addicted (440 reputation)

Group: General Forum Members
Points: 440 Visits: 7
My biggest project of the last 2 years has been a process that automatically logs and imports files that are uploaded to our server from stores. This process handles both transaction and customer files from 4 different POS systems, each with their own file format. The files also come compressed/archived in .ZIP, .Tar, or .tar.Z formats. (I skipped a lot of details, it gets even more 'interesting'.)

For my solution, I used 3 DTS packages. The first one has an ActiveX script that uses the file system object to loop through the files in the directory, logs any new ones, and performs any extraction that is necessary. This happens every 5 minutes.

The second package, which is executed every 30 minutes, goes through the log (a table) searching for files that haven't imported, and calls the third package for each file that needs to be imported, passing the filename via a global variable.

That's the basics of the process. I personally like the DTS method better because of flexibility, error handling capability, and logging. Whenever an error occurs, I get an email telling me where it occurred, and if it happens in an ActiveX script, the email includes the line of code where the error occurred. I also prefer to keep my DTS packages in SQL Server, its simpler to access them, and they're always backed up.

One final advantage to the way I do it, everything I use is documented. May not be well-documented, but at least there's something there. You want to be careful about using undocumented stuff, especially since it may not always be there.

Anyway, there's my 2-cents worth. If anyone is interested in more information, I'll be happy to provide. My process works; that's gotta be worth something.

James C Loesch


James C Loesch
philcart
philcart
SSCertifiable
SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)SSCertifiable (5.3K reputation)

Group: General Forum Members
Points: 5325 Visits: 1441
quote:

Thanks for your comments - some answers: -

I used goto labels because my original solution required the logging of a specific error message and then a general one e.g. 'Load of <file> failed at <Datatime>' so I used the goto label to keep all logging in the same place.
I think the double space formatting is my fault!
With regard to the speed of load, I have found the data pump task to be extremely efficient when loading files with a large number of records, although I havn't got any comparison figures with other methods.


Regards,
Andy Jones



You can still keep the error logging in one place. In my example where I've said flag error this could set an error flag and/or build an error message. Then at the end of the proc you'd just check to see if an error occurred and then log it.

I had a bad experience with goto's as a child and have been mortally afraid of them ever since


Thanks
Phill Carter

Edited by - phillcart on 10/31/2002 4:11:12 PM

Hope this helps
Phill Carter
--------------------
Colt 45 - the original point and click interface

Australian SQL Server User Groups - My profile
Phills Philosophies
Murrumbeena Cricket Club
Deuce
Deuce
SSC Veteran
SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)

Group: General Forum Members
Points: 294 Visits: 14
Interesting solution. Thanks for bringing those undocumented xp's to light. Smile

Personally, I'd do a couple things different. It's probably the programmer in me, but I'd replace sp_GetFiles, and sp_FolderFileExist with functions. Then instead of working with ##Temp tables, your curser would be based on

select * from dbo.GetFiles(@myfolder)

I'd also use BCP instead of DTS to load the files. Again, personal preference.

(Johnhind, I'd put money on dts bulk load or bcp over any vb solution.)

Thanks again,

John



Andy Jones, DBA
Andy Jones, DBA
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1337 Visits: 544
Good point about using a UDF. With regard to using BCP, my original requirement used files which didn't have a consistent number of fields on each line e.g.: -
1,2
1,2,3
1
etc...

I couldn't get BCP / Bulk insert to work with these types of files, so that was the reason behind using DTS.

Thanks for the reply.

Regards,
Andy Jones

.
Deuce
Deuce
SSC Veteran
SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)SSC Veteran (294 reputation)

Group: General Forum Members
Points: 294 Visits: 14
Andy,

Must be fixed width fields, otherwise you could spedify the delimeter in the BCP command. In the case of fixed width files, you could bcp into a generic table:

create table Import
( id int identity primary key,
data varchar(4000)
)

Then parse the data field as needed in sql. Just an option.

John



Scorpion_66
Scorpion_66
Ten Centuries
Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)Ten Centuries (1.1K reputation)

Group: General Forum Members
Points: 1051 Visits: 26
I have been doing this for years with stored procs, BCP, and a few control tables. Multiple Import sources, multiple import structures within the same file, going to multiple tables determined by structure of data line.
I was very interested in seeing it done using DTS, excellent article....



Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search