Parameterised Bulk Insert

  • I have 40+ .csv files that I need to load into SQL Server.  I'm using the Bulk Insert task for this.  I did not need to use a Format File, because the format of each .csv file and its corresponding table were identical.  I worked out a way of parameterising the Bulk Insert task so that I could create one package that does the Bulk Insert, and then other DTS packages can call it as required.

    This worked really well up until the point that it was decided the .csv files would use " characters to surround text field values.  As soon as a comma appeared inside a field value, my Bulk Insert DTS thought there were extra fields and the load crashed.

    I am now trying to get the Bulk Insert task to recognise " characters as surrounding text fields (by generating a Format File, etc.), but I still need to be able to parameterise the Format File in the same way that I currently parameterise the source file and table names.

    And that's the bit that I cannot get to work.  Despite using a Dynamic Properties task to set the Format File at runtime, the Bulk Insert will not recognise " characters and so crashes if commas appear in field values.

    I'd really appreciate some input on how to solve this problem.  Maybe there's a completely different way to parameterise a Bulk Insert?  I did try to build a SQL string that did a Bulk Insert, but I could not find a parameter for specifying " characters as surrounding text values.
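
    For what it's worth, a minimal sketch of that "build a SQL string" approach from an ActiveX Script task, assuming hypothetical global variable names (gv_TableName, gv_SourceFile, gv_FormatFile) and made-up connection details.  BULK INSERT has no text-qualifier parameter, so the " characters have to be described in the Format File instead (for example by declaring "," as the terminator between quoted fields):

    [code]
    ' ActiveX Script task sketch.  Assumptions: global variables gv_TableName,
    ' gv_SourceFile and gv_FormatFile are set by the calling package, and the
    ' connection string below is replaced with a real one.
    ' The Format File is what tells BULK INSERT about the " text qualifiers,
    ' e.g. by using  ","  as the terminator between quoted fields.
    Function Main()
        Dim oConn, sSQL

        sSQL = "BULK INSERT " & DTSGlobalVariables("gv_TableName").Value & _
               " FROM '" & DTSGlobalVariables("gv_SourceFile").Value & "'" & _
               " WITH (FORMATFILE = '" & DTSGlobalVariables("gv_FormatFile").Value & "')"

        Set oConn = CreateObject("ADODB.Connection")
        oConn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
                   "Initial Catalog=MyDB;Integrated Security=SSPI"
        oConn.Execute sSQL
        oConn.Close

        Main = DTSTaskExecResult_Success
    End Function
    [/code]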

  • Not sure why you are hanging on to Bulk Insert?  I don't think it meets your needs.  Why not trash it and use a Text File Source instead?  Select the appropriate settings - in this case the text qualifier is ".  Run this through a transformation into your table.

    Do you have a directory with 40 files going into one table?  If so, you might consider using an ActiveX script with the FileSystemObject to loop through all the files in the directory until done (a rough sketch follows after this post).

    Like this: http://www.sqldts.com/?246

    ZenDada
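
    A rough sketch of the FileSystemObject idea from the article above, assuming a hard-coded folder path and a made-up global variable name; later steps would then work through the list one file at a time:

    [code]
    ' ActiveX Script task sketch: walk a folder with the FileSystemObject and
    ' collect the .csv paths into a delimited global variable.  The folder path
    ' and the gv_FileList variable name are assumptions, not part of the
    ' original package.
    Function Main()
        Dim oFSO, oFile, sList

        Set oFSO = CreateObject("Scripting.FileSystemObject")
        sList = ""

        For Each oFile In oFSO.GetFolder("C:\Imports").Files
            If LCase(Right(oFile.Name, 4)) = ".csv" Then
                sList = sList & oFile.Path & ";"
            End If
        Next

        DTSGlobalVariables("gv_FileList").Value = sList
        Main = DTSTaskExecResult_Success
    End Function
    [/code]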

  • Bulk Insert is significantly quicker than using a Text File Source.  However, that may not be so relevant with these new data loads, as the amount of data being loaded is much smaller.

    There are 40+ files going into 40+ tables.  That bit is no problem, though; it's just the parameterised data load that is.  Anyway, I'm going to consider using a Text File Source now.

    Thanks for the tips 🙂

  • Well it ain't fast if it don't work, eh? 

    Keep in mind that you can run multiple transformations at the same time - that will save you some time right there.  The default for the package is four, but you can change it on the General tab of the package properties dialog box (a small sketch of setting it from code follows after this post).

    The bcp parameters shouldn't be an issue.  You just need to go through the transformation dialog box and check the corresponding properties/constants/constraints.  If Copy Column is not adequate, you can also use an ActiveX script for any transformation.  If Copy Column IS fine, then deselect all and reselect all columns to combine all the transformations into a single COM object.  That is also a time saver.  What parameters are you using, and why?

    If you get a chance, pick up Sams' Microsoft SQL Server 2000 DTS and read Chapter 11.  It has a great discussion of exactly this issue.

    ZenDada
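
    For reference, the package's parallelism setting can also be reached through the DTS object model rather than the dialog box; a minimal sketch (the value of 8 is just an example, and the dialog box is the more usual route):

    [code]
    ' Sketch: MaxConcurrentSteps controls how many steps DTS will run in
    ' parallel - the same setting as the General tab of the package
    ' properties dialog (default 4).
    Function Main()
        Dim oPkg
        Set oPkg = DTSGlobalVariables.Parent   ' the current package
        oPkg.MaxConcurrentSteps = 8
        Main = DTSTaskExecResult_Success
    End Function
    [/code]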

  • Bulk Insert will work, but perhaps not in the parameterised way that I am trying for.

    The parameterised parts are: destination table name, source file name, and (potentially) format file name.

  • Okay then, you don't have a problem.  The source file is handled in its connection object.  The destination database is handled in the connection, and the table in the transformation.  Instead of the format file, you will now be mapping your transformations from the source to the destination in the transformation dialog box.

    If all the mappings are different, the simplest thing to do is to create separate transformations for each import.  If you are doing the same mapping over and over, you can re-use the transformation if you like, but this takes some coding and it will run in sequence, not in parallel.

    Do it like this: create an ActiveX script as the first step.  Right-click inside your package and select Disconnected Edit to view the connections, tasks and steps in your package, and their properties.  This will give you the syntax you need to refer to your objects.  Assign the DataSource of your CSV connection to its path, and the DestinationObjectName of the Data Pump task to the table.  Run the ActiveX into your transformation.  Run the transformation into another ActiveX that sets the first ActiveX back to waiting and voila, you have a loop - say, for example, if you are going to loop through all the files in a folder with the FSO.  Don't forget to handle a graceful exit from your loop.  (A rough sketch follows after this post.)

    Actually, you can re-map dynamically too.  But I can't imagine why you would want to do that instead of just creating multiple transformations.

    Have fun!

    ZenDada
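
    A rough sketch of that pattern, with made-up object names ("CSV Source", "DTSTask_DTSDataPumpTask_1", "DTSStep_DTSActiveScriptTask_1") standing in for whatever Disconnected Edit actually shows in the package:

    [code]
    ' First ActiveX step (sketch): point the source connection at the next file
    ' and the Data Pump task at the right table.  All object and variable names
    ' here are examples only - use Disconnected Edit to find the real ones.
    Function Main()
        Dim oPkg
        Set oPkg = DTSGlobalVariables.Parent

        oPkg.Connections("CSV Source").DataSource = _
            DTSGlobalVariables("gv_SourceFile").Value
        oPkg.Tasks("DTSTask_DTSDataPumpTask_1").CustomTask.DestinationObjectName = _
            DTSGlobalVariables("gv_TableName").Value

        Main = DTSTaskExecResult_Success
    End Function

    ' Final ActiveX step (sketch): if there are files left, set the first step
    ' back to waiting so the whole sequence runs again - that is the loop.
    ' Leaving the status alone when nothing remains gives the graceful exit.
    Function Main()
        Dim oPkg
        Set oPkg = DTSGlobalVariables.Parent

        If DTSGlobalVariables("gv_MoreFiles").Value = True Then
            oPkg.Steps("DTSStep_DTSActiveScriptTask_1").ExecutionStatus = _
                DTSStepExecStat_Waiting
        End If

        Main = DTSTaskExecResult_Success
    End Function
    [/code]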

  • Fun - yes...

    Thanks for the information.  Now to see how I get on!
