SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

SQLRunner

Add to Technorati Favorites Add to Google
More Posts Next page »
All Posts

Importing binary files with SSIS

Rating: (not yet rated) Rate this |  Discuss | 1,026 Reads | 266 Reads in Last 30 Days |no comments

I have gotten a number of emails over the past few days asking about how I import binary files into SSIS as well as how to improve throughput by making tasks parallel.  I have, scattered throughout this blog, articles which show bits and pieces.  I have articles on how to use the import file task, articles on how to use the Enhanced Threading Framework I put together, articles on using SHA-1 to find duplicates, etc.  This is all well and good for the person who is looking for a specific piece of the puzzle, but what about the person who wants the whole puzzle?  I put together a sample SSIS package that has all of the components (including sql scripts to build the database pieces):

  • Script task to create a list of files to import (.pdf, but can easily be modified)
  • Data Flow Task to import that list into a table
  • Structures to show the Enhanced Threading Framework (2 engines in the example)
  • Within each ETF Engine:
    • Call to the procedure that extracts a single file to be imported
    • Script Component that sources the file to be imported
    • Import File Component
    • Script Component that calculates the SHA-1 hash
    • Lookup to only push files that do not exist (based upon hash)
    • Data destination to push the file and hash into the table

    The source can be downloaded here.


    SSIS : Using the VariableDispenser object within Script Components

    By Josef Richberg in SQLRunner 04-26-2010 12:31 PM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 218 Reads | 140 Reads in Last 30 Days |no comments

     I have become accustom to modifying package variables within my Script Task using the VariablesDispenser  object.  It is convenient and powerful.  I started working with this object due to issues I found while hacking my way through SSIS2005.  I would put the variables in the ReadOnly or ReadWrite section and would get weird errors attributed to exclusive locking..blah,blah,blah.  I naturally carried this practice over to SSIS2008 and things work the same; in theScript Task.

    Getting heavily involved in SSIS development I broke down and decided to do more Script Component work.  Prior to SSIS I had done my absolute best to keep away from VB.Net and along with that, Script Components.  I just didn't like the way VB 'felt'.  To be honest, there is nothing wrong with choosing or avoiding a language based upon personal preference, so long as the language chosen is proper for the job.  Seeing C# as an option in SSIS2008, I decided that would be my language of choice and was once again excited about getting into Script Components to begin developing again.

    My first foray back into programming Script Components, I had a hard time using the VariablesDispenser in theScript Component.  I couldn't find the Variables object.  Thinking there had to be an easier way, I went back to using the ReadOnly/ReadWrite Variables property.  This wasn't much easier.  Turns out, these variables can only be accessed within certain sections of your Script Component.  ReadWrite variables are available through the Variables object in the PostExecute() method only:

    img2

    ReadOnly variables are only available to the Variables object in the PreExecute() method.  I also found some interesting bugs regarding this (see previous post script gotchas).   This worked fine for me, but being more stubborn than smart, I wanted to figure out a way to access these variables the way I was most comfortable, using the VariablesDispenser.  A few sleepless nights later I found what I was looking for; IDTSVariables100 object.

     image1

    You no longer access the VariableDispenser object through Dts and you no longer pass in the a variable of typeVariables.  Instead you access the package level variables directly through VariableDispenser and pass all of the locked package variables into a local variable of type IDTSVariables100.  

    "So what does this buy me?".  First, I am not constrained to have my ReadOnly variables accessible only in thePreExecute() method and my ReadWrite variables accessible only in the PostExecute() method.  This means I don't need local variables to pass information around or keep counts of things to later push out.  This is really only useful inScript Source and Script Destination objects where you don't loop through the methods for each row.  It's also easier to copy/past the code in the above example to move to it multiple scripts vs having to hand code each script'sReadOnly/ReadWrite section.  Last, but not least; personal preference.


    SSIS : Using the VariableDispenser object within Script Components

    By Josef Richberg in SQLRunner 04-26-2010 8:31 AM | Categories: Filed under: , ,
    Rating: (not yet rated) Rate this |  Discuss | 952 Reads | 179 Reads in Last 30 Days |2 comment(s)

     I have become accustom to modifying package variables within my Script Task using the VariablesDispenser  object.  It is convenient and powerful.  I started working with this object due to issues I found while hacking my way through SSIS2005.  I would put the variables in the ReadOnly or ReadWrite section and would get weird errors attributed to exclusive locking..blah,blah,blah.  I naturally carried this practice over to SSIS2008 and things work the same; in theScript Task.

    Getting heavily involved in SSIS development I broke down and decided to do more Script Component work.  Prior to SSIS I had done my absolute best to keep away from VB.Net and along with that, Script Components.  I just didn't like the way VB 'felt'.  To be honest, there is nothing wrong with choosing or avoiding a language based upon personal preference, so long as the language chosen is proper for the job.  Seeing C# as an option in SSIS2008, I decided that would be my language of choice and was once again excited about getting into Script Components to begin developing again.

    My first foray back into programming Script Components, I had a hard time using the VariablesDispenser in theScript Component.  I couldn't find the Variables object.  Thinking there had to be an easier way, I went back to using the ReadOnly/ReadWrite Variables property.  This wasn't much easier.  Turns out, these variables can only be accessed within certain sections of your Script Component.  ReadWrite variables are available through the Variables object in the PostExecute() method only:

    img2

    ReadOnly variables are only available to the Variables object in the PreExecute() method.  I also found some interesting bugs regarding this (see previous post script gotchas).   This worked fine for me, but being more stubborn than smart, I wanted to figure out a way to access these variables the way I was most comfortable, using the VariablesDispenser.  A few sleepless nights later I found what I was looking for; IDTSVariables100 object.

     image1

    You no longer access the VariableDispenser object through Dts and you no longer pass in the a variable of typeVariables.  Instead you access the package level variables directly through VariableDispenser and pass all of the locked package variables into a local variable of type IDTSVariables100.  

    "So what does this buy me?".  First, I am not constrained to have my ReadOnly variables accessible only in thePreExecute() method and my ReadWrite variables accessible only in the PostExecute() method.  This means I don't need local variables to pass information around or keep counts of things to later push out.  This is really only useful inScript Source and Script Destination objects where you don't loop through the methods for each row.  It's also easier to copy/past the code in the above example to move to it multiple scripts vs having to hand code each script'sReadOnly/ReadWrite section.  Last, but not least; personal preference.


    Simple Series : SSIS Conditional Split

    By Josef Richberg in SQLRunner 03-26-2010 12:22 AM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 184 Reads | 108 Reads in Last 30 Days |no comments

    Conditional Split is a very powerful component, enabling you to send records in different directions or exclude them from the downstream altogether.  I use this component quite often in my SSIS packages.  In the industry I work in, publishing, there are many times we'll get a file in from an aggregation company or a large distributor of our books (like Amazon).  Many times the ISBN is poorly formatted and this might be for any number of reasons.  A good example of this is we generally get this aggregated information via excel sheet (yes, because humans love excel) and if you don't format the sheet right, an ISBN10, which starts '00', will have the leading zeros truncated.  We might have an isbn with the dashes in place, much like you see on the Barnes&Noble site.  We have processes to scrub this data, but in the end they might have bad data, so I use a conditional split to extract this data out and push it to an error section, be it an error template we mail back to a person and/or a local table that we store this information in.  I have found at times embedded control totals.  

    We love control totals as it gives us numbers to compare our process to.  If we add up a bunch of numbers and our numbers match the control totals given, we know we are doing things the same way the aggregator is.  This is wonderful, except we do not want to control totals in the item level table, that is what will will sum to match control totals.  The control totals go into a different table.  This is where a gotcha with Conditional Split comes in.

    Have you ever noticed the numbers on the left hand side and the up and down arrows on the right of the component.  Neither did I until I got bit.  Conditional Split component compares each condition in that order, top to bottom, first to last.   This means that if you have a very restrictive conditional after a less restrictive condition, you run the risk of the first condition (less restrictive) meeting the criteria of the data you wanted for the more restrictive.  This means you data will go down the wrong path.  Let me explain again, with pictures:

     

    img 1

    It turns out that in the excel source, all of the columns except C103_NetAmount are null.  That means the InvalidISBN will match the columns that are destined for GrandTotal.  The GrandTotal condition is much more restrictive than the BadISBN condition.  The simple way to fix this is to select GrandTotal and then click the up_arrow.  This will move the GrandTotal condition up to number 1.  The result is below:

    img2

    Remember to put your conditions in top to bottom order; Most restrictive to least restrictive.  I also tend to put the fewest number of checks within the condition.  I have nearly a dozen columns in the table and I could have listed the value of each one as it pertains to the condition, but why?  I select a column or two that meet the condition and use it.  In this case there was an overlap, so I simply rearranged the order.  This is much simpler than have a 10+ condition.  Very hard to read and potentially hard to maintain.


    Simple Series : SSIS Conditional Split

    By Josef Richberg in SQLRunner 03-25-2010 8:22 PM | Categories: Filed under: ,
    Rating: (not yet rated) Rate this |  Discuss | 1,095 Reads | 142 Reads in Last 30 Days |1 comment(s)

    Conditional Split is a very powerful component, enabling you to send records in different directions or exclude them from the downstream altogether.  I use this component quite often in my SSIS packages.  In the industry I work in, publishing, there are many times we'll get a file in from an aggregation company or a large distributor of our books (like Amazon).  Many times the ISBN is poorly formatted and this might be for any number of reasons.  A good example of this is we generally get this aggregated information via excel sheet (yes, because humans love excel) and if you don't format the sheet right, an ISBN10, which starts '00', will have the leading zeros truncated.  We might have an isbn with the dashes in place, much like you see on the Barnes&Noble site.  We have processes to scrub this data, but in the end they might have bad data, so I use a conditional split to extract this data out and push it to an error section, be it an error template we mail back to a person and/or a local table that we store this information in.  I have found at times embedded control totals.  

    We love control totals as it gives us numbers to compare our process to.  If we add up a bunch of numbers and our numbers match the control totals given, we know we are doing things the same way the aggregator is.  This is wonderful, except we do not want to control totals in the item level table, that is what will will sum to match control totals.  The control totals go into a different table.  This is where a gotcha with Conditional Split comes in.

    Have you ever noticed the numbers on the left hand side and the up and down arrows on the right of the component.  Neither did I until I got bit.  Conditional Split component compares each condition in that order, top to bottom, first to last.   This means that if you have a very restrictive conditional after a less restrictive condition, you run the risk of the first condition (less restrictive) meeting the criteria of the data you wanted for the more restrictive.  This means you data will go down the wrong path.  Let me explain again, with pictures:

     

    img 1

    It turns out that in the excel source, all of the columns except C103_NetAmount are null.  That means the InvalidISBN will match the columns that are destined for GrandTotal.  The GrandTotal condition is much more restrictive than the BadISBN condition.  The simple way to fix this is to select GrandTotal and then click the up_arrow.  This will move the GrandTotal condition up to number 1.  The result is below:

    img2

    Remember to put your conditions in top to bottom order; Most restrictive to least restrictive.  I also tend to put the fewest number of checks within the condition.  I have nearly a dozen columns in the table and I could have listed the value of each one as it pertains to the condition, but why?  I select a column or two that meet the condition and use it.  In this case there was an overlap, so I simply rearranged the order.  This is much simpler than have a 10+ condition.  Very hard to read and potentially hard to maintain.


    SSIS 2008 Script Component variable gotcha..

    By Josef Richberg in SQLRunner 02-17-2010 4:26 AM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 190 Reads | 108 Reads in Last 30 Days |no comments

    In a previous post I showed you how to access variables from within an SSIS script component.  More specifically I showed you how to access variables from within an Script Destination Component.  You can not access the read/write variables using the standard Dts.VariableDispenser that you are familiar with in the Script Task (from the Control Flow tab).  There is a variation on that which allows you to access read_only variables (which I will blog about soon).  The original image I showed, looked like this:

     img1

    I have a few variables that I want to modify, so I put them in the ReadWriteVariables section and a few variables that I want to only read from, which I put in the ReadOnlyVariables section.   The code I was using was simple enough:

    img2

    Therein lies the problem.  It turns out that putting any variable in the ReadOlyVariables section that you do not actively use, such as in the example below:

    img3

    will cause a SQLDUMPER error to be generated.  It took me a few hours to figure out what it was and to find it.  The error is located here:

    C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\SQLDUMPER_ERRORLOG.log

    The actual output, which provides no information that I can figure out, looks like this:

     

    img4

    How do you solve the issue?  Put any and all variables you plan on using within your Script Destination Component in the proper section.  Don't look at the variables in terms of their actual 'properties', that is to say even if the variable you are using should be read_only or read/write.  If you plan on using them in the PostExecute() method, you must register those variables in the ReadWriteVariables section.  If you plan on using the variables in the PreExecute() method, then place the variables in theReadOnlyVariablesection and you must use them.  I have not checked to see if the reverse is true, that is putting a variable in the ReadWriteVariables section, but not using it in the PostEecute() method causes the same issue, but I would use the rule I hinted at above.

     


    SSIS 2008 Script Component variable gotcha..

    Rating: (not yet rated) Rate this |  Discuss | 3,445 Reads | 452 Reads in Last 30 Days |3 comment(s)

    In a previous post I showed you how to access variables from within an SSIS script component.  More specifically I showed you how to access variables from within an Script Destination Component.  You can not access the read/write variables using the standard Dts.VariableDispenser that you are familiar with in the Script Task (from the Control Flow tab).  There is a variation on that which allows you to access read_only variables (which I will blog about soon).  The original image I showed, looked like this:

     img1

    I have a few variables that I want to modify, so I put them in the ReadWriteVariables section and a few variables that I want to only read from, which I put in the ReadOnlyVariables section.   The code I was using was simple enough:

    img2

    Therein lies the problem.  It turns out that putting any variable in the ReadOlyVariables section that you do not actively use, such as in the example below:

    img3

    will cause a SQLDUMPER error to be generated.  It took me a few hours to figure out what it was and to find it.  The error is located here:

    C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\SQLDUMPER_ERRORLOG.log

    The actual output, which provides no information that I can figure out, looks like this:

     

    img4

    How do you solve the issue?  Put any and all variables you plan on using within your Script Destination Component in the proper section.  Don't look at the variables in terms of their actual 'properties', that is to say even if the variable you are using should be read_only or read/write.  If you plan on using them in the PostExecute() method, you must register those variables in the ReadWriteVariables section.  If you plan on using the variables in the PreExecute() method, then place the variables in theReadOnlyVariablesection and you must use them.  I have not checked to see if the reverse is true, that is putting a variable in the ReadWriteVariables section, but not using it in the PostEecute() method causes the same issue, but I would use the rule I hinted at above.

     


    I've been published on Simple-Talk.

    By Josef Richberg in SQLRunner 02-09-2010 3:16 AM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 182 Reads | 102 Reads in Last 30 Days |no comments

     I am excited to say I have been published on Simple-Talk.  This was my first foray into being professionally edited.  I will say the experience was not what I imagined.  I guess everyone has different ideas of what it is like to have an editors review your work.  I will say that my writing follows the 'train-of-thought' method (I think it's just how I'm wired), so my grammar did not go unnoticed.  One of the most interesting things I learned out of this process is to take into consideration foreign readers.  Many of the grammar comments explained that to a reader where English is a second language the wording might be confusing.  The technical review also shed some light on what I wrote vs. what I was trying to accomplish.  That second set of eyes helped me realize I took quite of bit of knowledge for granted, so I had to revise many of my descriptions and explanations.  Would I write again?  Most definitely!  The hard part for me is finding the idea for the article.  After that I just let the idea pour onto the paper and start the editorial process all over again.


    I've been published on Simple-Talk.

    By Josef Richberg in SQLRunner 02-08-2010 10:16 PM | Categories: Filed under: , ,
    Rating: (not yet rated) Rate this |  Discuss | 1,353 Reads | 238 Reads in Last 30 Days |2 comment(s)

     I am excited to say I have been published on Simple-Talk.  This was my first foray into being professionally edited.  I will say the experience was not what I imagined.  I guess everyone has different ideas of what it is like to have an editors review your work.  I will say that my writing follows the 'train-of-thought' method (I think it's just how I'm wired), so my grammar did not go unnoticed.  One of the most interesting things I learned out of this process is to take into consideration foreign readers.  Many of the grammar comments explained that to a reader where English is a second language the wording might be confusing.  The technical review also shed some light on what I wrote vs. what I was trying to accomplish.  That second set of eyes helped me realize I took quite of bit of knowledge for granted, so I had to revise many of my descriptions and explanations.  Would I write again?  Most definitely!  The hard part for me is finding the idea for the article.  After that I just let the idea pour onto the paper and start the editorial process all over again.


    Service Broker, Triggers, CLR. and SSIS Threading Framework

    By Josef Richberg in SQLRunner 02-07-2010 3:53 PM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 185 Reads | 106 Reads in Last 30 Days |no comments

    I have been trying for a few days to implement a Service Broker implementation to, service the threading framework I built for SSIS.  When you build the threading framework you have a set number of threads running and I have been looking for a way to modify the number of running threads to accommodate server load.  There are many ways to do this, such as have the threads check a table and turn on/off, but I don't care for that method.  The issue I have is you need to have the thread, once it is 'asleep' to wake up every so often and poll the table.  That's a waste of resources in my opinion.  I would prefer to have a callback mechanism where the thread is put to sleep and woken up when it is time to come back online.  To do this I choose Service Broker.  Threads will 'wait' on a queue and wake up when they should or never if they are to be kept offline.  No table polling, no contention, no wasted resources.

    I was all Gung-Ho.  I had tested my Service Broker implementation with two procedures to simulate a request to throttle the engines.  I had a slight set-back, but @mrDenny was kind enough to jump in a give me the necessary information that every resource I could find failed to explain.  I was ready to implement my Service Broker trigger based implementation.

    Whoa did the frustration ensue.  The concept was simple.  I had a table which contained a listing of what you could request of the framework, which at the moment is a certain number of engines on/off, based upon the number of engines in the framework (can't ask for more than are currently built).  Realizing that the engines could only check the queue once they finished their work, I didn't want to have a person request taking X engines offline and wait N minutes for the request to complete.  I wanted the table to update and return immediately then have the Service Broker asynchronously handle waiting. What followed was a miserable fail.

    I had the trigger built exactly like my test procedures, yet every time I did an update to the table, the trigger would hang on sending the message, not on the wait.  I spent a few hours racking my brains trying to understand this.   I was checking queues, I was checking error logs (thanks to@AaronBertrand), but couldn't figure out why.  I hit the web and found there were a few people in the same boat as I was and they had no answers to their questions.  Dig a little deeper and I found the only way to make a trigger asynchronous is do make a CLR trigger.  My next task.

    Spent a few more hours pouring through the online msdn library looking for clr information (here is a good starting point http://msdn.microsoft.com/en-us/library/ms131093.aspx) and TADAH!. I created a clr trigger and it works like a charm.  I do an update to the table and I get an instant return, allowing the user to continue on.

    I now have to integrate this into threading framework, but that won't be too hard.  The framework uses a stored procedure to dole out the work, so I just have to put the SB code in there, vs having to modify my SSIS package.  If you are unfamiliar with the threading framework I designed for SSIS, you can download the deck here (http://josef-richberg.squarespace.com/downloads/) and a recorded presentation here http://appdev.sqlpass.org/MeetingArchive/tabid/2005/Default.aspx.

    I am not sure yet if this will be a blog post (it will be very large) or an article (more to come on that).


    Service Broker, Triggers, CLR. and SSIS Threading Framework

    By Josef Richberg in SQLRunner 02-07-2010 10:53 AM | Categories: Filed under: , , ,
    Rating: (not yet rated) Rate this |  Discuss | 2,269 Reads | 329 Reads in Last 30 Days |6 comment(s)

    I have been trying for a few days to implement a Service Broker implementation to, service the threading framework I built for SSIS.  When you build the threading framework you have a set number of threads running and I have been looking for a way to modify the number of running threads to accommodate server load.  There are many ways to do this, such as have the threads check a table and turn on/off, but I don't care for that method.  The issue I have is you need to have the thread, once it is 'asleep' to wake up every so often and poll the table.  That's a waste of resources in my opinion.  I would prefer to have a callback mechanism where the thread is put to sleep and woken up when it is time to come back online.  To do this I choose Service Broker.  Threads will 'wait' on a queue and wake up when they should or never if they are to be kept offline.  No table polling, no contention, no wasted resources.

    I was all Gung-Ho.  I had tested my Service Broker implementation with two procedures to simulate a request to throttle the engines.  I had a slight set-back, but @mrDenny was kind enough to jump in a give me the necessary information that every resource I could find failed to explain.  I was ready to implement my Service Broker trigger based implementation.

    Whoa did the frustration ensue.  The concept was simple.  I had a table which contained a listing of what you could request of the framework, which at the moment is a certain number of engines on/off, based upon the number of engines in the framework (can't ask for more than are currently built).  Realizing that the engines could only check the queue once they finished their work, I didn't want to have a person request taking X engines offline and wait N minutes for the request to complete.  I wanted the table to update and return immediately then have the Service Broker asynchronously handle waiting. What followed was a miserable fail.

    I had the trigger built exactly like my test procedures, yet every time I did an update to the table, the trigger would hang on sending the message, not on the wait.  I spent a few hours racking my brains trying to understand this.   I was checking queues, I was checking error logs (thanks to@AaronBertrand), but couldn't figure out why.  I hit the web and found there were a few people in the same boat as I was and they had no answers to their questions.  Dig a little deeper and I found the only way to make a trigger asynchronous is do make a CLR trigger.  My next task.

    Spent a few more hours pouring through the online msdn library looking for clr information (here is a good starting point http://msdn.microsoft.com/en-us/library/ms131093.aspx) and TADAH!. I created a clr trigger and it works like a charm.  I do an update to the table and I get an instant return, allowing the user to continue on.

    I now have to integrate this into threading framework, but that won't be too hard.  The framework uses a stored procedure to dole out the work, so I just have to put the SB code in there, vs having to modify my SSIS package.  If you are unfamiliar with the threading framework I designed for SSIS, you can download the deck here (http://josef-richberg.squarespace.com/downloads/) and a recorded presentation here http://appdev.sqlpass.org/MeetingArchive/tabid/2005/Default.aspx.

    I am not sure yet if this will be a blog post (it will be very large) or an article (more to come on that).


    Sending SSIS Errors to the Windows Event Log

    By Josef Richberg in SQLRunner 02-01-2010 8:15 AM | Categories: Filed under: , ,
    Rating: (not yet rated) Rate this |  Discuss | 3,169 Reads | 316 Reads in Last 30 Days |3 comment(s)

     In a previous post I explained how to trap for errors in your Script Task and Components and send the information to the SSIS Log (link).  A good friend of mine and system engineer extraordinaire Eugene L. (last name withheld to protect the guilty :) ) suggested that I also push these errors to the windows event log.  We have HP OpenView which is monitored by our operations group, which will enable us to setup alerts through them as well.  This sounded simple enough and after about 5 min of research it turns out it is simple.  All you need to do is add an additional class and one additional object.

    First you need to add, using System.Diagnostics.  This provides the hook to the classes that will enable you to push your error to the Windows Event Log.  Now create the object to interact with it:

    EventLog evLog = new EventLog("Application", System.Environment.MachineName, "ImprovedTidalExtract");

    I set up the header for the log.  This is an "application", being run on "System.Environment.MachineName", and the application name is "ImprovedTidalExtract".

    Now I have to provide the detail information, that is the actual error.  Within the Catch block, I add an additional line.

    img1

     

    How will this look within the event log?  Like this:

    img2

     

     The detailed entry I created with the above try/catch looks like this:

    img3


    Try Catch within an SSIS Script Tasks and Components

    By Josef Richberg in SQLRunner 01-30-2010 11:05 PM | Categories: Filed under: , ,
    Rating: (not yet rated) Rate this |  Discuss | 1,401 Reads | 212 Reads in Last 30 Days |no comments

     Using a Try/Catch/Finally within your Script Components and Script Tasks is just good coding practice.  The thing you want to be able to do is push any errors you trap to a log file.  The syntax is somewhat simple, but it took me some time to research it and it differs depending upon whether it's aComponent or Task.  I'll go over each.

    Within the Component there is no access to the Dts object, so you need to work through theComponentMetaData object as shown below.  Please note the pbCancel is defined as a bool at the top of my Component.

    catch (SqlException sqle)
    {
       ComponentMetaData.FireError(1, "parse and insert into T011_Tree", sqle.Message, "", 0, out pbCancel);
    }

    The tricky part is using the ComponentMetaData to access the log.  The Component will fail if any of theTry block fails, regardless of the pbCancel value.  This is primarily to interact with the log, like a good programmer :)
     
    Within the Task you access the log from the same object you access just about anything, Dts.  To access the log you simply use Dts.Log and you can fail the Task by simply setting the Dts.TaskResult=Dts.Results.Failure.

    Working with ADO.Net variables in SSIS scripts

    By Josef Richberg in SQLRunner 01-25-2010 1:17 PM | Categories:
    Rating: (not yet rated) Rate this |  Discuss | 186 Reads | 102 Reads in Last 30 Days |no comments

    I am always looking to improve the performance of my SSIS packages.  I try to do as much work within the package as possible, only going to external stores (databases, flat/raw files) as little as possible.  There are a number of reasons for this.  Raw files are incredibly fast and very good when you need to have multiple stages within your package.  I use Raw Files to store about 20 GB of data which is broken out among 8 threads.  This was necessary for a number of reasons one if them being the ability to reload the tables from this raw data if we ran out of time during the initial load window. 

    In SSIS you can store the result set of a stored procedure in a System.object, which is translated into an ADO object.  I noticed the only SSIS component that can use this source is the Foreach Loop.  I can loop through the variable and process the results.  At the bottom of this post I linked to the request to add the ability for the Data Flow ADO.NET source/destination to use a System.Object.

    I have a few situations where I have a flat file and would like to read that into memory for processing.  Currently reading in a flat file the traditional way, with a Data Flow task, only allows me to push that into a data source.  I dug a little deeper into the SSIS script component and found a way to read the flat file from within a Script Component into an ADO component, which can then be read by a Foreach Loop.

    If I get enough people to vote for the enhancement this technique can be used to eventually build in-memory data sets, start to finish.

    Let's dissect the C# component I wrote. 

    img1 

    Make sure you include using System.Data in your script.  This is where all of the ADO.NET objects live.  Now, you need to create a table structure to store the data.  According to the API you can name your table, but I have yet to find a need for it:

    DataTable dt = new DataTable("Parent_Child");

    Now you need to create columns for your table.  In my case I need a two columns, one for the parentID and one for the ChildID.  Since I am reading in text fields and figured I can always case on insert into SQL Server.

     

    img2

    Now you have your table all ready to receive data, you need to lock your SSIS variable so you can access this data outside of the script.  I happen to have named my variable test_obj, which is of course of type System.Object.

                Dts.VariableDispenser.LockForWrite("User::test_obj");

     

     

     

    img3

    The script image above shows the technique to load up the local object, DataTable dt.  Once that is completed you pass the local object back out to the SSIS variable and unlock the variables like so:

                    variablesList["User::test_obj"].Value = dt;
                    variablesList.Unlock();

    Remember to unlock your list AFTER you set the value of your variables.  At this point you have loaded all of the data pulled from the flat file in a simple two column table that resides in an SSIS object variable that you can now loop through, like so:

     

    img4

     
     

    Please click on the link below and vote to  enhance SSIS to enable ADO source/destination to accept variables.

    SSIS Enhancement

     


    Working with ADO.Net variables in SSIS scripts

    By Josef Richberg in SQLRunner 01-25-2010 8:17 AM | Categories: Filed under: , ,
    Rating: |  Discuss | 3,546 Reads | 300 Reads in Last 30 Days |2 comment(s)

    I am always looking to improve the performance of my SSIS packages.  I try to do as much work within the package as possible, only going to external stores (databases, flat/raw files) as little as possible.  There are a number of reasons for this.  Raw files are incredibly fast and very good when you need to have multiple stages within your package.  I use Raw Files to store about 20 GB of data which is broken out among 8 threads.  This was necessary for a number of reasons one if them being the ability to reload the tables from this raw data if we ran out of time during the initial load window. 

    In SSIS you can store the result set of a stored procedure in a System.object, which is translated into an ADO object.  I noticed the only SSIS component that can use this source is the Foreach Loop.  I can loop through the variable and process the results.  At the bottom of this post I linked to the request to add the ability for the Data Flow ADO.NET source/destination to use a System.Object.

    I have a few situations where I have a flat file and would like to read that into memory for processing.  Currently reading in a flat file the traditional way, with a Data Flow task, only allows me to push that into a data source.  I dug a little deeper into the SSIS script component and found a way to read the flat file from within a Script Component into an ADO component, which can then be read by a Foreach Loop.

    If I get enough people to vote for the enhancement this technique can be used to eventually build in-memory data sets, start to finish.

    Let's dissect the C# component I wrote. 

    img1 

    Make sure you include using System.Data in your script.  This is where all of the ADO.NET objects live.  Now, you need to create a table structure to store the data.  According to the API you can name your table, but I have yet to find a need for it:

    DataTable dt = new DataTable("Parent_Child");

    Now you need to create columns for your table.  In my case I need a two columns, one for the parentID and one for the ChildID.  Since I am reading in text fields and figured I can always case on insert into SQL Server.

     

    img2

    Now you have your table all ready to receive data, you need to lock your SSIS variable so you can access this data outside of the script.  I happen to have named my variable test_obj, which is of course of type System.Object.

                Dts.VariableDispenser.LockForWrite("User::test_obj");

     

     

     

    img3

    The script image above shows the technique to load up the local object, DataTable dt.  Once that is completed you pass the local object back out to the SSIS variable and unlock the variables like so:

                    variablesList["User::test_obj"].Value = dt;
                    variablesList.Unlock();

    Remember to unlock your list AFTER you set the value of your variables.  At this point you have loaded all of the data pulled from the flat file in a simple two column table that resides in an SSIS object variable that you can now loop through, like so:

     

    img4

     
     

    Please click on the link below and vote to  enhance SSIS to enable ADO source/destination to accept variables.

    SSIS Enhancement

     

    More Posts Next page »