Collect Data from Multiple Instances Using SSIS and Powershell

  • Comments posted to this topic are about the item Collect Data from Multiple Instances Using SSIS and Powershell

  • You could also use a Service Broker queue to achieve this.

    The queue would let you manage how much you taxed the server by restricting the number of concurrent SSIS packages that were allowed to run.

  • Nice article I had not thought of using power shell inside of Ssis before. This will be very useful

  • Very interesting article, just last two weeks i was working using SSIS without PowerShell. Since i am new to SSIS, took little time to understand. The problem i am facing is handling the error output redirecting to separate table. Because since we have around 200 instances with different domain, some of them failing to connect for various reasons. I like to capture those Instances name.

    Is there any one can give me some idea how to handle the error handling.

    Thank you

  • Stuart Pearson (2/23/2015)


    You could also use a Service Broker queue to achieve this.

    The queue would let you manage how much you taxed the server by restricting the number of concurrent SSIS packages that were allowed to run.

    I'm not very familiar with the service broker, but I will certainly look into that. I had a lot of trouble messing around with that a few years ago trying to get a simple stored procedure to fire and wound up giving up on it (I know, should have tried more). However, thanks for the suggestion.

    The alternative to service broker that fits into the powershell script (which I added to my personal script since I drafted the article.), is to check how many jobs are running before starting additional.

    #atop the script, add this line to set a max

    $maxnumberofconcurrentjobs = 50

    #at the end of the "try" code block, you can add this before starting additional jobs

    while (@(Get-Job -State "Running").Count -ge $maxnumberofconcurrentjobs) {Start-Sleep 10}

  • s.selvarani (2/23/2015)


    Very interesting article, just last two weeks i was working using SSIS without PowerShell. Since i am new to SSIS, took little time to understand. The problem i am facing is handling the error output redirecting to separate table. Because since we have around 200 instances with different domain, some of them failing to connect for various reasons. I like to capture those Instances name.

    Is there any one can give me some idea how to handle the error handling.

    Thank you

    Within the SSIS package, you can use the "Event Handler" tab at the package level and the event handler tab can have different tasks added on. For example, I use a script task on some of mine to collect the error info from the package's system variables, push those into use variables, and then send an email using the mail task. Within the script task, I built an email body that includes the value from the "instance_name" user variable (per the sample ssis package provided).

    Sample email output here:

    There was an error during the execution of the package myPackage on myInstanceName.

    Details of the error are as follows:

    Error Number: -1073573396

    Error Description: Failed to acquire connection "instance_connectino". Connection may not be configured correctly or you may not have the right permissions on this connection.

    Error Source: myExecuteSQLTaskName

    Instance loaded at time of failure: instance1

    Below is the code to generate that email. the serverName and email_message are user variables defined on the package, and I had some other email parameters defined as well.

    The script task content is here:

    #Region " Main() "

    Public Sub Main()

    'Assume success

    Dts.TaskResult = ScriptResults.Success

    Try

    Dim errNumber As Integer

    Dim errDescription As String

    Dim errSource As String

    Dim packageName As String

    Dim machineName As String

    'Obtain error message values

    GetErrorValues(errNumber, errDescription, errSource)

    'Obtain Dynamic Package/Server Information

    GetDyanmicPackageInfo(packageName, machineName)

    'Set E-mail parameteres to dynamically include items

    SetEmailParameters()

    Dim email_message As String

    email_message = "There was an error during the execution of the package " & packageName & " on " & machineName & "." & vbCrLf & vbCrLf & _

    "Details of the error are as follows: " & vbCrLf & _

    vbCrLf & _

    " Error Number: " & errNumber.ToString() & vbCrLf & _

    vbCrLf & _

    " Error Description: " & errDescription & vbCrLf & _

    vbCrLf & _

    " Error Source: " & errSource & vbCrLf & _

    vbCrLf & _

    "Instance loaded at time of failure: " & Dts.Variables("User::serverName").Value.ToString() & vbCrLf & _

    vbCrLf & _

    "DB name loaded at time of failure (may or may not be from previous instance in the loop): " & Dts.Variables("User::db_name").Value.ToString()

    Dts.Variables("User::email_message").Value = email_message

    Catch ex As Exception

    ' Displaying the error since this is an example.

    End Try

    End Sub

    #End Region

    #Region " GetErrorValues "

    Private Sub GetErrorValues(ByRef errNumber As Integer, _

    ByRef errDescription As String, ByRef errSource As String)

    Try

    Dim vars As Variables

    Dts.VariableDispenser.LockForRead("ErrorCode")

    Dts.VariableDispenser.LockForRead("ErrorDescription")

    Dts.VariableDispenser.LockForRead("SourceName")

    Dts.VariableDispenser.GetVariables(vars)

    Try

    errNumber = CType(vars("ErrorCode").Value, Integer)

    errDescription = vars("ErrorDescription").Value.ToString()

    errSource = vars("SourceName").Value.ToString()

    Catch ex As Exception

    Throw ex

    Finally

    vars.Unlock()

    End Try

    Catch ex As SystemException

    Throw ex

    End Try

    End Sub

    #End Region

    #Region " GetDynamicPackageInfo "

    Private Sub GetDyanmicPackageInfo(ByRef p_packageName As String, ByRef p_machineName As String)

    Try

    Dim vars As Variables

    Dts.VariableDispenser.LockForRead("System::PackageName")

    Dts.VariableDispenser.LockForRead("System::MachineName")

    Dts.VariableDispenser.GetVariables(vars)

    Try

    p_packageName = CType(vars("System::PackageName").Value, String)

    p_machineName = CType(vars("System::MachineName").Value, String)

    Catch ex As Exception

    Throw ex

    Finally

    vars.Unlock()

    End Try

    Catch ex As Exception

    Throw ex

    End Try

    End Sub

    #End Region

    #Region " SetEmailParameters "

    Private Sub SetEmailParameters()

    Try

    Dim vars As Variables

    Dts.VariableDispenser.LockForRead("System::PackageName")

    Dts.VariableDispenser.LockForRead("System::MachineName")

    Dts.VariableDispenser.LockForWrite("User::email_subject")

    Dts.VariableDispenser.LockForWrite("User::email_from")

    Dts.VariableDispenser.GetVariables(vars)

    Try

    vars("User::email_subject").Value = "Package Failure Notification - " & CType(vars("System::MachineName").Value, String)

    vars("User::email_from").Value = LTrim(RTrim(CType(vars("System::MachineName").Value, String))) & "@cable.comcast.com"

    Catch ex As Exception

    Throw ex

    Finally

    vars.Unlock()

    End Try

    Catch ex As Exception

    Throw ex

    End Try

    End Sub

    #End Region

    From the powershell side of things, you could likely use invoke-sqlcmd to return a simple value (ie "select 1 [value]" and return that value to a powershell variable. if the powershell variable value is not null, process normally, otherwise switch to another invoke-sqlcmd call that would write to your error table.

    Hope that info helps.

    Steve

  • Very nice Article. I used a Registered server mechanism to scroll through all the servers to get information. This allowed the client to get information dynamically whenever they added a server or removed one.

  • jeremy_f_betz (2/23/2015)


    Very nice Article. I used a Registered server mechanism to scroll through all the servers to get information. This allowed the client to get information dynamically whenever they added a server or removed one.

    I have a CMS set up as well, just didn't include that query to pull data for the list of instances (simply for purposes of keeping the focus on the mechanism to collect data).

    Thanks for the feedback!

  • How would this compare to executing the packages concurrently using SSIS 2012 and the create_execution, set_execution_parameter_value, and start_execution features of project deployed packages?

  • kris386 (2/23/2015)


    How would this compare to executing the packages concurrently using SSIS 2012 and the create_execution, set_execution_parameter_value, and start_execution features of project deployed packages?

    On the brief reading I just did for those procs, it looks like it might not be all that different in the end. In my solution, I'm still using the old package deployment model. For the example, I simplified it even more with the SSIS package on the file system. Those procedures and using an SSIS project certainly appear to the same thing, logic-wise, to get the packages running. As I mentioned in a previous comment, it might be wise to cap the number of concurrent executions (by checking powershell job count of jobs that are running). I would guess there are views to see how many executions are occurring for a given package in the SSISDB. That may be more appealing since you could then contain everything within T-SQL and not have to get overly familiar with powershell. Thanks for sharing that!

  • Wouldn't that depend on how much data needs to be processed, how many CPU's are available on the server doing the processing, and how many servers which need to be connected to?

  • jeremy_f_betz (2/23/2015)


    Wouldn't that depend on how much data needs to be processed, how many CPU's are available on the server doing the processing, and how many servers which need to be connected to?

    In my first or second reply, I put some code for the powershell script to set a max number of powershell jobs that run simultaneously.

    All of the facets you mentioned factor in to the number of maximum powershell jobs you'd like to process. For a lot of my processes, I have queries that are not very intensive and relatively simple data flow tasks to pump in the data so I can run 50 at a time, where we have about 1000 instances. I review the performance stats of the management server that I use to do the parallel processing and adjust that max accordingly.

  • Nice article , Very well explained

  • This is a great article! Thank you very much! I just recently started leveraging powershell in my work and find it very powerful and useful. There is a steep learning curve for sure and articles like yours help to climb it. Could you explain a couple of points from your script:

    1. Why do you need all this complexity when building $scriptcommand variable? It seems that you wrap each section of the path with CHAR(34). Why is that? Also on the same line, do you need to reference $instance.instance_name? Can't you just say $instance?

    2. Could you explain the Start-Job line? I don't understand it at all. Where do $p_command values come from?

    Thank you!

  • Misha_SQL (2/24/2015)


    This is a great article! Thank you very much! I just recently started leveraging powershell in my work and find it very powerful and useful. There is a steep learning curve for sure and articles like yours help to climb it.

    Could you explain a couple of points from your script:

    Thanks, and happy to explain below.

    1. Why do you need all this complexity when building $scriptcommand variable? It seems that you wrap each section of the path with CHAR(34). Why is that? Also on the same line, do you need to reference $instance.instance_name? Can't you just say $instance?

    The char(34) equates to double quotes, and if you print the $scriptcommand to the screen, you can see where the quotes are added. If you try to run that command from the powershell prompt, it sometimes needs the quotes and sometimes doesn't, but my experience has been that adding the quotes makes it more consistent. Think of it like adding the [ and ] around the name of an object in a t-sql script, to some degree. Spaces in the path need double quotes. It really becomes trial and error to get the command working in the powershell prompt, and then coding it in the script to generate the command in the same format. $instance will not work because the $instances object stores an array of $instance objects. Since that is containing a result set from the Invoke-Sqlcmd cmdlet, the column name from the result set is accessed by going to the instance_name property of the $instance object. Each $instance returns a "row" of data, so effectively each row is its own object and the instance_name "column" is the property to be accessed.

    2. Could you explain the Start-Job line? I don't understand it at all. Where do $p_command values come from?

    Start-Job issues a powershell job, and the ScriptBlock option of that cmdlet is the actual command to run within the context of a powershell window. The -ArgumentList option is then helping to assign the values to the parameters defined in the script block.

    I'm telling the job that I am sending this script to it, and that the script will be passed a parameter. I named the parameter $p_command and am assigning the $scriptcommand value to that parameter. The Invoke-Expression cmdlet is used to fire off an executable within powershell. Since the $scriptcommand has a dtexec.exe command being created, that is why we want to Invoke-Expression. I am building the command outside of the job, so Start-Job -ScriptBlock($scriptcommand) won't work because the $scriptcommand variable would not contain any values in this context. Thin of the Start-Job as opening up a brand new powershell window, and then you're simply typing in $scriptcommand. If you do that, you are really just instantiating a new variable with no value. By having the param set up in tandem with the ArgumentList option of Start-Job, I'm telling the ScriptBlock that the first parameter, $p_command, will be assigned to the first value assigned in the -ArgumentList option.

    In comparison to terminology of a stored procedure, each PS job functions like this:

    Think of the param($p_command) as a parameter definition atop a stored procedure definition.

    Think of the -ArgumentList option of Start-Job as assigning $scriptcommand to the $p_command parameter of the stored proc.

    Start-Job is then calling the procedure with the $scriptcommand value, and the body of the procedure is the Invoke-Expression command actually executing the command.

    If you had more than one parameter to pass in to the job, you could add them like this:

    Start-Job -ScriptBlock {param($p_param1,$p_param2) write-host $p_param2} -ArgumentList $param1, $param2

    Hope that helps explain it.

    Steve

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply