How can I run the same package at the same time with different configs?

  • I have a package that I want to run 3-4 copies of at the same time, with a variable that basically changes the name of a stored procedure when it runs. I'm sure there are a half-dozen ways to do this, but I am wondering if there is a best practice. The runs take about 2 hours apiece, so instead of the process taking 6 hours, I want to run them in parallel.

    We run all our packages via a CmdExec job step: call "\\SysDev\PROD\SQLDTS\UTIL\Extracts\exe_Forms.bat"

    In the bat file I pass in the name of the package and its config.

    I assume that if I create one SQL job with 3 steps, it would run them serially, which doesn't fix my problem.

    The only option I can think of is to create three jobs.

    I thought maybe I could use a master package and pass in the three variables, but if I have a cfg_PackageType variable in the child, I can't reuse it from the master for all three copies if I run them in parallel.

    The only way I can think of doing it would be ugly: copying configs around with a script task and keeping multiple copies of the same package.

    Another gotcha is that at the end of the whole process I need to create a control file. The vendor says nothing has to be in it; it could be anything. This file is what they will use to determine that we are saying the process is complete from our end and that they can begin their process. This is one reason I was looking at wrapping the process in a master package: after the packages ran in parallel, I could use a precedence constraint to point them all at one script task that creates the text file.

    Ideas?

    Thanks,

    Phil

  • Yes, if you have 3 steps in a job then they will execute serially.

    It is possible to overwrite variable values contained in a config file at runtime by using the /SET option so you could do this in your batch file command line. I assume you are using a batch file as a wrapper because there are other things in the batch file than just the dtexec command?

    If the dtexec command starts to get a little unwieldy, then another option is to use a Command File that contains (on separate lines) the various switches and options with which you want to execute the package. You can have as many Command Files as you like, each with different contents, so you could call each of your 3 packages and specify a different Command File for each package.

    There's an article on how to do just that here.
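As a concrete sketch of the /SET approach, a wrapper batch file could launch three copies of the same package in parallel. The package path, variable name, and extract-type values below are placeholders for illustration, not taken from the actual environment:

```bat
rem Sketch only - path, package name, variable name and values are placeholders.
rem "start" returns immediately, so the three dtexec runs overlap; the quoted
rem string after "start" is just the window title.
start "PB_DAILY"    dtexec /FILE "\\SysDev\PROD\SQLDTS\Extract.dtsx" /SET \Package.Variables[User::cfg_ExtractType].Value;PB_DAILY
start "HB_DAILY"    dtexec /FILE "\\SysDev\PROD\SQLDTS\Extract.dtsx" /SET \Package.Variables[User::cfg_ExtractType].Value;HB_DAILY
start "HB_ACCOUNTS" dtexec /FILE "\\SysDev\PROD\SQLDTS\Extract.dtsx" /SET \Package.Variables[User::cfg_ExtractType].Value;HB_ACCOUNTS

rem Or move the switches into one Command File per extract type:
rem dtexec /CommandFile "PB_DAILY.opt"
```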

    Would it be possible to have 3 Execute SQL tasks (one per stored procedure) that are executed in parallel? You could either hardcode the stored proc names or have 3 variables, e.g. @StoredProc1, @StoredProc2, @StoredProc3, referenced by each Execute SQL task.

    If so, it makes your control file creation at the end of the process a lot easier; you could either do it within a job step immediately after the package execution job step or as a File System task within the package itself. Of course, there are other ways of signifying a completion status, such as a Send Mail task within the package or populating a row/column in a table.
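For the job-step variant, creating the empty control file needs no SSIS at all; a CmdExec step could do it in one line (the UNC path and filename here are placeholders):

```bat
rem CmdExec job step: drop a zero-byte control file into the outbound folder.
rem The vendor only checks for the file's existence, so its contents don't matter.
type nul > "\\SysDev\PROD\Outbound\signal.done"
```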

    If doing everything in one package execution is not possible, then I agree that you will need 3 SQL Agent jobs, which makes checking for a completion state that much harder. You could have a job step that executes some T-SQL to check a Status column (or columns) in a table on a polling basis by using WAITFOR DELAY: if the status is not complete, wait for a certain amount of time before checking again.
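A polling job step along those lines might look like this. The table and column names are assumptions for illustration; each of the three jobs would set its own row to 'Complete' as its final step:

```sql
-- Sketch: assumes a dbo.ExtractStatus table with one row per extract,
-- updated to Status = 'Complete' by each job's last step.
-- A production version should also add a timeout/max-iteration guard.
DECLARE @Complete INT = 0;

WHILE @Complete < 3
BEGIN
    SELECT @Complete = COUNT(*)
    FROM dbo.ExtractStatus
    WHERE Status = 'Complete';

    IF @Complete < 3
        WAITFOR DELAY '00:05:00';   -- check again in five minutes
END;
-- All three extracts are done; the next job step can create the control file.
```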

    I'll be interested to know what approach you decide to take!

    One final note of caution: if your stored procs are querying the same set of tables (at least partially), then you have the potential for blocking and even deadlocks (assuming you are not using NOLOCK hints or running in READ UNCOMMITTED transaction isolation level).

    Regards

    Lempster

  • I think I will go down the road of one SQL job with 4 steps. I have 1 step that needs to run on a certain day of the month, so I'll put a variable in the XML config to hold that day and then check it in the first step of the process. I'll also put another variable in the config, cfg_RunPostOps, and set it to true for the 4th package. In the previous packages I'll keep appending the row counts to a text file, and then in my PostOps sequence container I'll move the file to the outgoing folder when I see that the cfg_RunPostOps variable is set to true.

    I don't know if SSIS 2012 has it, but it would be nice if, when you add a package to a master package, it exposed the child's variables to be set. Then I could run all three at once and point them all to step 4.

    Thanks for the help,

    Phil

  • My initial impression was to use the master package too.

    Can you clarify this statement? I'm not exactly clear on the issue you encountered.

    I thought maybe I could use a master package and pass in the three variables, but if I have a cfg_PackageType variable in the child, I can't reuse it from the master for all three copies if I run them in parallel.

  • Can a master package variable value be overwritten at runtime using the /SET option? I've never tried, but I see no reason why not in principle. If so then that would be the way to go - simply execute the master package with three dtexec commands using /SET to set whatever variable defines the stored proc that you want to run.

  • The problem with running dtexec three times (assuming you meant having either three steps in a single job or three individual jobs) is that the process needs to create a control file to put in the outbound directory after all the steps/data are there. That is the trigger for the vendor's system to know that all files are ready. So I thought I could put the first 3 packages (which are the same child package) in a sequence container, and then the fourth package would handle moving the control file to the outbound directory. But I can't have the master pass in the variable that the child packages use to run the stored procedure, because if all three children have a variable cfg_ExtractType whose source is the parent, then I am stuck. Unless I am misunderstanding where you are running dtexec from. Is it possible to have the SQL job run dtexec to call the master, and then have the master call dtexec on each package, passing in cfg_ExtractType via the /SET switch?

  • What version of SSIS is this? If it's 2012, you can pass a parameter from the master to the child packages, so you could definitely do this.

    If it's 2008, I don't have experience trying, but this might be what you are looking for.

  • Phillip.Putzback (1/14/2014)


    Unless I am misunderstanding where you are running dtexec from. Is it possible to have the SQL job run dtexec to call the master, and then have the master call dtexec on each package, passing in cfg_ExtractType via the /SET switch?

    Yes, I am talking about 3 SQL jobs (not 3 job steps, because you want the execution to run in parallel) that call the master package using the /SET option to specify the value for the cfg_ExtractType variable, obviously with a different value per job.

  • The problem with running them in parallel is the code I would have to create to produce a signal file once they are all complete. I am sure with some more experience and time I could find a better way.

    I've decided to run them serially, in one job per frequency:

    Frequency Daily - pkg1_Step1, pkg2_Step2, pkg3_Final

    Frequency Monthly - pkg1_Step1, pkg2_Step2, pkg3_Step3, pkg4_Final

    The Final packages have the config variable cfg_RunPostOps = true and the PostOps step will copy the signalyyymmdd.done file to the outbound folder.

    I'm still open to suggestions. Is there a way to have a master call a child and pass in the configuration file for the child to use? I don't see a configuration option for the package that can be set with an expression, only the Connection property.

  • OK, but your initial requirement was to run 3-4 copies of the package at the same time; that's what I've been trying to give you a solution for. If you've now decided that parallel execution is not a requirement, then of course that changes things! 😉

    Sometimes it pays off to challenge requirements or the 'traditional' way of doing things. I must admit that once I've found a way of accomplishing a task, I tend to gravitate towards that method every subsequent time... but sometimes taking a fresh look at the problem may result in a different (and better) solution. I need to do that more often!

    Regards

    Lempster

  • Being that this is a new problem for me, I think I am having trouble grasping your solution; I can't visualize the workflow. I'd prefer to run the packages in parallel, but I can't seem to come up with a non-obscure way of doing it that wouldn't be a maintenance nightmare in the future.

    I appreciate the help. The workflow seems to change daily, as the vendor can't seem to decide on their own requirements.

    I don't know if the image helps or complicates what I need to wrap up into 4 possible runs based on one variable [ExtractType] being passed in.

  • Did you check the link I provided? I thought that would do it for you.

  • Nevyn (1/15/2014)


    Did you check the link I provided? I thought that would do it for you.

    Isn't the code in that link just another way of setting a child package variable to get its value from the parent package? The same thing you can do by using a configuration and setting the parent package variable to load the child.

    I've been testing how to run the same package with different variables; my screenshot should help explain. In each script (not the controller) I change the cfg_ExtractType_Parent value to a different value: either "PB_DAILY", "HB_DAILY", or "HB_ACCOUNTS".

    Then in the child task I bind the incoming variable to an email subject and mail it to me. I end up with three emails, all with the subject "HB_ACCOUNTS".

  • You're right. I thought of how to do it in 2012 (parameters), then tried to figure out how to do parameters in 2008, but in your problem it would be the same parameter.

    So, remaining options I can think of (none of them great):

    1) Multiple copies of the package with the proc name hardcoded. Least elegant, annoying to troubleshoot, but easy.

    2) Have a simple dummy table with an identity column. In the master, truncate and reseed it to start with. Then in the child, insert into it with an output in an Execute SQL task, and use the output in an expression to switch between procedure names (IF statement or data flow to a control table).

    3) If you create separate SQL Agent jobs for running it for each procedure, and then have a "master" job that simply calls those jobs in each step, I believe that by default it will spawn them all without waiting. NOTE: I have not tried this, but I have read about it. It would let you avoid the master package altogether if it works.
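Option 2 could be sketched like this (the table name is made up for illustration): the master resets the counter before the children start, and each child grabs the next number to decide which proc to run.

```sql
-- One-time setup (hypothetical table):
-- CREATE TABLE dbo.RunCounter (Id INT IDENTITY(1,1) PRIMARY KEY);

-- Master package, before the children start:
TRUNCATE TABLE dbo.RunCounter;   -- truncating also resets the identity to its seed

-- Child package, in an Execute SQL task (ResultSet = Single row):
INSERT INTO dbo.RunCounter DEFAULT VALUES;
SELECT SCOPE_IDENTITY() AS RunNumber;   -- 1, 2 or 3; map to a proc name in an expression
```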
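Option 3 works because msdb.dbo.sp_start_job returns as soon as the job is queued rather than waiting for it to finish, so a "master" job (or a single T-SQL step) can fan the three jobs out in parallel. The job names here are made up:

```sql
-- sp_start_job is asynchronous: each call returns once the job is queued,
-- so all three extract jobs end up running at the same time.
EXEC msdb.dbo.sp_start_job @job_name = N'Extract - PB_DAILY';
EXEC msdb.dbo.sp_start_job @job_name = N'Extract - HB_DAILY';
EXEC msdb.dbo.sp_start_job @job_name = N'Extract - HB_ACCOUNTS';
```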

  • Thanks for all the tips. But now that I am running the job daily, I can see that it needs to be run serially, since some of the procedures I was given take so long to complete; I don't want to drag down the server resources with this one job.

    Hopefully we get a dedicated 2012 job server soon because I am looking forward to developing SSIS packages with the new 2012 Data tools.

