How to performance tune an SSIS pkg when the dev environment isn't as good as prod

  • Hey everyone, my dev environment has 56 GB of memory and 4 virtual processors at 2.29 GHz. My prod environment has 163 GB of memory and 8 virtual processors at 2.29 GHz. I updated a pkg on my dev environment to run multiple data flows in parallel using Modulo, and that pkg now runs in 18 minutes where it previously ran in 43 minutes. But when I run the master pkg, which calls other pkgs that run concurrently, I use up all 56 GB of memory and it bogs down. Is there a way to determine what my prod environment can handle?

  • Since it uses all 56 GB of memory on dev, you would need more than 56 GB free on live.  How much exactly, I am not sure.  But you can get a good estimate by running the different concurrent packages one at a time, measuring each one's memory use, and adding those up.
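    The "run them one at a time and add it up" approach can be sketched as simple arithmetic. A minimal Python sketch, assuming you have already measured each package's peak memory individually on dev (the package names and numbers below are made up for illustration, and the headroom figure is a guess you would tune for your own server):

```python
# Hypothetical per-package peak memory, measured one at a time on dev.
# These names and values are illustrative, not from the original post.
peak_memory_gb = {
    "LoadCustomers": 14.0,
    "LoadOrders": 22.5,
    "LoadInventory": 11.0,
    "LoadFinance": 9.5,
}

# If all packages run concurrently, worst case is the sum of the peaks.
total_needed = sum(peak_memory_gb.values())

prod_total_gb = 163
# Leave headroom for SQL Server, Windows, antivirus, backups, etc.
# (an assumed figure; measure what your own server actually uses).
headroom_gb = 32

fits = total_needed <= prod_total_gb - headroom_gb
print(f"Estimated concurrent need: {total_needed} GB; fits in prod: {fits}")
```

    This is a worst-case estimate: if the packages don't all hit their peak at the same moment, actual concurrent use will be lower.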

    I don't think there is any way to simulate more memory or more CPUs on your dev environment to see how it will perform.  As a rough estimate, since you are doubling the resources available, it should run in half the time or better.  But that is making a LOT of assumptions.  For example, we are assuming that the available memory is double, not just the total.  We are also assuming that the other CPU cores are idle, so doubling them gives you twice the processing capacity.  We are also assuming that your queries can be spread across multiple threads.  And we are assuming that all other running processes (including other SSIS packages), other users, automated jobs, etc. are identical between the two systems.  If, for example, a different SSIS package is running on LIVE at the same time as yours and it eats up 56 GB of memory and pegs 4 of the 8 cores, you are going to get identical performance (or possibly worse) on the live system.

    Plus, it assumes the work is actually fitting in the 56 GB of memory on dev and not paging to disk.  Once paging to disk comes into play, that can cause substantial performance hits.  If you can reduce paging to disk, performance will be better (obviously).  But your packages may need 57 GB of memory or they may need 2 TB.  If they need substantially more memory than is available (such as 2 TB), going from 56 GB to 163 GB will not offer a huge benefit.  It will offer a benefit, but not a substantial one.

    What I would do is determine how much memory each package that runs concurrently uses (for example, if you have 32 packages that each need 2 GB of memory, then as long as you have more than 64 GB free at package start time, performance should be decent).  Once you page to disk, I would expect a 10-times performance drop if not more; depending on your disk (SSD vs 10,000 RPM vs 7,200 RPM vs 5,400 RPM vs flash memory vs USB vs ...), plus all the different RAID setups you can do, which can all affect performance.
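    The back-of-envelope check above can be written out as a tiny function. A sketch, using the post's own example numbers (32 packages at 2 GB each) and its rough 10x paging penalty, which is a ballpark figure, not a measurement:

```python
def estimated_runtime_minutes(base_minutes, packages, gb_per_package,
                              free_gb, paging_penalty=10):
    """Rough runtime estimate: unchanged if the concurrent packages fit
    in free memory, multiplied by a paging penalty if they don't.
    The 10x default penalty is the post's rough guess, not a benchmark."""
    needed_gb = packages * gb_per_package
    if needed_gb <= free_gb:
        return base_minutes
    return base_minutes * paging_penalty

# 32 packages x 2 GB = 64 GB needed:
print(estimated_runtime_minutes(18, 32, 2, free_gb=70))  # fits -> 18 min
print(estimated_runtime_minutes(18, 32, 2, free_gb=56))  # pages -> 180 min
```

    The real penalty depends entirely on the disk subsystem (SSD vs spinning disk vs RAID layout), so treat the multiplier as something to measure, not assume.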

    But to answer what your prod system can handle: it can hold 163 GB of data in memory and have 8 cores doing processing at any given time.  What I mean by that last bit is that if you have 10 SSIS packages all doing calculations at the exact same time, at most 8 of those will be on the CPU in any given cycle and the other 2 (or more) will be queued.  BUT that memory and CPU are shared with everything else on the system.  So if SQL Server is on there, it'll be using memory and CPU.  Windows itself will use CPU and memory.  Antivirus, backups, etc.: anything running on there will be using resources.
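    The core-queueing point above is just ceiling division: with more CPU-bound packages than cores, the extras wait, so wall-clock time stretches by roughly one "wave" per full set of cores. A deliberately simplified sketch (real scheduling interleaves work far more finely than this, so treat it as an upper-bound intuition, not a prediction):

```python
import math

def waves(packages, cores):
    """Number of sequential 'waves' needed if each CPU-bound package
    occupies one core for the whole run (a simplifying assumption)."""
    return math.ceil(packages / cores)

print(waves(10, 8))  # 10 packages on 8 cores -> 2 waves
print(waves(8, 8))   # exactly fills the cores -> 1 wave
```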

    The above is all just my opinion on what you should do.
    As with all advice you find on a random internet forum, you shouldn't blindly follow it.  Always test on a test server to see if there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Congratulations on the improvement you made, but I don't believe your job is complete yet.  My biggest concern would be to find out exactly why a single package is consuming 56 GB, and fix it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
