What is the best way to execute ~100 I/O bound scripts?

  • As the title suggests, I am looking into ways of executing a lot of scripts as fast as possible. This is something I’ve never done before, so I’m not entirely sure how to approach it just yet.

    Originally, I thought of setting up a CloudFormation template with the same number of instances as there are scripts, but that just seems very inefficient.

    Then I thought of setting up an instance with multiprocessing and a queue that the instance can pull the scripts from. But then again, I’d be limited to the cores available, so I’d have to set up multiple EC2 instances and have them all pull from the same queue, which still doesn’t seem like the best choice available.

    Does anyone here have an idea of how to do this?

  • What do the scripts actually do? The reason I ask is that if there's contention, you could end up with quite a bit of blocking that would make this exercise basically single-threaded no matter how many cores/CPUs you throw at it.
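
    To make the contention point concrete, here is a minimal Python sketch. It assumes the "scripts" are stand-in workers that each sleep briefly, and that the contended resource is modeled by a single shared lock; the name peak_concurrency is made up for illustration. It reports how many workers were ever running at the same moment.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from contextlib import nullcontext


def peak_concurrency(n_workers, contended, work_seconds=0.2):
    """Run n_workers fake scripts and return the highest number seen running at once."""
    # Assumption: one lock stands in for whatever resource the scripts fight over.
    shared = threading.Lock() if contended else nullcontext()
    counter_lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker(_):
        with shared:  # every worker must hold the shared resource to do its work
            with counter_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(work_seconds)  # placeholder for the real I/O
            with counter_lock:
                state["active"] -= 1

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(worker, range(n_workers)))
    return state["peak"]


if __name__ == "__main__":
    print("peak with contention:   ", peak_concurrency(8, contended=True))
    print("peak without contention:", peak_concurrency(8, contended=False))
```

    With the shared lock in place, only one worker can be inside the working section at a time, so adding workers does not shorten the total run.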

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If the question is how to run a bunch of scripts at once, I'd suggest taking a look at PowerShell threading. It's a pretty easy way to launch a whole slew of processes all at the same time.
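
    For comparison, the same "launch a whole slew of processes" idea can be sketched in Python, the language mentioned elsewhere in this thread. This is only a rough sketch under assumptions: each script is modeled as a command line, the helper names run_one and run_all are hypothetical, and the pool size of 32 is an arbitrary starting point. Because each thread spends its time waiting on a child process, the pool can reasonably be larger than the number of cores when the scripts are I/O bound.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run a single command and return it together with its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return command, completed.returncode


def run_all(commands, max_workers=32):
    """Run all commands, at most max_workers at a time, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder commands; real script paths would go here instead.
    commands = [[sys.executable, "-c", f"print({i})"] for i in range(100)]
    results = run_all(commands)
    failed = [cmd for cmd, code in results if code != 0]
    print(f"{len(results)} finished, {len(failed)} failed")
```

    Whether this is actually faster still depends on what the scripts do, as asked above.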

    Otherwise, are you asking how to tune the queries or how to set your hardware to overcome I/O issues? As Jeff says, what are those scripts doing?

    Sorry, but the question is a little unclear.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • This was removed by the editor as SPAM

  • pokalsing wrote:

    I got this,..

    Excellent... please share what you've got.

    --Jeff Moden



  • Depends on what the scripts are. I once set up an AWS Step Function to execute multiple AWS Lambdas that were essentially Python apps, each executing multiple Python scripts at once. AWS Step Functions allows for parallel execution using its mapping feature (the Map state). One AWS Lambda can take an input and output a list (or array) of values, ['a', 'b', 'c', 'd', etc.], and that list is mapped to N AWS Lambda invocations, so each item is processed by its own Lambda at the same time instead of one after another.
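
    The shape of that fan-out can be sketched locally in Python. This is not AWS code and not the setup described above: the two handlers only follow the usual (event, context) Lambda signature, their names and the "items" key are hypothetical, and a thread pool stands in for the Step Functions mapping step.

```python
from concurrent.futures import ThreadPoolExecutor


def list_items_handler(event, context):
    """First Lambda (hypothetical): turn the input into a list of work items."""
    return list(event["items"])


def process_item_handler(event, context):
    """Second Lambda (hypothetical): handle exactly one item from the list."""
    # Placeholder work; a real handler would run one script here.
    return {"item": event, "result": event.upper()}


def run_map_locally(event, max_concurrency=10):
    """Local stand-in for the mapping step: one handler call per list item."""
    items = list_items_handler(event, None)
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(lambda item: process_item_handler(item, None), items))


if __name__ == "__main__":
    for row in run_map_locally({"items": ["a", "b", "c", "d"]}):
        print(row)
```

    In the real service the per-item calls would be separate Lambda invocations rather than threads, so the concurrency is not tied to the cores of a single machine.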
