How to do Parallel Query Execution in SQL Server

  • Hey Guys,

    I stuck in a situation where I need to do parallel maintenance on few tables, Can someone please suggest a way to do it
    Example: There are 50 table name in a tables XYZ, I need to start index rebuild at 5 tables simultaneously in diff-2 sessions.
    Session 1 - rebuild index at Table A
    Session 2 - rebuild index at Table B
    Session 3 - rebuild index at Table C
    Session 4 - rebuild index at Table D
    Session 5 - rebuild index at Table E

  • anujkumar.mca - Tuesday, December 26, 2017 6:57 AM

    Hey Guys,

    I stuck in a situation where I need to do parallel maintenance on few tables, Can someone please suggest a way to do it
    Example: There are 50 table name in a tables XYZ, I need to start index rebuild at 5 tables simultaneously in diff-2 sessions.
    Session 1 - rebuild index at Table A
    Session 2 - rebuild index at Table B
    Session 3 - rebuild index at Table C
    Session 4 - rebuild index at Table D
    Session 5 - rebuild index at Table E

    To be honest, I think this is a plan for doing things more slowly unless you can guarantee that 5 different spindles (1 for each table) will come into play.  "Going Parallel" is NOT a panacea for performance.  You do have to consider the capacity of the "pipe" and what you're going to do to the read/write heads on the disk(s) involved.

    As for your question, you could build a "maintenance job" that would call all 5 pieces of code in parallel or you could do the same by calling xp_CmdShell to call SQLCMD 5 times using the DOS "START" command.  I still think it's going to be a mistake performance wise unless the tables really are on 5 different sets of spindles.

    As a bit of a sidebar, if you've not carefully planned for the correct FILL FACTOR for each of these indexes, then the rebuilds could put a world of hurt on your system in the form of massive page splits.  Personally, I don't rebuild indexes on a general basis anymore (and haven't since 17 Jan 2016).  The only table that I have rebuilt the indexes on in those two years is one that suffers from regular deletes that don't free up entire pages for reuse.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • anujkumar.mca - Tuesday, December 26, 2017 6:57 AM

    Hey Guys,

    I stuck in a situation where I need to do parallel maintenance on few tables, Can someone please suggest a way to do it
    Example: There are 50 table name in a tables XYZ, I need to start index rebuild at 5 tables simultaneously in diff-2 sessions.
    Session 1 - rebuild index at Table A
    Session 2 - rebuild index at Table B
    Session 3 - rebuild index at Table C
    Session 4 - rebuild index at Table D
    Session 5 - rebuild index at Table E

    Quick thought, use N number of scheduled jobs for this, each job picks one table name from the table and does the maintenance on the selected one.
    😎

  • Jeff Moden - Tuesday, December 26, 2017 7:21 AM

    anujkumar.mca - Tuesday, December 26, 2017 6:57 AM

    Hey Guys,

    I stuck in a situation where I need to do parallel maintenance on few tables, Can someone please suggest a way to do it
    Example: There are 50 table name in a tables XYZ, I need to start index rebuild at 5 tables simultaneously in diff-2 sessions.
    Session 1 - rebuild index at Table A
    Session 2 - rebuild index at Table B
    Session 3 - rebuild index at Table C
    Session 4 - rebuild index at Table D
    Session 5 - rebuild index at Table E

    To be honest, I think this is a plan for doing things more slowly unless you can guarantee that 5 different spindles (1 for each table) will come into play.  "Going Parallel" is NOT a panacea for performance.  You do have to consider the capacity of the "pipe" and what you're going to do to the read/write heads on the disk(s) involved.

    As for your question, you could build a "maintenance job" that would call all 5 pieces of code in parallel or you could do the same by calling xp_CmdShell to call SQLCMD 5 times using the DOS "START" command.  I still think it's going to be a mistake performance wise unless the tables really are on 5 different sets of spindles.

    As a bit of a sidebar, if you've not carefully planned for the correct FILL FACTOR for each of these indexes, then the rebuilds could put a world of hurt on your system in the form of massive page splits.  Personally, I don't rebuild indexes on a general basis anymore (and haven't since 17 Jan 2016).  The only table that I have rebuilt the indexes on in those two years is one that suffers from regular deletes that don't free up entire pages for reuse.

    Good points Jeff, log pressure, IO pressure etc. can throw this approach into the bin of bad performance if the underlying subsystems aren't up to the task. I'd rather go down the road of limiting the need for such operations and more carefully choosing which and when.
    😎

    Came across one such job running for tens of hours, classic random GUID primary key, 100% fill factor, huge table, system slow down, users' frustrations etc. After I corrected it, there have been no complaints and I'm quite certain that the maintenance job hasn't run since.

  • Eirikur Eiriksson - Tuesday, December 26, 2017 7:53 AM

    Jeff Moden - Tuesday, December 26, 2017 7:21 AM

    anujkumar.mca - Tuesday, December 26, 2017 6:57 AM

    Hey Guys,

    I stuck in a situation where I need to do parallel maintenance on few tables, Can someone please suggest a way to do it
    Example: There are 50 table name in a tables XYZ, I need to start index rebuild at 5 tables simultaneously in diff-2 sessions.
    Session 1 - rebuild index at Table A
    Session 2 - rebuild index at Table B
    Session 3 - rebuild index at Table C
    Session 4 - rebuild index at Table D
    Session 5 - rebuild index at Table E

    To be honest, I think this is a plan for doing things more slowly unless you can guarantee that 5 different spindles (1 for each table) will come into play.  "Going Parallel" is NOT a panacea for performance.  You do have to consider the capacity of the "pipe" and what you're going to do to the read/write heads on the disk(s) involved.

    As for your question, you could build a "maintenance job" that would call all 5 pieces of code in parallel or you could do the same by calling xp_CmdShell to call SQLCMD 5 times using the DOS "START" command.  I still think it's going to be a mistake performance wise unless the tables really are on 5 different sets of spindles.

    As a bit of a sidebar, if you've not carefully planned for the correct FILL FACTOR for each of these indexes, then the rebuilds could put a world of hurt on your system in the form of massive page splits.  Personally, I don't rebuild indexes on a general basis anymore (and haven't since 17 Jan 2016).  The only table that I have rebuilt the indexes on in those two years is one that suffers from regular deletes that don't free up entire pages for reuse.

    Good points Jeff, log pressure, IO pressure etc. can throw this approach into the bin of bad performance if the underlying subsystems aren't up to the task. I'd rather go down the road of limiting the need for such operations and more carefully choosing which and when.
    😎

    Came across one such job running for tens of hours, classic random GUID primary key, 100% fill factor, huge table, system slow down, users' frustrations etc. After I corrected it, there have been no complaints and I'm quite certain that the maintenance job hasn't run since.

    Funny you should bring up the problem of having a random GUID PK with a 100% fill factor... I'm doing a study on a year's worth of single inserts (to get the worst case fragmentation possible) where I insert 1,000 rows per simulated hour for a simulated 10 hours and measure all aspects of the state of the CI at the end of every hour for 365 days using different FILL FACTORs, doing a REBUILD, doing a REORGANIZE. etc, etc, and I'm finding that the best thing to do is to do <drumroll please> NOTHING.  Just let the "Natural FILL FACTOR" take over and then leave the CI alone (unless you have a shedload of DELETEs to also contend with).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @jeff - Very valid points,But I want to give a chance to it as my server has high-end configuration and no business activity/user land at weekend on server.
    I want to do update statistics in parallel as it consume approx 60-70% time of my maintenance plan.
    enable the XP_CMDShell at prod shall not be recommended
    @Eirikur - Create multiple jobs is very obvious but I don't want to create bunch of jobs at instance.

  • anujkumar.mca - Tuesday, December 26, 2017 9:13 AM

    enable the XP_CMDShell at prod shall not be recommended

    Heh... I suppose you think that it somehow makes your server more secure to not enable it?  That's a very frequent but incorrect assumption. 😉

    That, not withstanding, you don't need to use xp_CmdShell for this... you could use a CmdExec step or you could do it from a good ol' fashioned batch file.

    Or you could use that "Maintenance Plan" method I previously mentioned to fan out the work.

    Understood on the "gotta try it" thing you speak of.  I agree.  You won't know for sure unless you try.  As a good friend and fellow denizen of this site once told me, "A Developer must not guess... a Developer must KNOW" and trying it is about the only way to know for sure.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • anujkumar.mca - Tuesday, December 26, 2017 9:13 AM

    @jeff - Very valid points,But I want to give a chance to it as my server has high-end configuration and no business activity/user land at weekend on server.
    I want to do update statistics in parallel as it consume approx 60-70% time of my maintenance plan.
    enable the XP_CMDShell at prod shall not be recommended
    @Eirikur - Create multiple jobs is very obvious but I don't want to create bunch of jobs at instance.

    If you just want to try, you can just open up a few tabs in MS, put the commands in there and run them 😛

  • ZZartin - Tuesday, December 26, 2017 11:10 AM

    anujkumar.mca - Tuesday, December 26, 2017 9:13 AM

    @jeff - Very valid points,But I want to give a chance to it as my server has high-end configuration and no business activity/user land at weekend on server.
    I want to do update statistics in parallel as it consume approx 60-70% time of my maintenance plan.
    enable the XP_CMDShell at prod shall not be recommended
    @Eirikur - Create multiple jobs is very obvious but I don't want to create bunch of jobs at instance.

    If you just want to try, you can just open up a few tabs in MS, put the commands in there and run them 😛

    Heh... GREAT suggestion!  That also means you can stop them easily if something goes wrong and you can monitor by known SPID.  This is, after all, a test from what I can see.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply