• Parallelism inside a single query comes with its own overhead. Here you are instead running two completely separate processes at the same time, which avoids that overhead while still keeping two threads busy. If you want to use more threads, just run more processes simultaneously.

    The flaw in this logic is that if one large table accounts for 90% of your run time and is normally processed using 4 threads, the entire process could actually take longer once that table ends up with fewer threads. Nothing stops you from handling that big table separately and sending the rest of your tables through Service Broker. However, if you don't lower MAXDOP and you have 5 iterations running at once, you could easily end up with 20 threads competing with each other, and with every other process on the server, for CPU time.
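
The arithmetic behind both warnings can be sketched with a few lines of Python. This is a deliberately simplified model, not a description of how SQL Server schedules work: it assumes each running query uses exactly as many threads as its degree of parallelism, and that the big table scales perfectly linearly with thread count. The function names and the 100-minute total are hypothetical and chosen only to match the figures above (90% of run time, 4 threads, 5 iterations).

```python
def thread_demand(concurrent_queries: int, maxdop: int) -> int:
    """Threads requested if every concurrent query runs at the given DOP.

    Simplifying assumption: one query uses exactly `maxdop` threads.
    """
    return concurrent_queries * maxdop


def elapsed_at_dop(time_at_dop: float, dop: int, new_dop: int) -> float:
    """Elapsed time for the same work at a different DOP.

    Simplifying assumption: perfect linear scaling, so total work
    (time * dop) stays constant. Real queries scale worse than this.
    """
    return time_at_dop * dop / new_dop


if __name__ == "__main__":
    # Hypothetical 100-minute run: the big table takes 90 minutes at DOP 4.
    big_table_single_thread = elapsed_at_dop(90, dop=4, new_dop=1)
    print(big_table_single_thread)  # 360.0 minutes, longer than the whole original run

    # 5 iterations at once, each still allowed 4 threads.
    print(thread_demand(5, 4))  # 20 threads competing for CPU
```

Under these assumptions, the big table alone takes 360 minutes on one thread, against 100 minutes for the whole original run, which is why it may be worth handling separately. Leaving each of 5 concurrent iterations at 4 threads asks for 20 threads, which is a problem on any server with fewer cores than that.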