Can you give a few more details on how the tables are partitioned?
And have you tried single threading this process?
We used a mix of parallel and single-threaded loading in ours.
Performance was still very good, although a lot of that depends on the overall design.