• A couple of things you can do to enhance performance on the data load:
    • Break the files into 60 chunks to coincide with the 60 processes that do the data loading. I'm told PolyBase does this splitting internally now, but our testing still shows pre-splitting the files as superior.
    • Try to avoid compressed files, although that has implications for the upload itself, since uncompressed files are larger and take longer to transfer.
    • Since you're using PolyBase, you're already on the fastest load method, and the one that runs in parallel.
    • Make sure you're loading the table with the optimal distribution, round-robin or hash, out of the gate. If the table isn't used in joins, or it's a staging or temp table, round-robin is best. If you don't have a good hash key that will distribute the data evenly, I'd also suggest round-robin (see the sketch after this list).
    • Avoid small data loads, because those rows land in the delta store first. You want the load to arrive in chunks of at least 102,400 rows per distribution, or roughly 6.144 million rows total, since there are 60 distributions under the covers. The advice we've had from Microsoft is to do big loads, because the number of readers and writers increases as you scale.
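    To make the distribution choice concrete, here's a minimal CTAS sketch: a round-robin heap for the staging load, then a hash-distributed columnstore table for the final copy. The names (ext.SalesStage, dbo.SalesStage, dbo.FactSales, CustomerKey) are hypothetical, and it assumes you've already defined the PolyBase external table over your source files.

    -- Staging load: ROUND_ROBIN spreads rows evenly with no hash key needed,
    -- and a HEAP avoids columnstore/index maintenance during the load itself.
    CREATE TABLE dbo.SalesStage
    WITH
    (
        DISTRIBUTION = ROUND_ROBIN,
        HEAP
    )
    AS
    SELECT *
    FROM ext.SalesStage;   -- hypothetical PolyBase external table over the files

    -- Final table: hash-distribute on a column that spreads rows evenly
    -- (e.g. a customer or order key, not a low-cardinality status code),
    -- so joins on that key stay local to each distribution.
    CREATE TABLE dbo.FactSales
    WITH
    (
        DISTRIBUTION = HASH(CustomerKey),
        CLUSTERED COLUMNSTORE INDEX
    )
    AS
    SELECT *
    FROM dbo.SalesStage;

    The same sizing rule from the last bullet applies to the columnstore table: feed it batches of at least 102,400 rows per distribution so the rows compress straight into rowgroups instead of sitting in the delta store.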

    That's about all I've got from my notes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning