Thanks for your insight, Grant. Yes, regarding the CXPACKET waits, I am looking into parallelism. Right now the queries (INSERTs) run across 16 threads, and some of those threads are accumulating CXPACKET waits. I will have to see whether I can rewrite the queries so that SQL Server distributes the work more evenly across the threads. The other option would be to reduce MAXDOP to 3, then 2, then 1, and observe the performance at each setting.
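A rough sketch of how the MAXDOP test could be run is below. The table and column names (`dbo.TargetTable`, `dbo.SourceTable`, `Col1`, `Col2`) are hypothetical placeholders, not the real schema, and the value 2 is just one of the settings to try. The query hint only affects the one statement, while the `sp_configure` change applies to the whole instance, so the hint is probably the safer way to experiment.

```sql
-- Check accumulated CXPACKET waits before and after each test run.
SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'CXPACKET';

-- Per-statement cap: limits parallelism for this query only.
-- Table and column names are hypothetical placeholders.
INSERT INTO dbo.TargetTable (Col1, Col2)
SELECT Col1, Col2
FROM dbo.SourceTable
OPTION (MAXDOP 2);  -- repeat with 3, 2, and 1 and compare timings

-- Instance-wide cap: affects every query on the server, so use with care.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 2;
RECONFIGURE;
```

Since `sys.dm_os_wait_stats` is cumulative since the last restart (or since the stats were cleared), the before and after numbers need to be compared as a difference rather than read on their own.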
I am going to talk to the server team about the latches.