I enjoyed this article immensely and look forward to the next in the series.
I thought initially that "well vexed in Java" was a typo, but on reflection I think you are correct. Certainly when you compare Java to Python or Scala, it seems like a lot of trouble.
YARN (Yet Another Resource Negotiator) is quite important because it allows workloads to be protected from other resource-hungry processes. YARN node labels allow certain worker nodes to be dedicated to particular apps. This is very important when an application is licensed on the number of worker nodes in the cluster dedicated to it.
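For anyone who has not set up node labels before, here is a rough sketch of the two `yarn rmadmin` commands involved, built as strings in Python so they can be reviewed before being run. The label name `licensed_app` and the host names are made up for illustration, and the sketch assumes node labels are already enabled on the ResourceManager and mapped to a queue.

```python
import shlex


def node_label_commands(label, hosts, exclusive=True):
    """Build the yarn rmadmin commands that define a label and pin hosts to it.

    The label and hosts are hypothetical; nothing is executed here.
    """
    # An exclusive label means only apps that request it can run on those nodes.
    exclusivity = "true" if exclusive else "false"
    define = [
        "yarn", "rmadmin", "-addToClusterNodeLabels",
        f"{label}(exclusive={exclusivity})",
    ]
    # -replaceLabelsOnNode takes a single space-separated "host=label" argument.
    assign = [
        "yarn", "rmadmin", "-replaceLabelsOnNode",
        " ".join(f"{host}={label}" for host in hosts),
    ]
    return [shlex.join(define), shlex.join(assign)]


cmds = node_label_commands(
    "licensed_app",
    ["worker1.example.com", "worker2.example.com"],
)
for cmd in cmds:
    print(cmd)
```

The point of the exclusive label is that the licensed app only ever lands on the two labelled workers, so those are the only nodes that should count towards a per-node licence.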
Oracle licenses ODI (the Oracle equivalent of SSIS) on the number of cores in the entire cluster, even those not dedicated to data processing.