• I admire all the work that went into this. But now that PolyBase is part of SQL Server, why not connect to Hadoop directly from SQL Server? We have found it much more efficient than Sqoop for transferring data, and it lets you use most, if not all, of the existing T-SQL constructs. You don't have to know Linux, only where on the cluster the data is located. Another benefit is that PolyBase generates the MapReduce jobs needed for the query and runs them on Hadoop, bringing back only the resulting data.
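
To make that concrete, here is a rough sketch of what the PolyBase setup can look like. It assumes PolyBase is installed and the 'hadoop connectivity' option is already configured for your distribution. The host names, ports, HDFS path, and the table and column names are all made up for illustration, so substitute your own.

```sql
-- Hypothetical names throughout: HadoopCluster, CsvFormat, dbo.WebLogs,
-- namenode.example.com, and /data/weblogs/ are placeholders, not real objects.

-- Point SQL Server at the cluster. RESOURCE_MANAGER_LOCATION is what lets
-- PolyBase push computation down to Hadoop as MapReduce jobs.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.example.com:8020',            -- assumed NameNode address
    RESOURCE_MANAGER_LOCATION = 'namenode.example.com:8050'   -- assumed YARN address
);

-- Describe how the files are laid out (here: comma-delimited text).
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- Map a table definition onto the HDFS directory. All you need to know
-- is where on the cluster the data lives.
CREATE EXTERNAL TABLE dbo.WebLogs (
    LogDate DATE,
    Url     NVARCHAR(400),
    Hits    INT
)
WITH (
    LOCATION = '/data/weblogs/',
    DATA_SOURCE = HadoopCluster,
    FILE_FORMAT = CsvFormat
);

-- Query it with ordinary T-SQL. The hint asks PolyBase to do the filtering
-- and aggregation on Hadoop and return only the result rows.
SELECT Url, SUM(Hits) AS TotalHits
FROM dbo.WebLogs
WHERE LogDate >= '2016-01-01'
GROUP BY Url
OPTION (FORCE EXTERNALPUSHDOWN);
```

Without the hint, PolyBase decides for itself whether pushing the work to Hadoop is worthwhile, so the hint is optional.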