• Since the last time I read this article I have started working in a big data environment with AWS, Hadoop (EMR), and Vertica.

    As you mentioned, Hadoop is a complementary technology; we found it can be used as a great ETL tool to aggregate 50 columns across hundreds of millions of records and then send the result to Tabular. Doing the same in SQL Server was not practical.

    We also found that, for our purposes, Hadoop is not a great data store. While some tools like Tableau can connect to HDFS and run map-reduce queries against it, the performance is not that great (at least in our case).

    Also note that AWS and MapR do not rely on HDFS (AWS EMR typically reads from S3 via EMRFS, and MapR has its own MapR-FS).

    I also started working with Vertica and I was amazed: it takes many minutes in SQL Server to move half a billion records from one table to another (via BCP out, BCP in), while Vertica does it in seconds. I did not believe it had finished, so I had to run a count to verify that all the records were actually transferred!
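    For context, the SQL Server round trip above is the classic bcp export/import pattern, while Vertica ingests a delimited file with a single COPY statement. A rough sketch of both (the server, database, table, and file names here are made up for illustration, and exact flags depend on your environment):

    ```shell
    # SQL Server: export to a delimited flat file, then re-import it
    # (-c = character mode, -t = field terminator, -T = trusted connection)
    bcp MyDb.dbo.SourceTable out /tmp/rows.dat -c -t '|' -S myserver -T
    bcp MyDb.dbo.TargetTable in  /tmp/rows.dat -c -t '|' -S myserver -T

    # Vertica: bulk-load the same delimited file in one COPY statement
    # (DIRECT writes straight to disk storage, which suits large loads)
    vsql -c "COPY target_table FROM '/tmp/rows.dat' DELIMITER '|' DIRECT;"
    ```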

    Vertica does not have many features, but its LOAD and SELECT commands are blazing fast.