This editorial was originally published on 14 Jul 2016. It is being republished as Steve is on vacation.
I was listening to the SQL Data Partners podcast the other day with BI expert, Tim Mitchell, and the opening question was "Is the on-premises Data Warehouse dead?" Tim is a friend, so I tuned in knowing he has some good thoughts on the topic. It's an interesting listen, and one you might enjoy if you're at all interested in data warehousing and related topics. Spoiler alert, Tim says no, on-premise isn't dead, but he does point out some interesting things about the Azure SQL Data Warehouse (ASDW) and similar offerings.
One of the more interesting comments Tim made was about a health care company he worked for. They had an end of month process that heavily taxed their systems. If they didn't need that peak level of processing, Tim noted that the cost of their large data warehouse architecture would be that halved. That need to scale up dramatically can be a big savings in moving to a cloud based system, where you can pay for a much lower level of performance most of the time and increase your scale at particular times.
I know this is feasible as I worked with a similar situation. My employer purchased a very large system for our end of month and end of quarter closing load. Fortunately, we had an AIX machine that contained its own hypervisor. At the time (2001), we purchased a 32 processor server, with the idea that only 18 CPUs (and a slice of RAM) were running our financial systems most of the month. We had QA, development, and other guests on the same hardware. During the month closing, we would shut down some VMs and dedicate most of the processors to the finance system for a few days to handle the load. What's more, the IBM machine actually contained additional CPUs that we could "rent" from IBM for a few hours if we really needed them.
That's what data warehouses in the cloud can do for you. Certainly the decision to move to a cloud architecture is more complex than just having the scale up power of ASDW or Amazon's Redshift. The ability to load into the system, the development challenges, the tax implications, and more will impact the decision. I think the workload characteristics are also important. If you don't have a highly variable, or large peak, workload, then the cloud might make less sense. If you don't have any sort of data center, then maybe the cloud makes more sense.
I do think, however, that the decision to implement a new data warehouse isn't a simple one, and the cloud is a viable choice. The platforms are becoming more capable all the time, with more tools and scale options, as well as better performance guarantees. Many of the tools used to analyze data in a warehouse are more important than the underlying platform, with Excel, Tableau, Power BI, and more easily connecting to any data warehouse platform, in the cloud or on-premise.
This means that we will end up managing more disparate systems over time, especially in larger organizations where some groups will adopt cloud systems while others stick with on-premise installations. Certainly if you are a person that works with a data warehouse, you might want to build a small POC on Azure SQL Data Warehouse and see what you think about its capabilities. At least then you'll be able to add some educated and intelligent thoughts to the discussion when the question comes up inside your organization.