Which Azure Data Platform service should I choose?

,

So you’ve decided to use Azure for your existing or new data project? This blog series is focused on choosing the right technology for your project. It’s tough right? So many options and so many variables. This post will focus more on the cost of choices.

 

Let’s start by taking a quick overview of the various options and then we’ll dig deeper into each over a series of posts.

*We won’t be discussing MySQL, PostgreSQL, Cosmos, or Azure Synapse Analytics in this post.

PaaS Solutions

What is PaaS? PaaS stands for Platform as a Service; but, what does that mean? In short it means that the cloud provider will maintain the hardware, OS, patching, and even backups. This could be a huge plus but in some cases could also be a negative. For example, if you’re a long time DBA you may want to handle some of these things yourself but if you have limited database experience then PaaS may be the best choice.

Azure SQL Database (SQL Server, MySQL, & PostgreSQL)

Azure SQL Database is a great product, especially for new development. Why new dev? Because this service has some limitations, such as: no cross-database transactions. The cool thing about Azure SQL Database is that you can easily scale up or down and can be set up in an elastic pool to share resources.

https://docs.microsoft.com/azure/sql-database/sql-database-get-started-portal?WT.mc_id=gallery

Azure SQL Database has potential to be the most expensive option if your architecture isn’t well suited. Let’s say your current environment has a database per client and you have 500 clients.  Using the Azure Calculator you can see how this could get expense quickly.

https://azure.microsoft.com/en-us/pricing/calculator/

image

Provisioned resources are quite expensive. With the minimum of 2 vCores the cost for 500 databases would be north of $200,000 per month. That’s if you need provisioned. There’s always the option to use the DTU model which is much less expensive. Azure also offers a serverless option which “could” greatly reduce costs but that’s only if you’re apps allow the database to spin-down due to lack of use.

Azure Managed Instances

Azure Managed Instances are a really cool service which are much closer to on-prem SQL which can help when you want to migrate existing applications to a cloud native solution but have been limited by the features in Azure SQL Database.

There are so many more features for supported by Azure Managed Instances than SQL DB. For a list of features supported by SQL DB and MI look here: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-features

I’ve heard many people describe Azure Managed Instances as expensive, but I feel like they can help control costs when you have hundreds or thousands of databases. One of the main drawbacks is the “100 database limitation”. That’s right each MI can only support 100 databases. Using the same scenario above we’d need 5. Sadly the Azure Calculator doesn’t have Managed Instances; so, you’ll need to use your portal to check the pricing. Looking at my pay-as-you-go portal you can see the following:

image

If you needed 5 instances this would cost about $8500/mo. Is that correct though? How do you know what the right size of the instance is? Well, I don’t know for your apps. This is something you’ll need to figure out. I’ll be discussing this topic in a new post so stay tuned. For a quick 10,000’ view, simply collect perf metrics from your app and see what the high and average points are to right size the environment.

One other consideration for Managed Instances is that they can be more difficult to work with due to how the network is setup. For example, Azure Data Factory requires a self-hosted integration runtime to work with an MI.

IaaS Solutions (Lift and Shift)

There’s a reason why the “lift and shift” approach is the most common first step to the cloud for existing databases and applications. What is the reason? It’s that it’s often the most simple and least risk. Simply pickup what you have today and move it to the cloud. Where does it stand cost-wise? That all depends. It could be considerably cheaper or considerably more expensive. Let’s say the 500 databases are all hosted on a single server in your datacenter. What size is needed? If you need a huge VM then the answer is it’ll probably cost more because you’ll want to create some form of H/A like a cluster which would require a load balancer and at least two nodes. If you don’t need many cores but need a lot of memory then IaaS may actually be much better than managed instances. Let’s take a look at the limitations of RAM for Azure Managed Instances:

image

https://docs.microsoft.com/en-us/azure/sql-database/sql-database-managed-instance-resource-limits

The above table describes 5.1 GB of RAM per core. If all you need is 8 cores for the 500 databases a Managed Instance with 8 vCores would only have 40 GB of RAM, ouch! But if you can handle only 40 GB per MI maybe you’re OK, after all 40×5 = 200 GB total. A Single E series (E32-8s v3) with 8 vCores has 256 GB of RAM and costs about $5,000/mo without any hybrid benefits. With a VM you’ll also need storage. If you bring your own license then IaaS is by far the least expensive. Managed Instances can get quite expensive when adding vCores.

image

A few things to keep in mind when thinking on an FCI:

  1. The more databases you add to the FCI the less expensive the solution costs per database. Adding more databases to a Managed Instance solution means adding more Managed Instances, hence increasing monthly cost.
  2. If you don’t have an existing license and plan to install solely with the PAYG model then you may be shocked to find that the license on the passive node will need to be paid as well. That’s right, PAYG does not cover the cost of the inactive cluster node for a SQL Server FCI (Failover Cluster Instance).
  3. You do all the patching, backups, and other maintenance.

Summary

Azure offers up a lot of options for migrating your SQL Server. I’ll go into greater detail on each in future posts so be sure to stay tuned. As for this high level overview you can see that there’s a lot of thought that needs to go into the decision and it all boils down to what’s right for you. Is cost a concern? Do you have a single multi-tenant database or hundreds of individual databases? Can the limits of Azure SQL Database support your solution and does it make sense to modify the application if it does not? Does your solution need a lot more RAM than vCores? These are all questions you’ll need to answer before you make the decision.

If you have an application architecture that does not allow for multi-tenant databases then Azure SQL Database is probably not the right choice. I’d like to say that the graph below is an extreme example but considering each SQL DB is only configured for 50 DTU’s it’s really not as extreme as you may think for a production database scenario and you may argue that $73 a client is low enough but 50 DTUs most likely won’t support all your database needs. On the other end a small E32-8s may not be enough to support 2000 databases; but, it does show that the more databases you can stuff into a single VM the more potential savings you can get.

image

Questions? Email me. Also, stay tuned for a deeper dive on this topic soon.

Original post (opens in new tab)

Rate

Share

Share

Rate