SQLServerCentral Editorial

Don't Let Corner Cases Drive Your Design


If you graph compute/query cost against the size of data, you get four quadrants:

  1. small data, small compute (most CRUD app queries)
  2. small data, big compute (complex BI queries for this quarter, most reporting)
  3. big data, small compute (logs, audit data)
  4. big data, big compute (complex BI queries across all our data)

If you examine the costs here, 1 is the cheapest, with 2 and 3 having a similar cost. Number 4 is expensive, and it's why we often have big boxes running our database server software. However, where is most of our work? The majority is in quadrant 1, with 2 getting the second most action. Work in quadrant 3 might be rare, as is work in quadrant 4, but we often design for 4. We have to, as we don't want phone calls, ever. What we want is to provision a system large enough that we don't hear many complaints about performance. On premises, many of us have over-provisioned systems to handle the peak load and avoid those phone calls.

Can we handle the peaks, or the things someone decides are especially important? Everyone thinks their workload is important, and it is. To them. However, there are plenty of cases where someone could design for specific types of workloads rather than just aiming for quadrant 4. I've got an image of different types of workloads that I grabbed from the Small Data 2025 conference. For example, if I am working with things like time series data or streaming analytics, I might not need huge compute. I might be storing a lot of data, and I need space, but the compute is low. The analysis of that data, however, might be compute intensive.

This is one reason why we might separate analytic systems out, as they often fall into quadrants 2 and 4, and why we might want serverless or scale-up/down systems to handle the rare cases and get a real cost for them. I found it particularly interesting that the Bronze tier might be where we have big data and big compute, but once we've moved to Silver or Gold, we might have lower compute and data requirements. This makes sense, as Bronze is more of a staging layer, but it is a good reason why we might aim for a Gold layer in our organization and only keep that data for the long term; it's more cost-effective.

Often, for simplicity, we build a bigger system for all types of queries. In other words, we are letting corner cases drive our design. That might be required, but it might not be. When cost is a concern, especially in the cloud, designing systems with appropriate resource usage might override the analyst's desire for queries across all data to run as quickly as order lookups in an OLTP system. This is even more true if we can predict some patterns in our workloads during system design. We can't scale up or down instantly, but in a lot of places, I wish I had been able to scale financial or reporting systems up for a few days as we close out the period and scale them down for the rest of the month.
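As a rough sketch of what that could look like on Azure SQL Database (the database name and service objectives below are made up for illustration; the tiers available to you will vary), the scale-up and scale-down can each be a single T-SQL statement run on a schedule:

  -- Scale up ahead of the month-end close (hypothetical database and tiers)
  ALTER DATABASE [FinanceReporting]
      MODIFY (EDITION = 'Premium', SERVICE_OBJECTIVE = 'P4');

  -- Check that the new service objective has taken effect
  SELECT DATABASEPROPERTYEX('FinanceReporting', 'ServiceObjective');

  -- Scale back down once the close is finished
  ALTER DATABASE [FinanceReporting]
      MODIFY (EDITION = 'Standard', SERVICE_OBJECTIVE = 'S3');

The change is applied asynchronously, so you'd kick off the scale-up far enough ahead of the close that it completes before the heavy reporting starts.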

When building a system, think about the practical nature of your requirements and assign a cost to them. Let users know what workload you're building the system to handle and set expectations on performance and cost. If you do that, you can let others decide when to handle corner cases and when not to. That's often a much easier conversation when we have cost numbers to help customers understand the implications of their requests.
