Distributed Writes

In SQL Server, scaling out reads (e.g. using active secondary replicas via AlwaysOn Availability Groups) is a lot easier than scaling out writes.  So what are your options when you have a tremendous volume of writes that scaling up will not handle, no matter how big your server is?  There are a number of options that allow you to write to many servers (instead of writing to one master server) that I'll call distributed writes.  Here are some ideas:

  • Peer-to-peer SQL Server replication
  • Sharding the data across multiple SQL Server databases
  • Putting a queueing mechanism in front of SQL Server to buffer the writes
  • Azure Cosmos DB

The one option among those above that does not require coding and can support a very large number of writes per second is Azure Cosmos DB.  The other options either require significant coding, can only handle a limited number of writes per second, or both.  This is because Cosmos DB stores data as documents (JSON) that contain all the information they need, so no joins are required and documents can be spread across multiple servers (see Partition and scale in Azure Cosmos DB and A technical overview of Azure Cosmos DB).  Relational databases, by contrast, use multiple tables that must be joined; if those tables live on different nodes, the joins cause a lot of data shuffling between nodes, which hurts performance.
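
To make the contrast concrete, here is a toy Python sketch (a hypothetical order document and a stand-in hash function, not the actual Cosmos DB partitioning algorithm) showing a denormalized document that needs no joins and how a partition key alone decides which server owns it:

```python
import hashlib

# A hypothetical, denormalized order document: the customer and line items
# are embedded, so reading or writing the order never requires a join
# across tables (or across nodes).
order = {
    "id": "order-1001",
    "customerId": "cust-42",  # also used as the partition key
    "customer": {"name": "Contoso", "country": "US"},
    "items": [
        {"sku": "widget-a", "qty": 3, "price": 9.99},
        {"sku": "widget-b", "qty": 1, "price": 24.50},
    ],
    "total": 54.47,
}

def partition_for(partition_key: str, partition_count: int) -> int:
    """Toy stand-in for hash partitioning: the partition key alone
    determines which physical partition (server) owns the document."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % partition_count

print(partition_for(order["customerId"], partition_count=8))
```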

To go into greater detail on the benefits of Cosmos DB over SQL Server for distributed writes:

  • Consistency
    • Peer2Peer SQL Replication introduces pains around data consistency and conflict resolution
  • Availability
    • Sharding with SQL introduces pains around maintaining availability when increasing/decreasing the degree of scale-out.  Frequently, downtime is involved due to the need to re-balance data across shards
    • SQL requires rigid schemas and indexes to be defined upfront.  Every time schema or index updates are needed, you incur the heavy operational cost of running CREATE INDEX and ALTER TABLE scripts across all database shards and replicas.  Furthermore, this introduces availability issues while schemas are being altered.
  • Handling sustained heavy write ingestion
    • Putting a queueing mechanism in front of SQL only gives you a buffer for handling spikes in writes, but at the end of the day the database itself needs to support sustained heavy write ingestion in order to consume the buffered events.  What happens if events come into the buffer faster than you drain it?  You need a database specifically designed for heavy write ingestion (see the sketch after this list)
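
As a rough illustration of that last point, here is a toy Python sketch (made-up rates, no real queueing product) of how a buffer's backlog grows whenever events arrive faster than the database can drain them:

```python
# Hypothetical sustained load: events arrive faster than the database commits.
ingest_rate = 50_000  # events/second arriving at the queue (assumed)
drain_rate = 30_000   # events/second the database can durably commit (assumed)
backlog = 0

for second in range(1, 11):
    backlog += ingest_rate - drain_rate  # the buffer only delays the problem
    print(f"t={second:>2}s  backlog={backlog:,} events")

# After 10 seconds the backlog is 200,000 events and still climbing; only a
# database that can sustain the ingest rate itself resolves this.
```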

Azure Cosmos DB solves these by:

  • Providing five well-defined consistency models so developers can tune the right consistency vs. performance tradeoff for their scenario (a minimal SDK sketch follows this list)
  • Scaling on demand and supporting a flexible data model while maintaining high availability (99.99% availability SLA); scale-out and partition management are handled by the service on behalf of the user
  • Using log-structured techniques to be a truly latch-free database that sustains heavy write ingestion with durable persistence
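
As a minimal sketch of what this looks like from the application side, the following uses the azure-cosmos Python SDK; the endpoint, key, database, container, and partition key path are placeholders, and the consistency level and throughput values are illustrative assumptions only:

```python
from azure.cosmos import CosmosClient, PartitionKey

ENDPOINT = "https://<your-account>.documents.azure.com:443/"  # placeholder
KEY = "<your-key>"                                            # placeholder

# Choose one of the five consistency levels per your consistency vs.
# performance tradeoff ("Strong", "BoundedStaleness", "Session",
# "ConsistentPrefix", or "Eventual").
client = CosmosClient(ENDPOINT, credential=KEY, consistency_level="Session")

# The service handles partition management; the application only chooses a
# partition key path and the provisioned throughput.
database = client.create_database_if_not_exists(id="appdb")
container = database.create_container_if_not_exists(
    id="orders",
    partition_key=PartitionKey(path="/customerId"),
    offer_throughput=4000,  # illustrative RU/s value
)

# Writes are routed to the owning physical partition by the partition key.
container.upsert_item({"id": "order-1001", "customerId": "cust-42", "total": 54.47})
```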

In the end, eliminating schemas, index management, and JOINs is a necessary byproduct of the scale-out that Azure Cosmos DB provides.

After I initially published this post, I received the question “Why not just use SQL 2016 In-Memory tables for heavy write systems (info)?” and got a great reply from a Cosmos DB product manager:

SQL in-memory is only viable when:

  • Request and data volume are small enough to fit on a single machine.  You still have the fundamental problem of hard limits due to the scale-up model.
  • The scenario does not need durability, reliability, or availability, which are requirements for >99% of mission-critical customer scenarios.

Durability

  • If data is kept only in memory, you experience data loss upon any intermittent failure that requires the computer to restart (e.g. an OS crash, a power outage, or the OS rebooting to apply an update).  For data to be durable, it needs to be persisted to disk, and to offer resiliency against disk failures it needs to be replicated across a number of disks
  • For durable scenarios, memory only acts as a buffer to absorb spikes.  To achieve sustained write ingestion, you need to flush the buffer to disk as fast as you fill it, so you have a bottleneck on disk I/O unless you scale out (see the sketch after this list)
  • This is why they immediately have to clarify that this is for “applications where durability is not required”; durability is a requirement for >99% of data scenarios
  • Data loss and data corruption should be treated as cardinal sins for any data product
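
Here is a minimal Python sketch of that flush-to-disk step (a toy append-only log, not how any particular engine implements it): a write counts as durable only after it is flushed and fsynced, which is exactly where disk I/O becomes the bottleneck.

```python
import os

# Toy append-only log: the write is acknowledged only after it is flushed and
# fsynced to stable storage, so memory is just a staging buffer and sustained
# ingestion is bounded by disk I/O unless you scale out.
def durable_append(log_path: str, record: bytes) -> None:
    with open(log_path, "ab") as log:
        log.write(record + b"\n")
        log.flush()
        os.fsync(log.fileno())  # data survives a crash only after this returns
    # A real system would also replicate the record to other disks/replicas
    # before acknowledging, to tolerate disk failures.

durable_append("/tmp/writes.log", b'{"id": "order-1001", "total": 54.47}')
```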

Scale

  • This is still a scale-up model, which comes with many hard limits
  • What happens when the data volume doesn’t fit in memory (which tends to be very small relative to disk storage)?  You need to scale out
  • What happens when the request volume exceeds what memory bandwidth can serve?  You need to scale out
  • This is why the throughput numbers in that blog are orders of magnitude smaller than what customers are doing every day on Cosmos DB, and why storage size is quietly ignored

Expensive

  • Memory is roughly 100x more expensive than SSD.  Keeping data on SSD in a scale-out system yields not only better scale and durability characteristics but also much lower costs for any large-scale scenario

More info:

Database Sharding
