A while back I was asked to look at the salability of our database environment, and what could be done to achieve a 'doubling in capacity' for what was going to be a busy trading period. [For info : the db was supporting a load-balanced server farm of web servers (providing both UI and web services) - daily traffic was expected to hit the 1 million mark]
Here is a summary of what my research found....
Scalability of SQL server can be achieved 2 ways - either scaling up or scaling out. Scaling up involves increasing the performance of a single server, i.e. adding processors, replacing older processors with faster processors, adding memory etc.., and is essentially what can be achieved by replacing existing hardware with a newer, bigger, faster machine. Scaling up works in the short term as long as the budget allows, however there are ultimately levels of volume that one server cannot handle.
Scaling out is to distribute the load across many (potentially lower cost) servers - this concept works for the infinite long-term, and is constrained by the amount of physical space, power supply, etc..., far less than the constraints of budget and individual server performance.
There are 2 approaches to scaling out:
- multiple servers that appear to the web site as a single database - known as federated databases - the workload is then spread across all servers in the federation. This can be implemented in SQL Server by using Distributed Partitioned Views.
- multiple servers that appear as individual servers, however the workload is load balanced or the servers are dedicated to a particular function or work stream (e.g. a set of 1 or more web servers)
The main concern at the heart of any scaled-out SQL solution, is how to ensure that data is consistent across all the database servers. With option 1, the data is consistent as there is only one database, and data is spread rather than replicated across each server. The main downfall of this approach is the poor availability - if one of the federated databases fails, the whole federation fails.
With option 2, some data needs to be replicated from all servers to all servers. There are many methods of implementing such an architecture, however it seems to be clear that this general approach is the best, long-term, for scalability in this environment. Some data can differ across each database server as long as the user remains 'sticky' to that database (for example, session information), investigations started into this idea, using the web servers to remain 'sticky' to a particular database. [The idea was to split the load-balanced web farm over 2 database servers and issue a cookie from the Content Services Switch to keep the users on one half of the web farm - we never actually got this to work successfully]
It is worth pointing out at this stage a common misunderstanding about clustering SQL servers using Microsoft Cluster Services. Clustering itself does NOT provide scalability - it only provides high-availability. The only true way to scale a clustered SQL solution is to scale up - i.e. increase the power of each server in the cluster.