Technical Article

SaaS and Multi-tenancy Models

,

Background

A few years back I started working with a private practice group of health care providers in a Data Architecture consulting role. They had an excellent dashboard and scorecard system and would rate 3.5 on the BI Maturity Index (Figure 1). One of the ways they continuously improved their business processes was to benchmark and meet with other “peer” private practice provider groups annually. In the course of their knowledge sharing the peers starting wanting the analytics and data that my client had. Thus grew the need for the solution I was to architect. It should be noted that organizational change was required alongside the technology aspects by my client. An example was aligning job/position titles in order to properly analyze labor efficiency (appointments per scheduler, $ per scheduler, etc.). This data stewardship was the key to success in their data analytics efforts.

In order to create a viable solution we had to decide on a software as a service (SaaS) data architecture. The analytics architecture was also decided but is not part of this writing. The options considered were single tenant behind the firewall, separate database / code base per tenant (SaaS vendor maintained), single database for all customers, and a hybrid of the former two options. The terms tenant and customer mean the same thing in this article.

It is important to note that considering SaaS endeavor is most successful when the system is quite mature. A score around 3 in the diagram below is considered mature. If possible, a SaaS candidate application should incubate for considerable time prior to such an effort. In the example above the client had the system up for about 3 years before they considered SaaS. This limits the amount of rework due to changing requirements and product growth.

Figure 1- Business Intelligence Maturity Index

Multi-tenant data architecture options

Below lists the 4 options that were investigated in choosing our final architecture. Each option is described along with the pros and cos of each.

Option 1 – Single tenant behind the firewall

This scenario involves having a dedicated database for each customer that is hosted in their data center or an externally hosted data center. Each customer has their own specific ETL processes created for them and customizations as needed. Attempts should be made to keep the data structures and code amongst customers as similar as possible.  

Pros:

  • Allows for complete flexibility and customization
  • Customers can choose the RDMS they desire (Based on internal standards)
  • Customers can have direct access to the database for analysis purposes
  • Security comfort that the data resides within the customer’s internal or controlled networks

Cons

  • Has a separate code base for each customer
  • Makes upgrades difficult across customers
  • Hard to maintain each customer due to customizations

Option 2- SaaS vendor maintained separate database

This is a variant of the first option. The SaaS vendor maintains a single database server with separate databases and code for each customer. The SaaS vendor is the group/company that is creating the solution that will be brought to market.

Pros

  • Complete flexibility and customization per client.
  • Easier visibility to all of the databases for the SaaS vendor (over Option 1) and code since the databases are in a single database server

Cons

  • Has a separate code base for each customer
  • Makes upgrades difficult across customers
  • Hard to maintain each customer due to customizations
  • It is more difficult for the users to access the database directly (I can see this as a good thing too)

Option 3 – Single multi-tenant database

In this scenario a single database is created for all customers. A single set of ETL code is utilized to maintain the data. The code is configurable for each customer by the use of configuration tables and semi-dynamic SQL code. The data is segregated by a customer id column in each table.

Pros

  • Single data model for all customers
  • Single code base (configurable for each)

Cons

  • Customizations can be more difficult. Separate columns in the MT tables may be needed for customer purposes. The staging area is the most volatile with differing sources systems per customer.
  • Single point of failure with the Multi-tenant database.  

Option 4 – Hybrid of Option 2 and 3

This option takes the best parts of options 2 and 3. The staging area of the database where customer specific data formats lands is single tenant. Each customer has their own schema and for their extract tables. There is a generic staging area that normalizes the customer specific formats. The generic staging and data warehouse models are Multi-tenant.  

Pros

  • Allows for the customer specific input formats (in their staging schema)
  • Gives a single code base
  • Allows for customizations of the ETL per customer with configuration tables

Cons

  • Data must be segregated for security and performance reasons 
  • Upgrades impact all clients simultaneously

For our solution we chose Option 4. This gave us the best mix of flexibility and code uniformity. At a later date we could implement dynamic SQL or ETL if desired.   We would leverage the configuration tables as part of this effort. For the volume of clients we have at the moment the configurable SQL based approach is sufficient.

In my prior projects to the one discussed here I had employed both option 1 and 2. For option 1 it was virtually impossible to keep the versions behind the firewall with multiple customers in sync. It was a major effort to apply releases due to the heavy customizations and internal red tape. For the option 2 implementation it was a little easier to keep the databases in sync with tools like SQL Compare. It was still a challenge because we had to copy code from the master (main database) over to the client databases. These two approaches worked fairly well for 1-2 tenants but starts to become a maintenance challenge after that with code branching. My current client is envisioning at least 10 clients so thus we opted for option 4.

Summary

We discussed the business need that warranted a SaaS solution and the various options for the data architecture. In addition the chosen solution and justification was presented. In the next article we will focus on data partitioning in a multi-tenant architecture. The need for this and technical implementation will be covered.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating