One database per customer or One DB for all customers

  • I am designing a web service for hom e inspectors and using SQL to store text and images from each inspection. The data is later used when the inspection report is viewed online or when inspection company analyzes the business data over time - reporting.

    I think that I should generate a new database copy for each new inspection company that joins the service. The inspection data is private and there could be multiple inspectors per company.

    I'm looking for guidance on best practice. Is there any reason why I should want one huge database for all inspection companies or is it best to separate companies across multiple databases?

    First major project and need direction on how to think about this.

    Mort Strom

  • The main reasons that I could think about keeping them together is so that you only have to maintain one schema and for manageability. However there are tools now that are available for blasting changes to more than one location so schema maintenance is less of an issue.

    It seems that you could keep the data together but you would have to be real sure that your application was only going to display the pertinent company information based on the company that was logged in. The risk should be minor with testing.

    The other thing to consider is how much data would be in common and what amount of duplicate data you would be storing if you split this information up to company specific databases. That could get costly if there was a bunch of common data. My guess is that there wouldn't be based on your description.

    No answers, but some things to consider anyway! :hehe:

    David

    @SQLTentmaker

    “He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • Yogi Berra had a saying... "When you come to a fork in the road, take it!" In other words, given a choice, take all choices.

    A partitioned table would work very nicely here. It combines the benefits of have individual tables for each "customer" along with the benefits of have one huge table for all customers. It makes future archival processes simple, it makes adding new "customers" simple, it makes table maintenance simple, and it makes index maintenance both simple, relatively unobtrusive, and fast.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • You are asking about "single tenant" (one db per customer) vs "multi-tenant" (one db for all customers) for SaaS (software as a service). You may want to Google those terms for a lot of opinions either way. MSDN has a write up on the various options.

    I find single tenant (separate databases) quite tedious to maintain. If you have an explosion of customers, it quickly becomes unmanageable to maintain so many separate databases, and your scripts need to be extended to loop across all tenant databases to run any DDL or bulk operations. Expanding the number of databases has other ramifications such as affecting backup (such as if you are creating a new subdir per db name) and restoration (need to script restore across all dbs in case of failure).

    Also, if you are not careful about your development strategy, schemas can get out of whack across the databases. I do not recommend doing different schemas per tenant (having different functionality for each customer's application, for instance), as it becomes a complete nightmare for ongoing maintenance and support at that point. You don't mention it, but you can also consider multi-tenant vs single-tenant in your application code deployment as well as database.

    If your application provides the same schema set to each tenant, I greatly prefer the multi-tenant approach (which they label "Shared Database, Shared Schema"). As the article above notes, it can be extended to provide some limited customization per tenant if need be. Proper application coding should assure that one customer cannot access or view another customer's data. Application side, you should only have one deployment of code to a single location versus having a deployment per customer.

    As someone noted, if the quantity of data is huge per customer, partitioning each customer's data to a separate file may be advantageous. I would not go this route on a whim, however -- research and testing would be required to see if customer access times lower in this scenario. I think I'd personally require the multi-tenant db to be 20G or more (and core tables be 250M+ rows) before considering partitioning.

    HTH,

    SQLServer 2005 DBA

  • On the other hand, Backups and Restores work very well for individual databases and also allows for a multi-server environment which has both it's own pains and advantages.

    The bottom line is to do like muji_mu suggested above. Google for all the types of options and the related pro's and con's... then, decide what is better for the data and the server(s).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is the same question asked here: http://www.sqlservercentral.com/Forums/Topic634092-361-1.aspx and there are answers in both posts. Please don't post in 2 forums as that just fragments responses.

  • My proof-of-concept service has been in production for 2 years now and the db size is 125MB (1 company, 2 inspectors). As I design the 'real' online service for multiple inspection companies, my big concern is for performance, scalability and manageability. Security and privacy are a concern but based on posts, this appears doable regardless the path.

    Regardless the levels of services i plan to offer, there's only one database schema in mind at this time.

    I expect high frequency concurrent transactions

    + up to 8 inspections a day will be created per company (initial inserts and updates)

    + inspections are conducted online/real-time via wireless device. client is web browser. (inserts)

    + average of 25 digital images are stored per inspection. (inserts)

    + related parties view a completed inspection + images average of 5 times per day. (reads)

    + after 30 days, an inspection may never be viewed again

    + inspection companies will have a variety of canned reports that lend insight into their business over time. (read)

    + inspection companies maintain a list of real estate agent (contact info) which is not shared across companies

    The responses are most helpful and thanks to muji_mu for helping me label the question as "single" vs. "multi" tenant. I'm learning how to think about this and can read the MSDN article and Google.

    If I go with a multi-tenant model, is it a difficult task to move to the single-tenant model if future success requires it?

    Does SQL 2005 or 2008 allow for partitioning across servers? my ignorance complete here.

    Mort Strom

  • Moving from either to the other, once you have data in them, is a bit of a pain. Depends on how much data you have and how you've structured it, and how the front-end application has been built.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • mort.strom (1/12/2009)


    If I go with a multi-tenant model, is it a difficult task to move to the single-tenant model if future success requires it?

    If you implement it correctly, then it is a large task, but not a difficult one.

    I have implemented large shared schema, multi-tenant applications & databases and it is all rather straight-forward as long as you follow certain rules:

    1) All application users must be easily distinguishable by tenant to SQL. Practically speaking this means that either each tenant has a separate Login or each user has a sepearate Login to SQL Server and a Tenant_Users table to map the user back to their tenant association.

    2) Every table that the application can write to must have a tenant_id column that identifies the tenant-owner of the data.

    3) Every such table must have a corresponding "security" view that insure that any user can only access rows in that table from the same tenant., like so:

    CREATE VIEW Secure_TABLE as

    Select *

    From Physical_TABLE T

    Inner Join Tenant_Users U

    ON U.tenant_id = T.tenant_id

    And U.UserName = sUser_sName()

    4) No application code, including client code, application stored procedures, views, etc., is permitted to directly access the physical tables. All such data access is only permitted through the Security views. (use database roles, schemas and permissions to implement this, do not rely on code).

    If you notice, this approach insures that, except for the user Logon's, the application has no knowledge of the multi-tenancy, and no application changes should normally be necessary. This means that you can move a tenants data into or out of this approach without any changes to most applications, other than changing the Server.Database address. (note that to be truly transparent you would have to remove the tenant_id from the output of the security views).

    The tenant_id should be made the first field of the primary key and probably also the Clustered Index (if different).

    Here are some links to other discussions about shared schemas and an MSDN article on it:

    http://www.sqlservercentral.com/Forums/FindPost610335.aspx

    http://www.sqlservercentral.com/Forums/Topic529170-361-1.aspx

    http://www.sqlservercentral.com/Forums/Topic569567-149-1.aspx

    http://msdn.microsoft.com/en-us/library/aa479086.aspx

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply