Is it Better to use Multiple DB than Single DB?

  • Hi

    In need to design an Enterprise level solution which will contain around 8 modules (such as Accounting, Human Resource, Inventory, Purchases etc). It will be a portal with single sign on to different modules.

    Now the question is whether to put all the tables of all the modules in one DB or to have different DB for each module. All the modules will be interlinked (E.g. Users will be pulled from the Human Resources in the Purchase module. And any purchases made will be moved to inventory and transaction will be posted in Accounts too).

    A rough estimate is that each module will have 100+ tables so in total 800+ tables in the whole system. In one module the transaction will be huge (2 million entries per month in a single table - will be archived at the end of the year)

    So what will be best way to design the database. Both advantages and disadvantages

    Unni

  • Single DB.

    Far too many reasons to elaborate on 🙂

  • Wow, that's a lot of tables.

    If they are all part of the same app, I'd go with 1 DB.

    Mark

  • Just to add more confidence, I've worked with more than a thousand tables in a single database and multiple of these databases running on a single server with tables having a few hundred million records in a single table. The performance was just fine.

    There are DBA's that monitor indexes etc, but this kind of architecture is very doable.

    I've also seen in my experience issues experience with multiple databases using linked servers. It may be necessary sometimes but I would suggest keeping linked servers to the area "necessity" as there are limitations with linked servers.

  • NeverDie (7/10/2013)


    Just to add more confidence, I've worked with more than a thousand tables in a single database and multiple of these databases running on a single server with tables having a few hundred million records in a single table. The performance was just fine.

    There are DBA's that monitor indexes etc, but this kind of architecture is very doable.

    I've also seen in my experience issues experience with multiple databases using linked servers. It may be necessary sometimes but I would suggest keeping linked servers to the area "necessity" as there are limitations with linked servers.

    Thanks a lot. This really boosted my confidence.

    Unni

  • I'm inclined to disagree with earlier comments. Look at your data and see what changes or grows rapidly, and what is pretty stable and changes less often and in smaller quantity. You need different backup strategies for the different chunks of data if there is a significant distinction in those issues, unless you have store to burn in which you can hold pointless backup data, and the easy way to handle that is to put data into different databases. Of course this classification of tables may match up with the multiple modules or (more likely) not since several modules may cause updates to the same data. The discussion of linked servers is a red herring, since that's a question of how you split your data between servers not of how you split it between databases, and the question was about splitting into databases.

    Tom

  • Unless there's a compelling reason (need for security at the DB level comes to mind) to separate the data, I would go with a single DB. One of the accounting apps I used to support had the data segregated into 12 databases. It was a pain to do any sort of backup and restore to the development or acceptance servers, and writing queries was far slower because of the need to specify the database.

    As for table counts, 800 tables for this sort of application is trivially small.

    Having said that, I would also ask why you want to write your own ERP system when there many more than you can count available from a variety of competent vendors.

  • unnikozhissery (7/10/2013)


    In need to design an Enterprise level solution which will contain around 8 modules (such as Accounting, Human Resource, Inventory, Purchases etc). It will be a portal with single sign on to different modules.

    Now the question is whether to put all the tables of all the modules in one DB or to have different DB for each module. All the modules will be interlinked (E.g. Users will be pulled from the Human Resources in the Purchase module. And any purchases made will be moved to inventory and transaction will be posted in Accounts too).

    A rough estimate is that each module will have 100+ tables so in total 800+ tables in the whole system. In one module the transaction will be huge (2 million entries per month in a single table - will be archived at the end of the year)

    So what will be best way to design the database. Both advantages and disadvantages

    Putting everything together in a single database sound as a good idea but, I have seen more than enough good ideas turning bad.

    Think on the big picture...

    How about scalability?

    What is going to happen in the future if this system grows and grows? how are you planning to scale it? May be a solution would be to scale it out, on different servers in which case having different databases would make your life much easier.

    How about recoverability?

    What is going to happen if your HR Vice President comes one day and says, I'm so sorry, please recover the database as it was last Friday at 5:32PM? Having separate databases would make this easier? provided you have a sound backup/recovery strategy in place.

    How about business continuity?

    What is going to happen when you are planning your Business Continuity/Disaster Recovery plan? some applications like "billing" would be critical, on the "the company cannot survive without them" category while other applications like "HR" would be less than critical, perhaps on the "company can survive for two weeks without it" category. Having separate databases will help too, don't you think so?

    In short, even if it sound counter intuitive probably having separate databases would give you a more powerful, more versatile platform to support your organization.

    Just my two cents.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • L' Eomot Inversé (7/10/2013)


    I'm inclined to disagree with earlier comments. Look at your data and see what changes or grows rapidly, and what is pretty stable and changes less often and in smaller quantity. You need different backup strategies for the different chunks of data if there is a significant distinction in those issues, unless you have store to burn in which you can hold pointless backup data, and the easy way to handle that is to put data into different databases. Of course this classification of tables may match up with the multiple modules or (more likely) not since several modules may cause updates to the same data. The discussion of linked servers is a red herring, since that's a question of how you split your data between servers not of how you split it between databases, and the question was about splitting into databases.

    +1000 to that. People just don't think about backups and the possible restores that go with them never mind thinking about static data that might only need to be backed up once per month and then only a part of that. It's not difficult to communicate between databases using SYNONYMs and there's virtually no performance hit.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • PaulB-TheOneAndOnly (7/16/2013)


    unnikozhissery (7/10/2013)


    In need to design an Enterprise level solution which will contain around 8 modules (such as Accounting, Human Resource, Inventory, Purchases etc). It will be a portal with single sign on to different modules.

    Now the question is whether to put all the tables of all the modules in one DB or to have different DB for each module. All the modules will be interlinked (E.g. Users will be pulled from the Human Resources in the Purchase module. And any purchases made will be moved to inventory and transaction will be posted in Accounts too).

    A rough estimate is that each module will have 100+ tables so in total 800+ tables in the whole system. In one module the transaction will be huge (2 million entries per month in a single table - will be archived at the end of the year)

    So what will be best way to design the database. Both advantages and disadvantages

    Putting everything together in a single database sound as a good idea but, I have seen more than enough good ideas turning bad.

    Think on the big picture...

    How about scalability?

    What is going to happen in the future if this system grows and grows? how are you planning to scale it? May be a solution would be to scale it out, on different servers in which case having different databases would make your life much easier.

    How about recoverability?

    What is going to happen if your HR Vice President comes one day and says, I'm so sorry, please recover the database as it was last Friday at 5:32PM? Having separate databases would make this easier? provided you have a sound backup/recovery strategy in place.

    How about business continuity?

    What is going to happen when you are planning your Business Continuity/Disaster Recovery plan? some applications like "billing" would be critical, on the "the company cannot survive without them" category while other applications like "HR" would be less than critical, perhaps on the "company can survive for two weeks without it" category. Having separate databases will help too, don't you think so?

    In short, even if it sound counter intuitive probably having separate databases would give you a more powerful, more versatile platform to support your organization.

    Just my two cents.

    Your 2 cents are worth a million bucks. "Divide'n'Conquer" works well even at the database level.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'm somewhat inclined to agree with the single database argument here (the issues of backup & recoverability not withstanding).

    But, the bigger question for me is the development and release cycle. These 8 different modules, are they being developed by a single team of developers on a single schedule? Just exactly how completely interactive are they? Sharing lots and lots of inter-dependent data, or just a few general lookups.

    I had a system that, on the surface, sounds a lot like yours. So I made the decision to put it into a single DB. For terms of backup & restore and other general maintenance it's worked fine. It's also worked well for performance. But, with five different development teams making changes and releases on different schedules, we had HUGE bottlenecks on releases to production, testing, coding interdependencies that shouldn't have been there... Also, come to find out, the amount of information that was literally shared between these systems was minimal. The requirements gathering had, rather than look at the actual requirements of these systems, looked at a previous failed project that had put the data into overly protected individual silo's of information.

    So, before you go down the single database route (which can work just fine), I would dig down a little further on the processes and business needs of the system, not the technology.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thanks everybody.

    This solution will look something like ERP.

    First a product was created which was specific to one industry (has its own hardware and software).

    Now after couple of years, some major requirement has come to add different modules like

    a. the users and managers of the system with payroll/time allocation etc (so HRMS).

    b. the assets/devices/spares etc bought for the product (so Inventory)

    c. The age of asset/services etc (so Asset management)

    Like wise many more modules have been identified and each of these submodules should be pluggable (customers can decide which to buy with the main product)

    One big team will be working on the project.

    And as of now we have decided to use different DB for each module

    Thanks a lot again.

    Unni

  • unnikozhissery (7/17/2013)


    And as of now we have decided to use different DB for each module

    There is a possibly big "gotcha" when using multiple databases for such a project that us folks recommending multiple databases forgot to mention. Declared Referential Integrity (DRI). You might know it better as "Foreign Keys". Foreign keys don't work across databases. If you need DRI across databases, you need some form of well written trigger. You can get around this somewhat if the DRI is to a small lookup table simply by having a copy of such "standard" lookup tables in each module which would allow each module (DB) to operate in a stand-alone fashion but, if you have module specific tables and there needs to be DRI between such table across modules then, like I said, you'll need triggers for the cross database DRI. If that's going to become a huge headache, it may very well be that a single database with a different schema for each module might be the better solution.

    Apologies for not thinking about that particular "little" nuance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • A little more detail about the "gotcha" that Jeff pointed out. You need pairs of triggers for each cross-database foreign key relationship: one trigger in the referring database, and one in the referred to database. The trigger in the referring database is pretty trivial to write. The trigger in the referred to database can introduce performance issues, and if you want cascade or similar actions on delete/update it can be quite a complex trigger.

    In my experience, the gotcha has not actually been a problem, but that's because it's been possible to split the data into databases in such a way that cross-database foreign keys were not needed in the cases where I've chosen to split; that is definitely NOT the general case, so you need to make sure you are not buying yourself code complexity by splitting into separate database.

    Of course the trade off here can be that splitting into separate databases with separate backup and DR plans can give you quite a big performance gain (for example if you can use simple error recovery for some of your data that saves the cost of doing log backups for updates to that part of the data, and if some of the data doesn't need to be recovered right up to date but can be a few days or weeks or months behind (perhaps because the loss is self correcting, as it often can be for archive data in a system that requires a few days/weeks/months of recent data in live as opposed to archived storage) the backup costs can be reduced still further. But you have no guarantee of such performance gains until you have carried out a detailed analysis of the effects of splitting the data on your system. And triggers are so easy to get wrong (for example most of the DBAs I've known have written single row triggers; so the system won't work if it uses any set-oriented operations; of course the DBAs can be educated to write triggers that cope with set-oriented operations, but you need to make sure that you the DBAs have that education since forbidding set-oriented operations is almost always a performance disaster.

    Tom

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply