large number of databases

  • Is any of you guys working / worked on a SQL Server 2005 with many databases (> 500)?

    If yes, what issues have you encountered, what hardware have you used?

  • Yes

    Ok, the hardware was big with lots of CPUs and a SAN on the back end.

    Note that many ( in groups of 72 ) of the databases were part of the same applicaton. The customer had the data partioned by month and kept several years by month in seperate databases. The data was joined together using a view with a union all join. Using a link server some other databases were on other machines, and some of those were union all joined in to some of the views as well.

    We also tried multiple SQL Server instances on the same machine, but found the performance better with one single instance of SQL Server.

    As far as performance, the back end SANs were the biggest factor in database performance. BIG fiber optic connected SANS configured with RAID 10 and terrabytes of space makes it all come together.

    Fun stuff.....a pain in the behind to manage, so have your jobs and especially backups really automated.

    Have fun

    Eric Peterson

    http://www.petersonamerican.com

  • Thanks, I am working to implement a software package that creates a database per client, with a target of more then 1500 clients.

    From what I found so far the biggest issue seems to be procedure cache memory cause by to many query plans saved and the worker threads number. In our case each database has about 20,000 SPs.

    Microsoft recommends 500 DBs per instance, but even so base on the number I am wondering what kind of hardware can take this load.

  • I'd look to split these out. You can build a bigger box, but if you have issues, especially runaway queries, then it affects a lot of clients. Better to limit the # of databases/clients and spread your load out a bit.

  • icata (1/11/2008)


    Thanks, I am working to implement a software package that creates a database per client, with a target of more then 1500 clients.

    From what I found so far the biggest issue seems to be procedure cache memory cause by to many query plans saved and the worker threads number. In our case each database has about 20,000 SPs.

    Microsoft recommends 500 DBs per instance, but even so base on the number I am wondering what kind of hardware can take this load.

    20,000 SPs per database? You better have a really good naming convention... and, you better not make any of the names include any information about the client... they should all be indentical!

    Really? 20,000 SPs per database? What is it? People-Soft or SAP?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Microsoft Dynamics, all databases have the same set of SPs.

    There is a benchmark done by MS for 4000 DBs on the same box. It seems to point to a max of 500 DBs per instance, so each instance is getting 2/3GB of memory for the procedure cache.

  • I have a client with 6600 databases. I now have them running efficiently on a single 2 CPU dual core 64bit box with terrible I/O underlying it. yes, they are small clients and they don't use sprocs (REALLY bad ADO code actually) but still I am amazed at how good SQL Server really can be when well managed. Actually, the PSS and CAT people at Microsoft were pretty stunned too when I told them my tale at PASS last year. πŸ™‚

    With that number of sprocs you have you could consider staying on SP1. SP2 causes a DRAMATIC reduction in procedure cache - which is great for most circumstances but could be disasterous for you with that number of sprocs. Test SP2 HARD before jumping to it if you aren't already on it. I would say that number of CPUs (real ones, not Hyper threads) will be key for performance to keep up with the compiles you are likely to be encountering - SP1 OR SP2. With 16+ CPUs I would also consider fibre mode. Consider reducing the min memory per query too and get REALLY familiar with memory management and analysis so you can watch and tune various things as necessary. As you mentioned increasing the worker threads may be necessary - note that there are DMVs for analyzing this too.

    I found that with large numbers of databases virtually every tool I tried to throw at the server died a horrible death. Quest, Idera, ApexSQL, Embarcadero - all would blow up due to the overwhelming amount of metadata. The only thing I can use is SSMS! Log shipping/database mirroring is out too, so I had to build my own log shipping mechanism to do a refresh over to a warm-standby/reporting box.

    The client was in the hurt locker when they contacted me and it took several months for me to completely eliminate all of the performance bottlenecks I could to get rid of their constant timeouts and server lockups. As a consultant I am thankful for the opportunity to have been able to work on such a system. 😎 We are now pursuing a scale-out mechanism whereby we will have client-database silos with 3-4K databases each that connect up to a primary 'global database' server housing the dozen or so databases that each client database needs to interact with. Fun stuff!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks man, this is really helpfull.

    From what I understand, you haven't used several instances to hold these number of database, just the default one, right?

  • You should absolutely NOT us multiple instances on a single box to manage large numbers of databases. The various 'overheads' of multiple instances should be avoided (actually it should be avoided in almost every situation IMHO).

    Glad I could be of assistance.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • what does single box here mean ? thx.

  • "Single box" = 1 physical server (or virtual server if you are using those - although that is definitely NOT a best practice for high-performance).

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks, your suggestion to not use under any circumstance SQL instances make me start reconsidering the entire SQL architecture that I had in mind.

    I was kind of assuming that using instances is kind of more then desirable, the reason been the limitation one instance has in the size of the procedure cache - more instances result in total more RAM used for procedure cache.

    I was using as a blue print "Microsoft SQL Server 2000 Scalability Projectβ€”Server Consolidation"

    http://technet.microsoft.com/en-us/library/aa902665(SQL.80).aspx

    What do you think might be the reason that using multiple instances might cause issues in the many databases scenario?

  • 20,000 SP ?????.... that is BAD!!! I believe whoever created such thing did a lousy job.

    By the way after database 99 many stats are not gathered by SQL Server.


    * Noel

  • noeld (1/14/2008)


    20,000 SP ?????.... that is BAD!!! I believe whoever created such thing did a lousy job.

    By the way after database 99 many stats are not gathered by SQL Server.

    Microsoft did the 20K+ sprocs. As does several ERP vendors IIRC. Not necessarily bad. πŸ™‚

    Do you have a reference about the >99 database stats statement you made?

    OP: I could see having multiple instances as providing you with more proc cache per instance but you won't get the benefit from this that you may think you might. Even if you go with 10 instances you will wind up with 150 dbs / instance, right? That would potentially be 150*20K sproc plans in each cache - well above the 500*200 in the document you refer to. Given that there is easily 500MB of RAM-chewing 'fluff' per instance (not to mention CPU context switches) you do give up a non-negligible amount of resources running lots of instances. Oh, also note that the referenced document you list is for SQL 2000, NOT 2005. There are HUGE differences in procedure cache between the two systems!! Besides, they were probably running a benchmarking app that made calls to every sproc regularly, which will most certainly NOT be happening in your 20K-sproc databases. In any case I think lots of real CPUs will be your friends here. πŸ™‚

    The primary situation where I think more than one instance is beneficial is where you have one or a few dbs that REALLY need to have fast response time. With multiple instances and appropriate memory settings, disk file placement and perhaps even CPU affinity you can guarantee important stuff gets sufficient resources always.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Microsoft did the 20K+ sprocs. As does several ERP vendors IIRC. Not necessarily bad.:)

    Well I would re-write that quote this way:

    "If it was written by M$ it is NOT necessarily good" πŸ˜€

    I have seen *a lot* of M$ written (aquired to be fair) apps that are pathetic πŸ™‚

    About the first 99 databases here is a link: http://technet.microsoft.com/en-us/library/aa905151(SQL.80).aspx

    here is the relevant part :

    "Note Performance condition alerts are only available for the first 99 databases. Any databases created after the first 99 databases will not be included in the sysperfinfo system table, and using the sp_add_alert procedure will return an error."

    Apparently this is fixed in 2005 ( I must update my link πŸ˜‰ ) But I see no confirmation neither denial that this limitation has been removed.

    Just my $0.02


    * Noel

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply