Thanks for the reply. I was not aware of the 25 instance limit; that is a good thing to know!
That is a good future state for sure, but current state we have a lot of instances. Did some checking and I counted wrong. I was including both test and live in my count. I don't need the test ones, so that drops the number down to 27 instances ranging across multiple versions. If we exclude our SQL 2000 and SQL 2005 instances from that mix, we are at 25.
Consolidation is something on our radar to try to reduce the number of instances down as it would make resource planning on the servers much nicer. Trying to keep the MAX memory per instance at a reasonable level while leaving some room for SSIS and SSRS on these servers has been interesting to say the least.
We do currently have a failover tool, but are investigating what other options we have which is why failover clustering came into the mix.
The nice thing about the instances is they are mostly OLTP, but none are exceptionally heavy use. We rarely see CPU spikes except when our ETL processes run to our OLAP instance. And memory only spikes when that ETL process runs too. We have it tuned pretty good at the moment.
The main reason for having so many instances for our small team (3 DBA's) is the requirements we got at the time the instance was created. For example, IT has several tools that all need a database backend. SOME of the tools claim to need sysadmin permissions, so those get their own instances. I'm not willing to risk having the tool drop a database it does not recognize when updates happen for example. Then we have a single instance for our ERP tool as we want to keep that one as clean of things we need to create as possible to keep the auditors happy. They like to go through that instance and ask us a lot of questions each year about who has access and what they have access to. Plus a few different instances for our internal processing (company internal data such as test results, not DBA specific).
Most, if not all, of the 3rd party tools requested some level of permissions that I didn't trust having on the same system as anything critical. Plus some of the 3rd party tools are only compatible with a specific SQL version which is what led to some of our version sprawl. With the 2000 and 2005 instances - those are the LATEST versions that the tools support, mind you the tools are no longer in support.
My expectation is that in the end, we will be left with about 10-15 SQL instances total across both test and live just to make sure all of our tools can work. Getting it down to 3 would be ideal, but just taking a more realistic approach of even getting it down by half to 30 would be a huge improvement.
Now, with our instances, we have 2 out of the 25 that are bigger resource hogs - our ERP one and our primary test result one, so those 2 we like to keep on different physical boxes as much as possible. So lets call those A and B. If our servers are 1, 2 and 3, we would host A on 1, B on 2, and the secondary for both of these we would like to be 3. So if 2 fails, B starts up on 3. If 2 AND 3 fail, B would start up on 1.
At the moment, I am mostly curious if this is a possibility. If so, I think our next steps will be to either do some trial and error with our test systems or to reach out to a consultant. I just didn't want to use a lot of resources only to find out what I am trying to do is going to fail.