• I do agree that clustering has its benefits. But for me, it’s still a single point of failure, because the nodes share the same storage. Clustering gives you availability, whereas DM (database mirroring) gives you redundancy. I don’t doubt the reliability of SANs, but sometimes things just go wrong.

     

    We too had a SAN that replicated at disk level (MirrorStore) to another site. Unfortunately, not all of our DBs are on the SAN, and I’m not comfortable with several DBs/apps sharing a disk, especially busy OLTP systems. Say SAN replication is enabled: how would you bring the DB up if your building burnt down? Would the business wait until a SQL instance is up? You’ll still need a SQL instance somewhere to access the replicated DB, and this is what I’m using DM for. Sometimes the business insists that the system must never be down for even a few hours. It costs them too much.

     

    There are even more expensive solutions, such as combining mirroring with clustering, which quadruples the hardware and the cost. But I reckon these are case by case. There is no single best solution. I’m sure you’ve got your reasons for clustering, and it’s working for you.

    I would also like to share a bit of my experience on log shipping.

     

    Log shipping was a headache for me too because it needs regular attention, although a bit less now. It doesn’t work very well with VLDBs, as the backup/restore alert jobs constantly raise error messages. Fortunately we’ve got MOM for monitoring, so we know which alerts are not false alarms, and we work on minimising the rest (e.g. increasing the log backup frequency, setting the log restore value a bit higher, increasing the log shipping monitor alert threshold, etc.). So far we’ve reduced the false alarms a lot and manage to monitor all DBs pretty well (though not without effort). That’s why I’m interested in slowly moving to DB mirroring if it proves superior to log shipping. There’s also a custom solution that combines log shipping with SQL LiteSpeed to compress the transaction logs before sending them over the network. I’m pretty sure there is code out there that can be downloaded and tested. I remember a person named Chris Kempler wrote a similar solution.

     

    The business I’m in requires redundancy, even if at times that means sacrificing a bit of performance.

     

    OK, now back to DM. I’ve not rewritten any of my applications to be DM-aware. All our DR plans require manual intervention, which is my preference too. I don’t like a system switching itself back and forth because of a network glitch rather than a real DR situation. We created a DNS CNAME for our SQL Server, and the applications are configured to connect using this CNAME. If the primary server is down, our network guys can quickly repoint the CNAME to the DR IP. A forced DNS/IP refresh can be done at any time. So, from the application’s point of view, no config changes are needed. This requires manual intervention, which is part of our policy. The CNAME works in much the same way as a clustering virtual server config, where CNAME = SQL virtual name and CNAME IP = SQL virtual IP, and the CNAME IP can be changed at any time.
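
    For the manual switch described above, the database side could look roughly like the following. This is only a sketch, not an actual runbook: SalesDB is a made-up database name, the two ALTER statements are alternatives rather than a sequence, and which one applies depends on whether the principal is still reachable.

```sql
-- SalesDB is a hypothetical database name; substitute your own.
USE master;

-- Option A, planned switch: run on the PRINCIPAL while the session is
-- synchronized (needs high-safety, i.e. synchronous, mode).
ALTER DATABASE SalesDB SET PARTNER FAILOVER;

-- Option B, real DR with the principal gone: run on the MIRROR to force
-- service. Transactions that never reached the mirror are lost.
ALTER DATABASE SalesDB SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;

-- Either way, confirm the role on the DR server before the CNAME is repointed.
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_state_desc
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;
```

    Checking the role first means the DNS change only happens once the DR copy is actually serving as principal.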

     

    An important point on DM: the default failover timeout is 10s. This is way too low! I would always set it to a higher value if you’ve got automatic failover enabled, e.g. 90s:

     

    ALTER DATABASE <DB Name> SET PARTNER TIMEOUT 90;
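
    To check that the new timeout has taken effect, something like this should do. It is a minimal sketch that assumes you are happy to list every mirrored database on the instance; the timeout column is reported in seconds.

```sql
-- List each mirrored database with its current partner timeout (seconds).
SELECT DB_NAME(database_id) AS database_name,
       mirroring_role_desc,
       mirroring_connection_timeout
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL;  -- skip databases that are not mirrored
```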

     

    With or without clustering, the server needs to be specced to support the DB load it’s holding. I don’t think clustering helps much in terms of performance. Funnily enough, I’ve found it’s harder to support clustering than DM. Microsoft specifically does not recommend clustering across separate geographic locations, as you’ll likely need to involve the hardware manufacturer in any issue. Read the part on Node Location:

    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/failclus.mspx

     

    I would be very interested to hear how you architect your servers for Business Objects, and what the pros and cons are.

     

    Simon

    Simon Liew
    Microsoft Certified Master: SQL Server 2008