I work with a similar environment, albeit not at the same scale, and I'd certainly be interested in anything on this topic. I'm sure I'll hit a lot of the issues you have seen at some point.
A couple of questions for abair34:
Do you use any failover technologies other than clustering? Did these change as the number of databases grew?
Do you have to support a 24x7 environment, or are you able to balance the load by time zone?