Okay, this is sort of complicated, so it's going to be a bit long.
This is a fairly simple version of a larger problem for which setting up domain trusts is not a viable solution. I've seen topics elsewhere get derailed by the argument "the only right way to do it is setting up a trust relationship between domains, so drop everything else and do that". Well, regardless of the "theological" argument that there is only one true way and all others should be shunned, that's just not going to happen in a variety of security situations, including the one I'm dealing with.
So, the setup...
We've got 3 different DMZ zones that are split across two domains. Two of the zones are 'exposed' to the internet and accept connections mostly over HTTP and HTTPS. Let's represent these two zones as A and B.
The second type of DMZ zone is the internal zone where the database servers reside. This zone is accessible only via a Winsock connection on TCP port 1433 between one or more web servers in one of the exposed zones and a database server in the internal zone. Let's call this zone C, since in our implementation there's currently only one.
As for Active Directory (AD) domains, zones A and C are in one AD forest and zone B is in a completely different one. There are other servers in zone B connecting to god knows what; I don't have a need to know, so I can't say. We already have servers in A connecting to C with no problems, and life there is bliss.
The current solution...
We have a recent need to connect a server in Zone B to Zone C. It so happens that the B server is also running SharePoint Foundation 2010 (SPF) as the application layer interface but that's not where the problem lies. Or at least not entirely.
We used Credential Manager (CM), a feature of Windows Server 2008 that is also found in Windows 7 and can be retrofitted to earlier OSes. CM lets you create a trusted store for credentials (certificates and/or username/password pairs) and stores and transmits those credentials in an encrypted, "secure" form.
With CM you can assign a particular credential to a user and say "Whenever you connect to this IP (in zone C), optionally with a port to confine it to a single well-known service (WKS), use this login ID (from zone C's domain) instead of your current login ID (from zone B's domain)". This works great, believe it or not, and even works for service accounts if you do a little dancing around it: temporarily enable them for interactive login, set up the zone C credential, and then remove the interactive login privilege.
We were even able to install SPF and have it create the databases and such and start the website. It works perfectly for interactive logins, like creating a session on the web server (in zone B) and using SSMS to connect to the database server (in zone C).
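For anyone who wants to reproduce the credential mapping, here is a rough sketch of the equivalent cmdkey command lines, built in Python so they can be reviewed before anything is run. It assumes the credential is stored with cmdkey while logged in as the account that will use it, which is only one way to populate Credential Manager. The IP address, domain name, and account name are made-up placeholders.

```python
# Sketch only: builds cmdkey argument lists; it does not execute them.
# The host, port, and account names used below are hypothetical placeholders.

def credential_target(host, port=None):
    """Return the target name, optionally confined to a single port."""
    return f"{host}:{port}" if port is not None else host


def cmdkey_add_args(host, user, port=None):
    """Arguments for storing a credential for the given target.

    A bare /pass (no value) asks cmdkey to prompt for the password,
    so the password never appears on the command line.
    """
    return ["cmdkey", f"/add:{credential_target(host, port)}", f"/user:{user}", "/pass"]


def cmdkey_delete_args(host, port=None):
    """Arguments for removing the stored credential again."""
    return ["cmdkey", f"/delete:{credential_target(host, port)}"]


if __name__ == "__main__":
    # Hypothetical database server in zone C, confined to the SQL Server port.
    print(" ".join(cmdkey_add_args("10.0.3.15", r"DOMAINC\svc_spf", port=1433)))
```

Running `cmdkey /list` under the same account afterwards should show whether the entry was actually stored for that user.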
Now the problem...
For non-interactive logins, such as running the SPF site as the SharePoint Administrator, it runs fine for somewhere between 1 and 3 hours and then stops working.
What we found in the logs...
When it's working we see event ID 4648, "A logon was attempted using explicit credentials", in the web server's Security log. This shows both the local login that is making the request and the account name (but not the password) that has been used on the remote machine. And it shows up as a successful login attempt.
Meanwhile, back at the database server, we're seeing a corresponding logon with event ID 4624, "An account was successfully logged on". This looks exactly the same as an interactive login on the database server, except it has a logon type of 3, signifying a network logon. These sessions connect and stay open for around 15 minutes on average, and then they log off. I think the connection management is keeping an open pipe to the database backend, but if no calls are made the connection closes.
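As a quick reference when reading the 4624 entries, the logon type field can be decoded with a small lookup. The values below are the standard Windows logon types; the helper name is just something I made up for illustration.

```python
# Standard Windows logon type codes as they appear in 4624/4625 events.
LOGON_TYPES = {
    2: "Interactive",
    3: "Network",
    4: "Batch",
    5: "Service",
    7: "Unlock",
    8: "NetworkCleartext",
    9: "NewCredentials",
    10: "RemoteInteractive",
    11: "CachedInteractive",
}


def describe_logon_type(code):
    """Return a readable name for a logon type code (hypothetical helper)."""
    return LOGON_TYPES.get(code, f"Unknown ({code})")


if __name__ == "__main__":
    print(describe_logon_type(3))
```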
Usually within 20 to 40 seconds of a connection closing on the web server side, we see a new explicit request of the 4648 type and a corresponding 4624 on the database server. This goes on fat, dumb and happy for 1-3 hours.
Then suddenly, on the database server side, we see a pair of Audit Failure messages: one with event ID 4776, "Credential Validation", and one with event ID 4625, "An account failed to log on". In the details for the 4625 we can see that it's attempting to use the account ID from the other domain (zone B), and other errors can be found indicating that a login was attempted from a non-trusted domain (zone B). On the web server side there is no indication at all that a logon attempt has taken place; the 4648 events simply disappear.
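To pin down when the behavior flips, the two Security logs can be lined up on one timeline. The sketch below assumes the events have already been exported into (timestamp, host, event ID) tuples; the export step, the "web"/"db" labels, and the sample timestamps are all invented for illustration and are not real log data.

```python
from datetime import datetime


def failure_window(events):
    """Locate the first database-side 4625 and the last web-side 4648 before it.

    events: iterable of (timestamp, host, event_id), with host "web" or "db".
    Returns None when no 4625 was recorded on the database server.
    """
    events = list(events)
    explicit = sorted(t for t, host, eid in events if host == "web" and eid == 4648)
    failures = sorted(t for t, host, eid in events if host == "db" and eid == 4625)
    if not failures:
        return None
    first_failure = failures[0]
    before = [t for t in explicit if t <= first_failure]
    last_explicit = before[-1] if before else None
    return {
        "first_failure": first_failure,
        "last_explicit_logon": last_explicit,
        "gap": first_failure - last_explicit if last_explicit else None,
        # Stays at 0 if the web server stops sending the stored credential.
        "explicit_after_failure": sum(1 for t in explicit if t > first_failure),
    }


# Hypothetical sample data, hand-written to mimic the pattern described above.
SAMPLE = [
    (datetime(2012, 1, 1, 9, 0, 0), "web", 4648),
    (datetime(2012, 1, 1, 9, 0, 1), "db", 4624),
    (datetime(2012, 1, 1, 9, 15, 30), "web", 4648),
    (datetime(2012, 1, 1, 9, 15, 31), "db", 4624),
    (datetime(2012, 1, 1, 11, 2, 0), "db", 4776),
    (datetime(2012, 1, 1, 11, 2, 0), "db", 4625),
]

if __name__ == "__main__":
    print(failure_window(SAMPLE))
```

A zero count of 4648 events after the first failure would match what we see: the web server stops presenting the stored credential rather than presenting it and being refused.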
The final piece we have...
At the time we notice the problem, if we restart the IIS web server the problem disappears for a while. That's the problem in a nutshell. In other forums we're getting a wide variety of unhelpful answers, e.g. "then don't do that". And, not surprisingly, the usual degree of finger-pointing indicating it's always someone else's problem.
Just wanted to throw this out to the combined wisdom of the SSC community and see if anyone has solved a similar problem and, if so, how. We're not committed to Credential Manager as an answer but we do need a secure channel for sending credentials. SQL Server authentication is NOT an option. And we don't have the option of setting up a domain trust.
As I said, I've seen other people with a similar problem. The general form of the question is that the web server is hosted in one place and allows logins controlled by that AD domain, while the database is hosted and controlled by a second group with a different set of people who have access, and that set of logins is controlled by a different AD domain. The 'limit of trust' is being able to store a valid encrypted credential for domain C on a server in domain B. Any solution that allows that and keeps working, instead of failing without leaving a footprint, would be viable. To reiterate (because I've seen 'that's stupid, do it by one of your disallowed solutions' in so many other places in reply to other people's posts), SQL authentication and cross-domain trusts are both off the table.
Come Watson the game is afoot...
Ah, I said above that the operating system is Windows Server 2008. To be precise, it is 2008 R2. And the SQL version we're using is SQL Server 2008, also R2. We have tried both Standard Edition and Developer Edition of SQL Server, so it's not a case of Enterprise or Datacenter edition doing something that Standard won't. (In fact, given the facts, I suspect the problem is on the web server side of the equation and not in SQL Server at all!)