• Actually, the I/O demands from the OLTP server are not extremely high. The CPU context switches metric is relatively low; there's not a lot of thread thrashing. The SQL Server waits profile is one of the best I've seen in my entire career, almost textbook perfect.

    I work closely with the Systems Admin group and we know that there isn't an I/O path bottleneck anywhere, or at least nothing we can see. The NICs are way underutilized and the SimpliVity system in general handles the workload very well. Since we moved reporting and data analysis off the OLTP server to a reporting server, our CPU cores run at about 20%-60% utilization; they're not even close to being maxed out.

    In answer to your question, I can't tune the application software, but I can recommend changes and they are happening. The current admin and development staff inherited some less than optimally implemented technologies and we're in the process of fixing, refining, and re-architecting.

    I have exceptionally detailed knowledge of the load and stress points on, and generated by, the application software and the queries it executes, as well as the databases' architectures and their design problems. I've considered all your suggestions and pursued many more points of investigation, but thank you for offering them; I'm open to all suggestions.

    Thanks for bringing up the I/O path bottleneck. It is one of the last remaining possibilities, as the people who set this up did some odd things involving 2 NICs that seem inexplicable, according to our admin group. They and I will eventually solve the mysteries about that, and it may be part of the problem; I don't know yet. I was told that the last attempt at reimplementing it failed.

    My situation has every problem you listed and many more, on multiple servers and databases. Right now, I've got the fires out and I'm dealing with the smoke and embers. My first audit of the main server, the OLTP server, produced a 26-page report detailing the use, misuse, and problems of the OLTP environment and the databases on that server. At least 80% of the problems I identified in that audit are still unresolved.

    Resolving them is a process, not an event, so I'm still in the middle of the process.

    I started with the fundamentals and worked up: the hardware on the physical machine, the VMware configuration settings, the Windows Server configuration settings, the SQL Server configuration settings, the database-level settings, ... I'm still working my way up to the more detailed problems like hundreds of missing indexes, excessive page splits, inadequate index maintenance, and more. But I'm getting there.

    Like a fine wine, no problem solved before its time. :^)

    But that's why the company hired me: to straighten out their servers and their applications, and to educate the software development staff, who are quite talented but not so well educated about writing efficient SQL. We're getting there.

    Starting with the fundamentals, the platforms, is why I wrote this post. I've taken care of all the fundamentals except for I/O stalls. I'm stuck on I/O stalls on the hardware platform and can't seem to get beyond them, even though they are creating no user-visible problems, so I'm considering non-conventional alternatives, since the manufacturer cannot offer me any additional assistance.
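
    For anyone following along, here is a rough sketch of one common way to put a number on I/O stalls: take two snapshots of the cumulative per-file counters and divide the stall time accrued between them by the I/Os issued between them. The field names mirror columns of SQL Server's sys.dm_io_virtual_file_stats; the `FileIoSample` class, the `avg_stall_ms` helper, and the sample numbers are hypothetical and only for illustration, not figures from my servers.

```python
from dataclasses import dataclass


@dataclass
class FileIoSample:
    """Cumulative counters for one database file.

    Field names follow sys.dm_io_virtual_file_stats; the class itself
    is a hypothetical container used only for this illustration.
    """
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def avg_stall_ms(earlier: FileIoSample, later: FileIoSample) -> tuple[float, float]:
    """Return (avg read stall, avg write stall) in ms per I/O between two snapshots."""
    reads = later.num_of_reads - earlier.num_of_reads
    writes = later.num_of_writes - earlier.num_of_writes
    read_ms = later.io_stall_read_ms - earlier.io_stall_read_ms
    write_ms = later.io_stall_write_ms - earlier.io_stall_write_ms
    # Guard against intervals with no I/O of a given type.
    avg_read = read_ms / reads if reads else 0.0
    avg_write = write_ms / writes if writes else 0.0
    return avg_read, avg_write


# Made-up snapshots taken some interval apart.
earlier = FileIoSample(num_of_reads=1000, io_stall_read_ms=5000,
                       num_of_writes=2000, io_stall_write_ms=4000)
later = FileIoSample(num_of_reads=1500, io_stall_read_ms=20000,
                     num_of_writes=2400, io_stall_write_ms=5000)

read_avg, write_avg = avg_stall_ms(earlier, later)
print(f"avg read stall: {read_avg:.1f} ms, avg write stall: {write_avg:.1f} ms")
```

    Working from the difference between two snapshots matters because the counters are cumulative since the instance started, so a single reading blends quiet hours with busy ones.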

    I genuinely appreciate all your posts in response to my original one.