This past week I was presented with a very unusual issue. A call came in about a production problem on a tier one application. Unfortunately, it was on a system that runs SQL Server 2000 SP4. With a rather small toolset available for support, I had to rely on Perfmon, sysprocesses and the usual old school tools.
What I found was that page life expectancy was higher than usual, CPU was lower than usual, memory was normal, and disk metrics were way lower than normal, but thread count was through the roof with PAGEIOLATCH waits.
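For anyone stuck on the same vintage of tooling, here is a rough sketch of the kind of sysprocesses query that surfaces this pattern. It is not the exact query I ran. The `spid > 50` filter is an assumption based on the usual convention that lower spids are system sessions, and the column aliases are just illustrative names.

```sql
-- Sketch for SQL Server 2000: summarize what sessions are currently waiting on.
-- A pile-up of PAGEIOLATCH_* rows means sessions are stuck waiting on data page I/O.
SELECT lastwaittype,
       COUNT(*)      AS waiting_sessions,  -- illustrative alias
       MAX(waittime) AS max_wait_ms        -- waittime is reported in milliseconds
FROM master..sysprocesses
WHERE spid > 50               -- assumption: skip system sessions
  AND waittype <> 0x0000      -- only sessions that are waiting right now
GROUP BY lastwaittype
ORDER BY COUNT(*) DESC
```

If most of the rows come back as PAGEIOLATCH waits while the disk counters in Perfmon look idle, something between SQL Server and the disks deserves a closer look.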
We had all tech areas on the call and we brought in all our top DBAs. Earlier in my analysis I came across a scan64.exe that was running. Our virus protection is set up according to best practice for database servers: the “On Access Scan” excludes the normal files. I did not recognize scan64.exe and noticed it was only using 6% CPU, so I looked up what it belonged to. I found that it was indeed a virus scan product, so I did what any other admin would do: I tried to kill it. I couldn’t kill the process, so I moved on since it wasn’t using much CPU.
In times past, when on access scan was killing a server, CPU would be off the chart. Since scan64.exe wasn’t using much CPU, I didn’t give it much focus. Another DBA brought it up, and once we got Microsoft on the phone and mentioned scan64.exe, they stated that it belonged to a filter driver product with virus protection. After much more research we found that it was scanning at the application layer, against ‘sqlservr.exe’. All operations were being run through that driver, which amounted to a denial-of-service (DoS) attack on the server.
After shutting down that driver and rebooting the server, everything went back to normal.
Lesson learned for me: anything out of the norm should be a suspect, even when it is barely using any CPU!
References for virus protection best practices – http://timradney.com/virusscanbestpractices
Filter drivers on SQL Servers – http://timradney.com/filterdrivers