After reading this confession, I am sure almost every DBA is going feel a churn in their stomach. They are all going to say “How could he do that? Does he not know that it is not right? Why does he keep doing that still?".
My confession is that we use READ UNCOMMITTED all over the place. It is used regularly on the Report DB and the OLTP server. This is even part of our Review guideline (Yes, we review all DB objects before they hit production). There I said it. Before you say “Are you nuts” let me give you some back ground information on our set up.
We have a web farm that currently has more than seventy-five servers. We also have a cluster for TS where other client applications are running from. Our customer base is over four hundred thousand and our website pulls an average bandwidth of 8 MB per sec peaking up to 15 MB at peak time. At given time, there are more than eight hundred SQL connections on the OLTP database. We do an average of 13K batch requests per second (peaks up to 21K). Now you get an idea of how busy our SQL Server DB is.
In my previous blog I had stated that we did some performance test while upgrading our server. We were able to an average of 8K batch request per second. That was couple of years back. We had half the customer base as we have today. There fore that performance was good enough. But while doing some capacity planning, we were advised that we could expect double the traffic on the web site in two years time. That meant that our server would not be able to service the request from the website in a timely fashion.
I did some research on how we could get more out of the server. I was looking basically for a “Turbo switch”. I poured through blogs and articles trying to find out what could be done. Everyone was talking about tuning the queries that hit the DB. You can get only so much out of tuning queries. That was when I read the blog from Linchi Shea where he stated the performance between a READ_ONLY DB and a READ_WRITE DB. The more the load (User connections) the more the difference is. The difference was around 10%. Keep in mind that even though the tests were run on DB’s that were READ_ONLY and READ_WRITE, the queries were just Selects. But in a real world scenario, there are multiple reads and writes happening concurrently. There fore in our scenario, the performance gain would be much higher. I knew that I found my Turbo switch. But there are drawbacks in using READ UNCOMMITTED. The dataset returned to the application might not be accurate, you could get strange errors like “Could not continue scan with NOLOCK due to data movement”.
I went back to the management and advised them of all the issues that we would get by taking this approach. The management had no problem with showing data that might not be accurate but they wanted to make sure that when ever a transaction is done, it had to be based on the right values and if there was a change in any value, the customer should be advised of the change before inserting the data. That was not difficult to achieve. We started using READ UNCOMMITTED and we have not looked back after wards.
In addition to this, we did quite a bit of work on query tuning, moved all queries that looked at historical data to our replicated server and then doubled our RAM. Currently we have 256 GB of RAM. We review all objects that are to be released into production, maintain the indexes and monitor the DB for any irregularities. We have also introduced Caching of data in the application layer so that the database does not get a heavier load. Our next project is to add an additional Drive shelf with its own IO controller and see if we can gain a bit more IO performance by moving certain tables and/or indexes into that drive by adding another file.
If anyone has better suggestions than what we are doing, please feel free to comment. All comments are welcome.