I am investigating a Service Broker performance problem. I'm still in the early stages, but I could really use some pointers on where to look and on whether the design has obvious flaws.
Service Broker is being used for messaging between a database and an application (the application has been developed in C# and runs as a Windows service on another server). The database puts messages on the queue and the application removes them.
There are two services (initiator and target) on the same SQL Server. The database (also on the same server) sends a message from the initiator service to the target service. There are 10 conversations open, corresponding to 10 threads on the application. These are long-running conversations which are only ended and replaced if needed. There are never more than 10 conversations being used.
The calling application opens a transaction (using .NET) and calls a stored procedure that receives the top 1 message where the conversation ID matches the one being handled by that thread. The application commits the transaction once it's certain that there are no errors.
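To make the receive step concrete, here is a minimal sketch of what a procedure of this shape might look like. It is an illustration, not our actual code: the names `dbo.ReceiveNextMessage`, `dbo.TargetQueue`, and `@ConversationHandle` are placeholders, and it assumes the filter is on `conversation_handle` rather than `conversation_group_id`.

```sql
-- Hypothetical sketch; procedure, queue, and parameter names are placeholders.
CREATE PROCEDURE dbo.ReceiveNextMessage
    @ConversationHandle UNIQUEIDENTIFIER  -- assumed: one handle per application thread
AS
BEGIN
    SET NOCOUNT ON;

    -- Runs inside the transaction opened by the calling .NET thread.
    -- Returns at most one message for this thread's conversation;
    -- an empty result set means nothing was waiting on the queue.
    RECEIVE TOP (1)
        conversation_handle,
        message_type_name,
        message_body
    FROM dbo.TargetQueue
    WHERE conversation_handle = @ConversationHandle;
END;
```

Each of the 10 threads would call this with its own conversation handle, then commit or roll back the surrounding transaction from the application side.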
We expect to process about 2 messages per second per thread (20 messages per second). This rate will be constant for hours, with slow times only coming at night.
In testing on a well-provisioned performance-test server, the message rate is acceptable for less than a minute. After that, performance rapidly degrades. Once it has fully degraded, it takes one hour to receive 1600 messages from the 10 conversations, with no more being added. That is roughly 0.44 messages per second against a target of 20. "Unacceptable" does not even begin to describe this performance.
Interestingly enough, the process generally performs well when running on local (developer machine) installations of SQL Server with the application also running locally. The local tests are limited in time and scope, but they easily outperform the servers. The process performs poorly on every "production-level" server to which it has been deployed. Code profiling indicates that almost all of the time is spent calling the stored procedure with the RECEIVE statement.
I'm not experienced in performance tuning SQL Server, but I suspect I'm about to learn that. However, the dismal performance of this process on well-provisioned servers makes me believe that the design is flawed. I do know it's nearly impossible to "tune" a poor design.
Does anyone see a smoking gun in this description? Any tips on how to proceed?