
recommendations for MAX_QUEUE_READERS option on queue
Posted Monday, May 14, 2012 9:00 AM


Does anyone have any best-practice details or suggestions on how to choose a value for the MAX_QUEUE_READERS parameter of a receive queue?

I would like to be able to run as many parallel processes as possible, but I'd like to see if anyone has any past experience or documentation relating to optimal configurations.

e.g. 1 per CPU core? 10 per CPU core?

The app is importing CSV files into tables, and I'm importing multiple files simultaneously to try and achieve maximum I/O throughput.

Any thoughts?
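
For reference, the option under discussion lives in the queue's activation clause. A minimal sketch, with the queue and procedure names invented purely for illustration and the values shown as placeholders:

CREATE QUEUE dbo.CsvImportQueue
    WITH STATUS = ON,
         ACTIVATION (
             STATUS = ON,
             PROCEDURE_NAME = dbo.usp_ProcessCsvMessage,  -- hypothetical activation proc
             MAX_QUEUE_READERS = 4,                        -- the value being tuned here
             EXECUTE AS OWNER );
GO

-- The reader count can be changed later without recreating the queue:
ALTER QUEUE dbo.CsvImportQueue
    WITH ACTIVATION ( MAX_QUEUE_READERS = 8 );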


MVDBA
Post #1299623
Posted Monday, May 14, 2012 9:30 AM


Going with the "it depends" answer. How many files are being processed, how often are they processed, how big is each file, how long does it take to process each file (an average may be good here), what else is occurring on the system, and what is the system configuration (processors, memory, disk (SAN or DASD))?

As you can see, there are a lot of variables. You may just need to play with the values until you find something that satisfies your SLA for processing files as well as the other requirements of the system.



Lynn Pettis

Post #1299656
Posted Monday, May 14, 2012 10:02 AM


Lynn Pettis (5/14/2012)
Going with the "it depends" answer. How many files are being processed, how often are they processed, how big is each file, how long does it take to process each file (an average may be good here), what else is occurring on the system, and what is the system configuration (processors, memory, disk (SAN or DASD))?

As you can see, there are a lot of variables. You may just need to play with the values until you find something that satisfies your SLA for processing files as well as the other requirements of the system.


OK - all good questions.

Number of files - thousands, each contains potentially 10k+ rows.
Average time to process - 2 seconds; max time to process - 40 seconds.
The data from the files is pumped into tables that are used for reporting and aggregation, but it's really simple stuff, no joins.
The hardware is a 64-bit Amazon instance with 64 GB RAM - unfortunately we have no control over the SAN disk config, as it's hidden from you on Amazon - but it's very slow. The CPU is most likely 4 cores.
I have no SLA and there are no other requirements (most of the work is being done by other systems; this is just for aggregation).

Rather than anyone just telling me a figure, really I'm looking for some info/past experience/best practice/guidelines so I can work it out myself without having to keep incrementing by 1 and looping through the test - once I get in the ballpark I can fine-tune it later (plus I would also like to be able to explain why I chose that figure).

Thanks.





MVDBA
Post #1299690
Posted Monday, May 14, 2012 10:39 AM


michael vessey (5/14/2012)

OK - all good questions.

Number of files - thousands, each contains potentially 10k+ rows.
Average time to process - 2 seconds; max time to process - 40 seconds.
The data from the files is pumped into tables that are used for reporting and aggregation, but it's really simple stuff, no joins.
The hardware is a 64-bit Amazon instance with 64 GB RAM - unfortunately we have no control over the SAN disk config, as it's hidden from you on Amazon - but it's very slow. The CPU is most likely 4 cores.
I have no SLA and there are no other requirements (most of the work is being done by other systems; this is just for aggregation).

Rather than anyone just telling me a figure, really I'm looking for some info/past experience/best practice/guidelines so I can work it out myself without having to keep incrementing by 1 and looping through the test - once I get in the ballpark I can fine-tune it later (plus I would also like to be able to explain why I chose that figure).

Thanks.





I missed a question: how many files (an average would be good) are received over a set time period (again, you will have to define this, as it could be every minute, every 10 minutes, etc.)? This will have an impact on how many readers you will need. If you receive 10 files a minute and each file is processed in less than a minute, you only need 10 readers. Also, are there spikes and lulls in receiving the files, and are they significant in duration?



Lynn Pettis

Post #1299716
Posted Tuesday, May 15, 2012 1:33 AM


Thanks for the reply, Lynn.

I have to assume that the files will arrive constantly on a 24x7 basis, and that the number of files will be more than a single machine can handle (worst-case scenario, there is a potential that we would receive in excess of 150 million files in the space of 24 hours).

It is acceptable for us to have files pending and a large backlog - we are under no pressure to perform to any SLA.

The thing with this is that the data is not required immediately - from receiving the first file we may have to wait 24 hours until the last file is actually sent to us (I can't discuss the actual product, but imagine it was a large retail chain where transactional data and status data are sent to a head office - plus the head office has the ability to send out questions to the till systems for a status update - if it's a Sunday and that store is closed, then we wait until it comes online).

The solution we have will scale sideways rather than demanding more powerful hardware (as we are limited by Amazon I/O), so the real task is optimising the solution so that we can run on as few machines as possible when we go live.

Really I'm not looking for a solution, more of a best-practice guide (i.e. a Microsoft web page or a white paper) so that we can say "this is the reason we chose a figure of 50".

My main lines of thinking are how many SPID threads each CPU can handle, and how I manage the locking of the inserts when more than one file goes into a single table.


MVDBA
Post #1300066
Posted Tuesday, May 15, 2012 2:04 AM


I did find this interesting snippet:

http://datageekgal.blogspot.co.uk/2008/07/sql-server-service-broker-archival.html


ARCHIVAL TIP #5: TUNE THE NUMBER OF QUEUE READERS

I've yet to find a solid recommendation for choosing the number of queue readers (threads) to activate for a given queue. One post suggested activating a queue reader per CPU. In my particular situation I've seen improved throughput with a number almost triple the CPU count.

I'm NOT suggesting that you use my numbers. Test, test, and test some more to find the thread level that allows you to keep up, or catch up, under load. For me, it was useful to play with both the MAX_QUEUE_READERS setting in the queue definition and the number of messages dequeued by the activated stored procedure at any one time (via the RECEIVE TOP(x) syntax).


Any thoughts?
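
To make the two knobs in that snippet concrete, here is a minimal activation-procedure skeleton using the RECEIVE TOP(x) batching it mentions. All names and the batch size of 100 are invented for illustration; the real procedure would do the import work where the comment sits:

CREATE PROCEDURE dbo.usp_ProcessCsvMessage
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @msgs TABLE (
        conversation_handle UNIQUEIDENTIFIER,
        message_type_name   NVARCHAR(256),
        message_body        VARBINARY(MAX) );

    WHILE 1 = 1
    BEGIN
        WAITFOR (
            RECEIVE TOP (100)              -- batch size is the second tuning knob
                conversation_handle,
                message_type_name,
                message_body
            FROM dbo.CsvImportQueue
            INTO @msgs
        ), TIMEOUT 1000;

        IF NOT EXISTS (SELECT 1 FROM @msgs)
            BREAK;                         -- queue drained; let this activated task exit

        -- ... process the batch of messages here ...

        DELETE FROM @msgs;
    END
END;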


MVDBA
Post #1300074
Posted Tuesday, May 15, 2012 2:09 AM


I think I hit the jackpot:

http://msdn.microsoft.com/en-us/library/dd576261(v=sql.100).aspx

Finally, after 2 days' worth of digging.


MVDBA
Post #1300077
Posted Tuesday, May 15, 2012 11:15 AM


michael vessey (5/15/2012)
I think I hit the jackpot:

http://msdn.microsoft.com/en-us/library/dd576261(v=sql.100).aspx

Finally, after 2 days' worth of digging.


Be forewarned, Michael: that article (and most MS Service Broker performance articles) is focused on how to increase the performance of Service Broker by reducing the overhead and impact of Service Broker itself. As such, the kinds of tasks being serviced by the activation procedures are extremely low-latency tasks, so that the articles can focus on the load from SB.

What they do not usually look at very much (and what I feel your principal performance concern is going to be) is the performance impact/overhead of concurrency and locking stalls in the service tasks themselves. That is never easy to assess, and I do not know any hard-and-fast rules other than "always test".

A theoretical upper limit is easy to set: it's the number of CPU threads plus the number of disk spindles that will be used by your service tasks. In theory, you can never have more tasks physically "in service"(*) than that anyway. Any more than that, and the excess would have to be physically "in queue" for resources, instead of in your Service Broker's queue.

The real question then becomes: between one (1) and that number, when does the CPU task-switching overhead, and/or the effect of disk-head slewing from random access, and/or the hidden, subtle logical effects of locking on the tasks, outweigh the benefit of maximizing physical resource utilization?

And there's no generally solvable formula for that (well, there is for the CPU part, but not the other two).

(* "in service" and "in queue" are the Performance & CP technical terms for using a physical resource and waiting to use a physical resource, respectively)


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #1300442
Posted Tuesday, May 15, 2012 11:31 AM


RBarryYoung (5/15/2012)

...
A theoretical upper limit is easy to set: it's the number of CPU threads plus the number of disk spindles that will be used by your service tasks. In theory, you can never have more tasks physically "in service"(*) than that anyway. Any more than that, and the excess would have to be physically "in queue" for resources, instead of in your Service Broker's queue.
...

Hmm, I should correct myself here.

Because loading in general is probabilistic, and the loading between the CPUs and each individual disk spindle tends to be uneven, this number is not the actual upper limit, but rather some factor of it to account for both the variance and the disparities. Generally "3x" is the seat-of-the-pants number that is used as a starting point if you are trying to get all resource utilizations to 100%.

Note again that for most "normal" service tasks, you'll reach the overhead+blocking crossover point well before that. (One exception is "insert-only" operations on a single, unindexed table, which are highly optimized in SQL Server.)
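
To put rough numbers on that rule of thumb for the box described earlier in the thread (likely 4 cores, unknown spindle count on EC2): the ceiling would be roughly 4 CPU threads plus spindles, and the 3x seat-of-the-pants starting point would land somewhere around 12-15 readers - purely illustrative, since the storage side is opaque. The CPU-thread half of the estimate can be read straight from the server:

-- Number of schedulers SQL Server actually has online (= usable CPU threads);
-- this is the CPU component of the "CPU threads + spindles" upper-limit estimate.
SELECT COUNT(*) AS online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';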


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #1300455
Posted Wednesday, May 16, 2012 2:16 AM


Just an update.

After some testing we have adopted a different approach - it wasn't Service Broker that was the problem (as mentioned in the previous two posts).

The 64-bit servers we have have to use the Access drivers for reading CSV files through OPENQUERY (MS, in their infinite wisdom, do not ship the standard text drivers with 64-bit Windows), and we found that these drivers simply do not scale for large files. (We have explored BULK INSERT, but found it did not meet our requirements.)

The time to import a file seems to rise exponentially based solely on the number of records in the file.

For example, a 50 MB file with 10k rows reads far faster (4 seconds) than a 10 MB file with 50k rows (69 seconds).

As such, we are delegating that work to application servers, and they are passing us the data as table-valued parameters (TVPs) via direct connection. All that remains for our procs is to insert the data. Actually this works better for us, as we can then filter the incoming data down to just the info required. Just the locking to deal with now!
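
A rough sketch of that TVP hand-off, in case it helps anyone reading later - the type, procedure, table, and column names are all invented for illustration:

CREATE TYPE dbo.CsvRowType AS TABLE (
    RowId   INT            NOT NULL,
    Payload NVARCHAR(4000) NULL );
GO

CREATE PROCEDURE dbo.usp_InsertCsvRows
    @rows dbo.CsvRowType READONLY    -- TVPs must be passed READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Plain insert; any filtering of the incoming data can happen here.
    INSERT INTO dbo.ImportedRows (RowId, Payload)
    SELECT RowId, Payload
    FROM @rows;
END;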

Service Broker will be involved at a later stage, but not as the main focus.

Thanks, guys, for contributing - although the initial solution we proposed was not scalable, at least we ruled it out and are trying something else.



MVDBA
Post #1300844