
recommendations for MAX_QUEUE_READERS option on queue
Posted Monday, May 14, 2012 9:00 AM


Does anyone have any best-practice details or suggestions on how to choose a value for the MAX_QUEUE_READERS parameter of a receive queue?

I would like to be able to run as many parallel processes as possible, but I'd like to see if anyone has any past experience or documentation relating to optimal configurations.

e.g. 1 per CPU core? 10 per CPU core?

The app is importing CSV files into tables, and I'm importing multiple files simultaneously to try and achieve maximum I/O throughput.

Any thoughts?
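
For reference, the option under discussion lives in the queue's activation clause. A minimal sketch, with the queue and procedure names invented purely for illustration and the values shown as placeholders:

CREATE QUEUE dbo.CsvImportQueue
    WITH STATUS = ON,
         ACTIVATION (
             STATUS = ON,
             PROCEDURE_NAME = dbo.usp_ProcessCsvMessage,  -- hypothetical activation proc
             MAX_QUEUE_READERS = 4,                        -- the value being tuned here
             EXECUTE AS OWNER );
GO

-- The reader count can be changed later without recreating the queue:
ALTER QUEUE dbo.CsvImportQueue
    WITH ACTIVATION ( MAX_QUEUE_READERS = 8 );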


MVDBA
Post #1299623
Posted Monday, May 14, 2012 9:30 AM


Going with the "it depends" answer. How many files are being processed, how often are they processed, how big is each file, how long does it take to process each file (an average may be good here), what else is occurring on the system, and what is the system configuration (processors, memory, disk (SAN or DASD))?

As you can see, there are a lot of variables. You may just need to play with the values until you find something that satisfies your SLA for processing files as well as the other requirements of the system.



Lynn Pettis

Post #1299656
Posted Monday, May 14, 2012 10:02 AM


Lynn Pettis (5/14/2012)
Going with the "it depends" answer. How many files are being processed, how often are they processed, how big is each file, how long does it take to process each file (an average may be good here), what else is occurring on the system, and what is the system configuration (processors, memory, disk (SAN or DASD))?

As you can see, there are a lot of variables. You may just need to play with the values until you find something that satisfies your SLA for processing files as well as the other requirements of the system.


OK - all good questions.

Number of files - thousands, each contains potentially 10k+ rows.
Average time to process - 2 seconds; max time to process - 40 seconds.
The data from the files is pumped into tables that are used for reporting and aggregation, but it's really simple stuff, no joins.
The hardware is a 64-bit Amazon instance with 64 GB RAM - unfortunately we have no control over the SAN disk config, as it's hidden from you on Amazon - but it's very slow. The CPU is most likely 4 cores.
I have no SLA and there are no other requirements (most of the work is being done by other systems; this is just for aggregation).

Rather than anyone just telling me a figure, really I'm looking for some info/past experience/best practice/guidelines so I can work it out myself without having to keep incrementing by 1 and looping through the test - once I get in the ballpark I can fine-tune it later (plus I would also like to be able to explain why I chose that figure).

Thanks.





MVDBA
Post #1299690
Posted Monday, May 14, 2012 10:39 AM


michael vessey (5/14/2012)

OK - all good questions.

Number of files - thousands, each contains potentially 10k+ rows.
Average time to process - 2 seconds; max time to process - 40 seconds.
The data from the files is pumped into tables that are used for reporting and aggregation, but it's really simple stuff, no joins.
The hardware is a 64-bit Amazon instance with 64 GB RAM - unfortunately we have no control over the SAN disk config, as it's hidden from you on Amazon - but it's very slow. The CPU is most likely 4 cores.
I have no SLA and there are no other requirements (most of the work is being done by other systems; this is just for aggregation).

Rather than anyone just telling me a figure, really I'm looking for some info/past experience/best practice/guidelines so I can work it out myself without having to keep incrementing by 1 and looping through the test - once I get in the ballpark I can fine-tune it later (plus I would also like to be able to explain why I chose that figure).

Thanks.





I missed a question: how many files (an average would be good) are received over a set time period (again, you will have to define this, as it could be every minute, every 10 minutes, etc.)? This will have an impact on how many readers you will need. If you receive 10 files a minute and each file is processed in less than a minute, you only need 10 readers. Also, are there spikes and lulls in receiving the files, and are they significant in duration?



Lynn Pettis

Post #1299716
Posted Tuesday, May 15, 2012 1:33 AM


Thanks for the reply, Lynn.

I have to assume that the files will arrive constantly on a 24x7 basis, and that the number of files will be more than a single machine can handle (worst-case scenario, there is a potential that we would receive in excess of 150 million files in the space of 24 hours).

It is acceptable for us to have files pending and a large backlog - we are under no pressure to perform to any SLA.

The thing with this is that the data is not required immediately - from receiving the first file we may have to wait 24 hours until the last file is actually sent to us (I can't discuss the actual product, but imagine it was a large retail chain where transactional data and status data are sent to a head office - plus the head office has the ability to send out questions to the till systems for a status update - if it's a Sunday and that store is closed, then we wait until it comes online).

The solution we have will scale sideways rather than demanding more powerful hardware (as we are limited by Amazon I/O), so the real task is optimising the solution so that we can run on as few machines as possible when we go live.

Really I'm not looking for a solution, more of a best-practice guide (i.e. a Microsoft web page or a white paper) so that we can say "this is the reason we chose a figure of 50".

My main lines of thinking are how many SPID threads each CPU can handle, and how I manage the locking of the inserts when more than one file goes into a single table.


MVDBA
Post #1300066
Posted Tuesday, May 15, 2012 2:04 AM


I did find this interesting snippet:

http://datageekgal.blogspot.co.uk/2008/07/sql-server-service-broker-archival.html


ARCHIVAL TIP #5: TUNE THE NUMBER OF QUEUE READERS

I've yet to find a solid recommendation for choosing the number of queue readers (threads) to activate for a given queue. One post suggested activating a queue reader per CPU. In my particular situation I've seen improved throughput with a number almost triple the CPU count.

I'm NOT suggesting that you use my numbers. Test, test, and test some more to find the thread level that allows you to keep up, or catch up, under load. For me, it was useful to play with both the MAX_QUEUE_READERS setting in the queue definition and the number of messages dequeued by the activated stored procedure at any one time (via the RECEIVE TOP(x) syntax).


Any thoughts?
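
To make the two knobs in that snippet concrete, here is a minimal activation-procedure skeleton using the RECEIVE TOP(x) batching it mentions. All names and the batch size of 100 are invented for illustration; the real procedure would do the import work where the comment sits:

CREATE PROCEDURE dbo.usp_ProcessCsvMessage
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @msgs TABLE (
        conversation_handle UNIQUEIDENTIFIER,
        message_type_name   NVARCHAR(256),
        message_body        VARBINARY(MAX) );

    WHILE 1 = 1
    BEGIN
        WAITFOR (
            RECEIVE TOP (100)              -- batch size is the second tuning knob
                conversation_handle,
                message_type_name,
                message_body
            FROM dbo.CsvImportQueue
            INTO @msgs
        ), TIMEOUT 1000;

        IF NOT EXISTS (SELECT 1 FROM @msgs)
            BREAK;                         -- queue drained; let this activated task exit

        -- ... process the batch of messages here ...

        DELETE FROM @msgs;
    END
END;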


MVDBA
Post #1300074
Posted Tuesday, May 15, 2012 2:09 AM


I think I hit the jackpot:

http://msdn.microsoft.com/en-us/library/dd576261(v=sql.100).aspx

Finally, after 2 days' worth of digging.


MVDBA
Post #1300077
Posted Tuesday, May 15, 2012 11:15 AM


michael vessey (5/15/2012)
I think I hit the jackpot:

http://msdn.microsoft.com/en-us/library/dd576261(v=sql.100).aspx

Finally, after 2 days' worth of digging.


Be forewarned, Michael: that article (and most MS Service Broker performance articles) is focused on how to increase the performance of Service Broker by reducing the overhead and impact of Service Broker itself. As such, the kinds of tasks being serviced by the activation procedures are extremely low-latency tasks, so that the articles can focus on the load from SB.

What they do not usually look at very much (and what I feel your principal performance concern is going to be) is the performance impact/overhead of concurrency and locking stalls in the service tasks themselves. That is never easy to assess, and I do not know any hard-and-fast rules other than "always test".

A theoretical upper limit is easy to set: it's the number of CPU threads plus the number of disk spindles that will be used by your service tasks. In theory, you can never have more tasks physically "in service"(*) than that anyway. Any more than that, and the excess would have to be physically "in queue" for resources, instead of in your Service Broker's queue.

The real question then becomes: between one (1) and that number, when does the CPU task-switching overhead, and/or the effect of disk-head slewing from random access, and/or the hidden, subtle logical effects of locking on the tasks, outweigh the benefit of maximizing physical resource utilization?

And there's no generally solvable formula for that (well, there is for the CPU part, but not the other two).

(* "in service" and "in queue" are the Performance & CP technical terms for using a physical resource and waiting to use a physical resource, respectively)


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #1300442
Posted Tuesday, May 15, 2012 11:31 AM


RBarryYoung (5/15/2012)

...
A theoretical upper limit is easy to set: it's the number of CPU threads plus the number of disk spindles that will be used by your service tasks. In theory, you can never have more tasks physically "in service"(*) than that anyway. Any more than that, and the excess would have to be physically "in queue" for resources, instead of in your Service Broker's queue.
...

Hmm, I should correct myself here.

Because loading in general is probabilistic, and the loading between the CPUs and each individual disk spindle tends to be uneven, this number is not the actual upper limit, but rather some factor of it to account for both the variance and the disparities. Generally "3x" is the seat-of-the-pants number that is used as a starting point if you are trying to get all resource utilizations to 100%.

Note again that for most "normal" service tasks, you'll reach the overhead+blocking crossover point well before that. (One exception is "insert-only" operations on a single, unindexed table, which are highly optimized in SQL Server.)
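
To put rough numbers on that rule of thumb for the box described earlier in the thread (likely 4 cores, unknown spindle count on EC2): the ceiling would be roughly 4 CPU threads plus spindles, and the 3x seat-of-the-pants starting point would land somewhere around 12-15 readers - purely illustrative, since the storage side is opaque. The CPU-thread half of the estimate can be read straight from the server:

-- Number of schedulers SQL Server actually has online (= usable CPU threads);
-- this is the CPU component of the "CPU threads + spindles" upper-limit estimate.
SELECT COUNT(*) AS online_schedulers
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';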


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #1300455
Posted Wednesday, May 16, 2012 2:16 AM


Just an update.

After some testing we have adopted a different approach - it wasn't Service Broker that was the problem (as mentioned in the previous two posts).

The 64-bit servers we have have to use the Access drivers for reading CSV files through OPENQUERY (MS, in their infinite wisdom, do not ship the standard text drivers with 64-bit Windows), and we found that these drivers simply do not scale for large files. (We have explored BULK INSERT, but found it did not meet our requirements.)

The time to import a file seems to rise exponentially based solely on the number of records in the file.

For example, a 50 MB file with 10k rows reads far faster (4 seconds) than a 10 MB file with 50k rows (69 seconds).

As such, we are delegating that work to application servers, and they are passing us the data as table-valued parameters (TVPs) via direct connection. All that remains for our procs is to insert the data. Actually this works better for us, as we can then filter the incoming data down to just the info required. Just the locking to deal with now!
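
A rough sketch of that TVP hand-off, in case it helps anyone reading later - the type, procedure, table, and column names are all invented for illustration:

CREATE TYPE dbo.CsvRowType AS TABLE (
    RowId   INT            NOT NULL,
    Payload NVARCHAR(4000) NULL );
GO

CREATE PROCEDURE dbo.usp_InsertCsvRows
    @rows dbo.CsvRowType READONLY    -- TVPs must be passed READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Plain insert; any filtering of the incoming data can happen here.
    INSERT INTO dbo.ImportedRows (RowId, Payload)
    SELECT RowId, Payload
    FROM @rows;
END;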

Service Broker will be involved at a later stage, but not as the main focus.

Thanks, guys, for contributing - although the initial solution we proposed was not scalable, at least we ruled it out and are trying something else.



MVDBA
Post #1300844