Service broker not able to cope with many login event notifications ...

  • SQL2005 EE x64 9.00.4035.00 (sp3)

    2 proc (4cores/proc)

    8GB ram

    On average 100 connections exists on the instance.

    We implemented "successful login" event notification to our admin database.

    This is an EPO server (McAfee ePolicy Orchestrator anti virus monitoring).

    SQLserver has +- 50 succesful logins / second.

    After a day or so the admin db has grown to 17GB because of the backlog of the queue processing.

    I tried starting the queue with different max_queue_readers settings going from 1 to 10.

    Still it is not able to cope with the load;

    If I have more than one queue reader it suffers many blocking threads ( blocks occuring inter queue reader processes )

    Any hints ?

    ps FYI:

    -- Q_023nRows

    -- Q_023 - [Q_Logon_Tracker_Queue] - [2010-01-22 13:21:07.847]600

    -- Q_023 - [Q_Logon_Tracker_Queue] - [2010-01-22 13:29:37.407]11321

    -- Q_023 - [Q_Logon_Tracker_Queue] - [2010-01-22 13:42:07.440]25053

    -- Q_023 - [Q_Logon_Tracker_Queue] - [2010-01-22 13:45:52.037]75918

    -- Q_023 - [Q_Logon_Tracker_Queue] - [2010-01-22 14:01:24.473]291805

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • ALZ: Well, you know the drill :-), we'll need to see the sending code, the receiving code, the SB definitions and the definitions of any tables touched by the code.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Hi Barry, must have been "last day of the week syndrome" 😉

    Off course I should have posted the used DDL.

    As stated in my previous post, i tried to play with the number of queue readers, but with no solution.

    Here's the code:

    /*

    DBA_ConnectionTracker: Maintain an overview of which connections are made to this SQLServer instance

    -- This implementation user SQLServer Service Brocker with Event Notifications

    */

    USE DDBAServerPing;

    if object_id('dbo.T_DBA_ConnectionTracker') is null

    begin

    print 'Table [T_DBA_ConnectionTracker] Created';

    CREATE TABLE [dbo].[T_DBA_ConnectionTracker](

    [host_name] [varchar](128) NOT NULL,

    [program_name] [varchar](128) NOT NULL,

    [nt_domain] [varchar](128) NOT NULL,

    [nt_user_name] [varchar](128) NOT NULL,

    [login_name] [varchar](128) NOT NULL,

    [original_login_name] [varchar](128) NOT NULL,

    [client_net_address] [varchar](48) NOT NULL,

    [tsRegistration] datetime NOT NULL default getdate(),

    [tsLastUpdate] datetime NOT NULL default getdate()

    ) ;

    Create clustered index clX_DBA_ConnectionTracker on [dbo].[T_DBA_ConnectionTracker] ([tsRegistration]);

    Create index X_DBA_ConnectionTracker on [dbo].[T_DBA_ConnectionTracker] ([login_name], [program_name]);

    end

    -- Enable Service Broker for DDBAServerPing database if it's the case

    IF EXISTS( SELECT *

    FROM sys.databases

    WHERE [name]=N'DDBAServerPing'

    AND is_broker_enabled = 0 )

    Begin

    print 'SSB enabled';

    ALTER DATABASE DDBAServerPing SET ENABLE_BROKER;

    END

    -- We will access from the activated stored procedure a view that is

    -- located in a different database

    -- the sys.dm_exec_sessions dynamic management view

    -- The security context of the stored procedure would not allow us to do so

    -- unless we set the the TRUSTWORTHY option to ON.

    -- Why and another method you can find here:

    -- http://blogs.msdn.com/remusrusanu/archive/2006/03/01/541882.aspx

    IF EXISTS( SELECT *

    FROM sys.databases

    WHERE [name]=N'DDBAServerPing'

    AND is_trustworthy_on = 0 )

    Begin

    print 'trustworthy switched to ON';

    ALTER DATABASE DDBAServerPing SET TRUSTWORTHY ON;

    END

    -- Create a queue

    CREATE QUEUE Q_Logon_Tracker_Queue;

    -- Create a service

    CREATE SERVICE S_Logon_Tracker_Service

    ON QUEUE Q_Logon_Tracker_Queue([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);

    -- Create a route

    CREATE ROUTE R_Logon_Tracker_Route

    WITH SERVICE_NAME = N'S_Logon_Tracker_Service'

    , ADDRESS = N'LOCAL';

    go

    /* current user get ownership of EVENT NOTIFICATION, so switch to 'sa' */

    EXEC AS LOGIN = 'sa';

    go

    -- Create the event notification at the server level for the AUDIT_LOGIN event

    CREATE EVENT NOTIFICATION N_Successfull_Login_Notification

    ON SERVER FOR AUDIT_LOGIN

    TO SERVICE 'S_Logon_Tracker_Service', 'current database';

    go

    /* Switch back to original user */

    REVERT;

    GO

    -- Create the stored procedure that will handle the events

    -- First set the options required to work with the XML data type

    SET ANSI_NULLS ON;

    SET QUOTED_IDENTIFIER ON;

    GO

    CREATE PROCEDURE dbo.spc_DBA_ConnectionTracker

    AS

    BEGIN

    SET NOCOUNT ON;

    -- Use an endless loop to receive messages

    WHILE (1 = 1)

    BEGIN

    DECLARE @messageBody VARBINARY(MAX);

    DECLARE @messageTypeName NVARCHAR(256);

    WAITFOR (

    RECEIVE TOP(1)

    @messageTypeName = message_type_name,

    @messageBody = message_body

    FROM Q_Logon_Tracker_Queue

    ), TIMEOUT 500

    -- If there is no message, exit

    IF @@ROWCOUNT = 0

    BEGIN

    BREAK ;

    END ;

    -- If the message type is EventNotification do the actual work

    IF (@messageTypeName = 'http://schemas.microsoft.com/SQL/Notifications/EventNotification')

    BEGIN

    DECLARE @XML XML,

    @host_name varchar(128) ,

    @program_name varchar(128) ,

    @nt_domain varchar(128) ,

    @nt_user_name varchar(128) ,

    @login_name varchar(128) ,

    @original_login_name varchar(128) ,

    @client_net_address varchar(48) ,

    @ts_logon datetime,

    @SPID VARCHAR(5);

    SELECT @XML=CONVERT(XML,@messageBody)

    ,@host_name = ''

    ,@program_name = ''

    ,@nt_domain = ''

    ,@nt_user_name = ''

    ,@login_name = ''

    ,@original_login_name = ''

    ,@client_net_address =''

    ,@SPID ='';

    -- Get the SPID and the Login name using the value method

    SELECT @SPID = @XML.value('(/EVENT_INSTANCE/SPID)[1]', 'VARCHAR(5)')

    , @ts_logon = @XML.value('(/EVENT_INSTANCE/StartTime)[1]', 'NVARCHAR(128)')

    , @host_name = @XML.value('(/EVENT_INSTANCE/HostName)[1]', 'NVARCHAR(128)')

    , @program_name = @XML.value('(/EVENT_INSTANCE/ApplicationName)[1]', 'NVARCHAR(128)')

    , @nt_domain = @XML.value('(/EVENT_INSTANCE/NTDomainName)[1]', 'NVARCHAR(128)')

    , @nt_user_name = @XML.value('(/EVENT_INSTANCE/NTUserName)[1]', 'NVARCHAR(128)')

    , @original_login_name = @XML.value('(/EVENT_INSTANCE/SessionLoginName)[1]', 'NVARCHAR(128)')

    , @login_name = @XML.value('(/EVENT_INSTANCE/LoginName)[1]', 'NVARCHAR(128)')

    ;

    -- Try to get client_net_address

    Select @client_net_address=isnull(rtrim(EC.[client_net_address]),'')

    from sys.dm_exec_sessions ES

    inner join sys.dm_exec_connections EC

    on EC.[session_id] = ES.[session_id]

    where ES.[session_id] = @SPID

    and ES.[nt_domain] = @nt_domain

    and ES.[nt_user_name] = @nt_user_name

    and ES.[login_name] = @login_name

    Select @program_name = case when upper(@program_name) like 'DATABASEMAIL - %'

    then substring(@program_name, 1, charindex('<',@program_name,1) - 1 )

    else @program_name

    end

    /* 'BiztalkServiceAccount' geef message GUID als application name mee !! register this user only once */

    Updatedbo.T_DBA_ConnectionTracker

    set [tsLastUpdate] = case when [tsLastUpdate] < @ts_logon then @ts_logon

    else [tsLastUpdate]

    end

    Where [host_name] = @host_name

    and case when upper([login_name]) like 'BiztalkServiceAccount%' then @program_name

    else [program_name]

    end = @program_name

    -- and [program_name] = @program_name

    and [nt_domain] = @nt_domain

    and [nt_user_name] = @nt_user_name

    and [login_name] = @login_name

    and [original_login_name] = @original_login_name

    and [client_net_address] = ISNULL(@client_net_address,'')

    if @@rowcount = 0

    begin

    INSERT INTO [dbo].[T_DBA_ConnectionTracker]

    ([host_name], [program_name], [nt_domain], [nt_user_name], [login_name], [original_login_name], [client_net_address], [tsRegistration], [tsLastUpdate] )

    values ( @host_name, @program_name, @nt_domain, @nt_user_name, @login_name, @original_login_name, ISNULL(@client_net_address,''), @ts_logon, @ts_logon);

    end

    /*

    Select *

    from dbo.T_DBA_ConnectionTracker

    order by [tsLastUpdate] desc

    */

    END;

    END;

    END;

    GO

    if object_id('dbo.spc_DBA_ConnectionTracker') is not null

    begin

    -- Link the stored procedure to the Q_Logon_Tracker_Queue

    ALTER QUEUE Q_Logon_Tracker_Queue

    WITH STATUS=ON

    , ACTIVATION ( STATUS=ON

    , PROCEDURE_NAME = dbo.spc_DBA_ConnectionTracker

    , MAX_QUEUE_READERS = 4

    , EXECUTE AS SELF) ;

    end

    ELSE

    BEGIN

    RAISERROR ('DBA Message: SSB Queue ConnectionTracker NOT Activated !!! ', 1,1 ) WITH log

    END

    /*

    USE DDBAServerPing;

    GO

    -- clean up only if needed

    /* In case of Problems first drop the EVENT NOTIFICATION subscription */

    --DROP EVENT NOTIFICATION N_Successfull_Login_Notification ON SERVER

    --ALTER QUEUE Q_Logon_Tracker_Queue WITH STATUS=OFF

    --DROP EVENT NOTIFICATION N_Successfull_Login_Notification ON SERVER

    --DROP ROUTE R_Logon_Tracker_Route

    --DROP SERVICE S_Logon_Tracker_Service

    --DROP QUEUE Q_Logon_Tracker_Queue;

    --DROP PROCEDURE dbo.spc_DBA_ConnectionTracker

    --

    */

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I have the vast impression it must be something with this sqlinstance... the box should be able to cope with the threads without a problem.

    Another thing is we are having issues upgrading this x64 ent edtn to SP3.

    We are awaiting the MS pss solution for this upgrade issue.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • It seems that you could change your activation proc to use a cursor or since you are not sending any messages back (nor ending the conversation) you could even process the queue as an entire set:

    http://rusanu.com/2006/10/16/writing-service-broker-procedures/

    I think this would be much, much faster then extracting one message at a time from the queue.

  • j.a.c (2/9/2010)


    It seems that you could change your activation proc to use a cursor or since you are not sending any messages back (nor ending the conversation) you could even process the queue as an entire set:

    http://rusanu.com/2006/10/16/writing-service-broker-procedures/

    I think this would be much, much faster then extracting one message at a time from the queue.

    Um, no. First, Cursors are not faster, they are slower. And secondly, Service Broker Queues do not support Cursors (thank God).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Check out the link that I provided.

    It shows quite clearly that cursors are much faster in this instance than only processing one row per receive statement.

    I am using a cursor in an activation proc that is very similar to the OP.

  • ALZ, sorry I took so long to get back to you...

    Anyway, I took a look at this and although 50/logins a second is a lot, the Queue itself should be able to handle it. IMHO it is much more likely to be the Activation Procedure that is your bottleneck.

    On thing I noticed in it is that it is always attempting the UPDATE, then if no rows were updated, it tries the INSERT. The UPDATE statement looks like this:

    Update dbo.T_DBA_ConnectionTracker

    set [tsLastUpdate] = case when [tsLastUpdate] < @ts_logon

    then @ts_logon

    else [tsLastUpdate] end

    Where [host_name] = @host_name

    and case when upper([login_name]) like 'BiztalkServiceAccount%' then @program_name

    else [program_name]

    end = @program_name

    -- and [program_name] = @program_name

    and [nt_domain] = @nt_domain

    and [nt_user_name] = @nt_user_name

    and [login_name] = @login_name

    and [original_login_name] = @original_login_name

    and [client_net_address] = ISNULL(@client_net_address,'')

    And here's the table definition:

    CREATE TABLE [dbo].[T_DBA_ConnectionTracker](

    [host_name] [varchar](128) NOT NULL,

    [program_name] [varchar](128) NOT NULL,

    [nt_domain] [varchar](128) NOT NULL,

    [nt_user_name] [varchar](128) NOT NULL,

    [login_name] [varchar](128) NOT NULL,

    [original_login_name] [varchar](128) NOT NULL,

    [client_net_address] [varchar](48) NOT NULL,

    [tsRegistration] datetime NOT NULL default getdate(),

    [tsLastUpdate] datetime NOT NULL default getdate()

    ) ;

    Create clustered index clX_DBA_ConnectionTracker

    on [dbo].[T_DBA_ConnectionTracker] ([tsRegistration]);

    Create index X_DBA_ConnectionTracker

    on [dbo].[T_DBA_ConnectionTracker] ([login_name], [program_name]);

    What jumps out at me is that the [program_name] search code has apparently been recently changed, and in such a manner that it is no longer SARGable.

    Since [tsRegistration] obviously is not (and cannot be) used for the lookup, the efficiency of every UPDATE (for every login @ 50/second) rests entirely on the [login_name]. That should probably be OK (though a bit marginal), so long as [login_name] is well distributed.

    However, if most of your logins are actually coming in with the SAME login_name (such as an ASP.Net app, or BizTalk can be setup to do), then the [login_name] will provide almost no indexing discrimination and you would probably get exactly the symptoms that you are seeing. So I would suggest checking that out, and seeing if you cannot improve you index-lookup situation somehow.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • j.a.c (2/10/2010)


    Check out the link that I provided.

    I did. three years ago when Remus wrote it.

    It shows quite clearly that cursors are much faster in this instance than only processing one row per receive statement.

    It shows nothing of the kind, because you are setting up a false dichotomy, that the choice is between processing one row per RECEIVE or using a Cursor, and that is not at all true as this article itself shows in great detail.

    Here are the problems with your claim:

    1. As I stated, you CAN NOT use Cursors on an SB Queue.

    Nor does this article do that or even claim that you can. In fact, Remus explicitly says that you cannot. Instead what Remus is actually doing is using a bulk-RECEIVE statement to dump the queue message into a Table Variable and then is using a Cursor on that. Not at all the same thing.

    2. The Cursor is NOT what causes the Speed-Up in this Test.

    In fact it is the Bulk-RECEIVE that is providing all of the performance improvement. You can see this by the fact that of the three different tests that he does with the Bulk-RECEIVE technique, the Cursors are by far the slowest. The Cursors are in fact slowing it down significantly.

    3. The non-Cursor (Set-based) Methods are faster.

    As above, keep the Bulk-RECEIVE and use any method other than cursors, (that is, anything set-based) and it's actually faster.

    4. The Test are Artificial:

    Remus's tests's work by setting up an artificial situation: he backlogs 10,000 messages in the queue. The problem with this is that it probably is not normal for ALZ to have that many messages in the Queue, unless his performance has already gone belly-up. And there is every possibility that these bulk-methods are actually slower for processing when there is only a single message ready in the queue at a time.

    5. The RECEIVE Isn't the Problem Anyway:

    All of this silliness about using a cursor on a Queue is predicated on the assumption that it is the Activation Procedure's use of the RECEIVE statement that is the problem, and all or the evidence actually argues against that. And in fact, Remus's article, which you reference proves it.

    In That article when Remus uses the same RECEIVE technique that ALZ is using he is getting 383 message a second! And ALZ's actual technique should be even faster because ALZ is not using transactions (i.e., no COMMITs at all). You will note that 383 per second is considerably faster than the 50 message a second that ALZ should be receiving from Login events. Further, Remus's tests were on what he described as his "measly" single-core 2.4ghz P4, home system. ALZ's queue is running on an 8-core x64 Enterprise Server(!), I think it should be able to keep up with 5 message a second.

    All of which leads us back to what I posted above: the problem isn't with his RECEIVE, and even if it was, he would be better off going with a set-based solution, rather than Cursors. However, what he should be doing is looking at other potential causes and solutions, in particular, the use of indexes on his Tracker table.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I may have an additional insight for you here. I've developed a number of solutions now using Service Broker and I ran across a similar problem the first time I tried to use activation procedures on a SB queue. The queues were locking up, blocking each other, and the multi-threading performance of defining the MAX_QUEUE_READERS seemed like it was tripping over itself.

    Here's what I found that is relevant to how you've set up your activation procedure. You need to remove the WHILE (1=1) loop and the WAITFOR(), TIMEOUT statement. This type of setup is great when you want to call a SP from SSMS and have that SP monitor a SB queue, but when using a SP as an activation procedure, you don't want the SP to hang around and wait. The SP does not need to WAITFOR a message as Service Broker will invoke the SP when a message is added to the queue.

    You want the SP to grab one message, process it, and end. Service Broker will manage the queue load and call as many instances of the SP (up to the MAX_QUEUE_READERS parameter) as it needs to process the queue. Once the contents of the queue have been processed, you should not see instances of your SP lingering around, which you will if you use the loop and WAITFOR statement.

    So, try removing the WHILE loop and WAITFOR statements and I think your problems will be solved. From my experience, when Service Broker is able to load balance itself and scale the # of instances of the activation procedure, it can fly through a queue and becomes a very scalable solution.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • j.a.c (2/10/2010)


    Check out the link that I provided.

    It shows quite clearly that cursors are much faster in this instance than only processing one row per receive statement.

    I am using a cursor in an activation proc that is very similar to the OP.

    OK, I want to preemptively apologize for the tone in my responding post above. I believe that my points are technically correct, but my attitude is inappropriate and unfair. Sorry.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks Barry,

    I don't think I ever stated that cursors could be used to receive data from the queue. I stated that they could be used in the activation proc which is very true.

    In some cases I am not sure if you can easily use a set based solution in an activation proc if you need to end a conversation or return some kind of message.

    So if you need to return some kind of message or end a conversation then using a cursor to process messages inserted into a table variable (bulk-receive) is probably faster than receiving one message at a time from the queue.

    I should have been exactly explict on this.

  • I would have to disagree with the bulk-receive approach anyways. His activation procedure is using the eternal WHILE loop method as well. This will not allow SB to properly scale up and process the queue. The bulk-receive may be faster versus an activation proc w/ an eternal loop, but I wouldd like to see the comparison between that and an activation proc w/o the loop. The bulk receive method makes processing the queue a serial process whereas using SB to scale up your activation procs give you multi-threading.

    Yea, yea, I know. I should write out the test and prove it.....I can, but not today. Give me some time and I'll set the test up.

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • John Rowan (2/10/2010)


    I would have to disagree with the bulk-receive approach anyways. His activation procedure is using the eternal WHILE loop method as well. This will not allow SB to properly scale up and process the queue. The bulk-receive may be faster versus an activation proc w/ an eternal loop, but I wouldd like to see the comparison between that and an activation proc w/o the loop. The bulk receive method makes processing the queue a serial process whereas using SB to scale up your activation procs give you multi-threading.

    There's no reason that you cannot use both, John, in the same proc even. And Bulk-Receive isn't a all-or-nothing deal, you can parametrize/scale it if you want. ALZ's usage is particularly amenable to this kind of approach.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thank you all for your replies.

    Give me some time to interpret them and check them on this instance.

    A little work around I 've done in the mean while, is to not perform the update right away, but first check if it needs an update. So I perform a select and check the tslastupdate if it needs to be altered ( so it only records a login once a day ). If that is the case it will perform an update, if no row is found, it will perform the insert.

    But still .... it doesn't perform ... I must be overlooking the obvious.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply