Serivce Broker 31+ million Disconnected_Inbound conversations

  • Hi,

    Running this query on one of our servers returned some scary numbers.

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    SELECTCOUNT(*) ConvCount, S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service

    FROMsys.Conversation_EndPoints CEP

    INNER JOIN sys.services S

    ON CEP.service_id = S.service_id

    GROUP BY S.Name, CEP.is_initiator, CEP.[State], CEP.[state_desc], CEP.far_service

    Results:

    ConvCountNameis_initiatorStatestate_descfar_service

    120,189InitiatorService1COCONVERSINGTargetService

    120,189TargetService0COCONVERSINGInitiatorService

    31,153,305InitiatorService1DIDISCONNECTED_INBOUNDTargetService

    1InitiatorAndTargetService1COCONVERSINGInitiatorAndTargetService

    1InitiatorAndTargetService0COCONVERSINGInitiatorAndTargetService

    29,385InitiatorAndTargetService1DIDISCONNECTED_INBOUNDInitiatorAndTargetService

    All services are in the same database. The activation procedure on the InitiatorService service has been altered to end conversations properly.

    InitiatorAndTargetService uses a dialog pool base on this MSDN Blog.

    To clean up I was going to end all the conversation like this:

    DECLARE@SQLVarchar(8000) = '.',

    @LoopCountSmallInt = 0

    BEGIN TRANSACTION

    WHILE LEN(@SQL) > 0 AND @LoopCount < 100

    BEGIN

    SET @SQL = ''

    SELECTTOP 115

    @SQL = @SQL + 'END CONVERSATION ''' + CAST([conversation_handle] AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)

    FROMsys.conversation_endpoints WITH (NOLOCK)

    WHEREfar_service = 'RemoteTargetService'

    AND[state] = 'DI'

    EXEC(@SQL)

    SET @LoopCount = @LoopCount + 1

    END

    COMMIT TRANSACTION

    Code takes 15 seconds to run. This means to around 11 hours to clean it all up.

    So my only practical option is:

    ALTER DATABASE MyReallyImportantDatabase SET NEW_BROKER

    The queues are filled via a C# application and a SQL Agent Job.

    What can I do to mitigate data loss? Something like:

    1) Pause C# app and Jobs,

    2) Wait for existing conversations to end

    3) End any stubborn conversations

    4) Alter database

    5) Unpause C# app and Jobs

    I am not sure if the C# application can be paused and, if it can, what the effects will be. So alternative idea's are welcome.

    There are no routes on this server.

    Remus Rusanu[/url] answered a similar question here.

    Can anyone second his claim "If everything you have is local, in the same db, then you have basically nothing to worry about."?

    What kind of Event Notifications issues can I expect?

    Cheers

  • I used ALTER DATABASE <DBName> NEW_BROKER WITH ROLLBACK IMMEDIATE to clean up a couple of databases with 10's millions open conversations with state "DI" (Disconnected Inbound).

    The biggest one only took 116 ms and the next biggest 1203 ms.

    When I tried the same on another database with a few million open conversations, I had to cancel after a 60+ secs.

    Although I can't be positive of this, system SPID 25 "BRKR TASK" started a long blocking chain due to the rollback.

    Only by seeing changes in sys.sysProcesses.waitResource for SPID 25, could I be sure that the SPID was actually doing anything.

    (Service restart narrowly avoided.)

    Also these huge amounts of "Disconnected" conversations caused the TempDBs to bloat.

    Looking at the properties of TempDB, all looked fine.

    Size : 59356.31 MB

    Space Available : 59089.55 MB

    Although it looks like there is 267 MB in use, I could not shrink the file

    The space available is incorrect as it is still reserved by Internal Objects.

    As these results from Michael Valentine Jones's query shows.

    VersionStore%UserObject%InternalObject%FreeSpace%TotalPagesSizeMBs

    0.0030.00996.5223.4666051059259092.38

    After the blocking chain issue, I have decided to follow Jacob Sebastians's advice and clean up RBAR style.

    Remus Rusanu[/url] provides some great info on how to prevent conversation leaks.

    If you need to use NEW_BROKER, try to stop the queue's from being filled (Blocking a firewall port worked for us on one of the queues) then wait for them to empty

    or at least do this when the service is as quiet as possible and you have time to rollback if needed.

    If multiple databases need doing, take your time. Do them one by one and check if the coast is clear before moving on to the next database.

  • I ended up using the code below to end the conversations.

    Cicra 3.5 million conversations per 30 mins.

    You might have to rinse and repeat to deal with all the conversations.

    SET NOCOUNT ON

    IF OBJECT_ID('TempDB..#Convs') IS NOT NULL

    DROP TABLE #Convs

    CREATE TABLE #Convs

    (

    ConvIDInt IDENTITY PRIMARY KEY,

    ConvHndlUniqueIdentifier,

    ProcessedBit

    )

    INSERT#Convs

    SELECT[Conversation_Handle], 0

    FROMsys.conversation_endpoints CE WITH (NOLOCK)

    WHERE[State] IN ('DI', 'DO')

    DECLARE@SQLVarchar(8000) ='.',

    @LoopCountSmallInt = 0,

    @MaxLoopsSmallInt = 10000,

    @LoopTimerDateTime,

    @StartRowInt

    WHILE EXISTS

    (

    SELECTTOP 1 1

    FROM#Convs

    WHEREProcessed = 0

    )

    BEGIN

    SET @StartRow = (SELECT ISNULL(MAX(ConvID), 0) FROM #Convs WHERE Processed = 1)

    SET @LoopTimer = GETDATE()

    SET @LoopCount = 0

    BEGIN TRANSACTION

    WHILE LEN(@SQL) > 0 AND @LoopCount < @MaxLoops

    BEGIN

    SET @LoopCount = @LoopCount + 1

    SET @SQL = ''

    SELECT@SQL = @SQL + 'END CONVERSATION ''' + CAST(ConvHndl AS Varchar(40)) +''' WITH CLEANUP' + CHAR(13)

    FROM#Convs

    WHEREProcessed = 0

    AND ConvID BETWEEN @StartRow AND @StartRow + 100 -- 100 rows at a time. 100 * @MaxLoops = commit size

    BEGIN TRY

    EXEC(@SQL)

    UPDATE#Convs

    SETProcessed = 1

    WHEREConvID BETWEEN @StartRow AND @StartRow + 100

    END TRY

    BEGIN CATCH

    PRINT ERROR_MESSAGE()

    END CATCH

    -- Doesn't matter if there are errors or not, continue anyway.

    SET @StartRow = @StartRow + 100

    END

    COMMIT TRANSACTION

    -- For our viewing pleasure and to get an idea of how long it will take to complete.

    PRINT 'Cicra ' + CAST((@LoopCount * 100) AS Varchar(7)) + ' conversations ended in ' + CAST(DATEDIFF(S, @LoopTimer, GETDATE()) AS Varchar(10)) + ' seconds.'

    END

  • Still dealing with the bloated TempDBs : http://www.sqlservercentral.com/Forums/Topic1607443-1550-1.aspx?Update=1

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply