Deadlock Issue encountered while Replication and JobRead oocur simultaneously

  • Hi,

          One our Reporting Application has gone to Production last week, and we are encountering the below issue only on Production Environment.

     

    [This is going to be a lengthy mail J ]

     

    Scenario:

         Our Server has a Continuous Transactional Replication (Pull) from an Upstream System on 1 Database. This Upstream System does a Delete and Insert of records (5-6 Tables) within a transaction for every 2 Hrs into the Database which eventually gets replicated on subscriber. While this happening couple of our Jobs (Job Span: 3-5Hrs) are trying to read the Data from same database (/Tables) where Insert/Delete is in process.

     

    Issue

          In production environment we consistently faced that the Job has failed with below error

    Executed as user: PARTNERS\_prtsit1. Transaction (Process ID 53) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. [SQLSTATE 40001] (Error 1205).  The step failed.

     

    We are unable to Repro this scenario in any of the other environments inspite of repeated tries.Server Configuration is provided below in mail.

     

    Investigation Done:

           Enabled TraceFlag for Error-1204, 1205 and captured the below log in ServerLog Files for one of the failure.

    Date

    Source

    Message

    3/6/06 11:16 PM

    spid55

    Configuration option 'show advanced options' changed from 1 to 1. Run the RECON

    3/6/06 11:16 PM

    spid55

    Error: 15457

    3/6/06 11:05 PM

    spid4

    ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec0x1D8935F0) Value:0x2af

    3/6/06 11:05 PM

    spid4

    Victim Resource Owner:

    3/6/06 11:05 PM

    spid4

    ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec0x1D8935F0) Value:0x2af

    3/6/06 11:05 PM

    spid4

    Requested By:

    3/6/06 11:05 PM

    spid4

    Input Buf: RPC Event: sp_MSins_Task_Dimension;1

    3/6/06 11:05 PM

    spid4

    SPID: 53 ECID: 0 Statement Type: INSERT Line #: 7

    3/6/06 11:05 PM

    spid4

    Owner:0x205acbc0 Mode: IX       Flg:0x0 Ref:2 Life:02000000 SPID:53 ECID:0

    3/6/06 11:05 PM

    spid4

    Grant List 3::

    3/6/06 11:05 PM

    spid4

    PAG: 6:1:9041                  CleanCnt:2 Mode: IX Flags: 0x2

    3/6/06 11:05 PM

    spid4

    Node:2

    3/6/06 11:05 PM

    spid4

     

    3/6/06 11:05 PM

    spid4

    ResType:LockOwner Stype:'OR' Mode: IX SPID:53 ECID:0 Ec0x6973B598) Value:0x20

    3/6/06 11:05 PM

    spid4

    Requested By:

    3/6/06 11:05 PM

    spid4

    Input Buf: Language Event: EXEC Weekly_Tue_NAMERICA

    3/6/06 11:05 PM

    spid4

    SPID: 55 ECID: 0 Statement Type: SELECT INTO Line #: 422

    3/6/06 11:05 PM

    spid4

    Owner:0x2afc0f20 Mode: S        Flg:0x0 Ref:1 Life:00000000 SPID:55 ECID:0

    3/6/06 11:05 PM

    spid4

    Grant List 0::

    3/6/06 11:05 PM

    spid4

    TAB: 6:418816554 []            CleanCnt:2 Mode: S Flags: 0x0

    3/6/06 11:05 PM

    spid4

    Node:1

    3/6/06 11:05 PM

    spid4

     

    3/6/06 11:05 PM

    spid4

    Wait-for graph

    3/6/06 11:05 PM

    spid4

     

    3/6/06 11:05 PM

    spid4

    ...

     

     

    We understood that from the above log that Replication Job has IX lock on Table and our Job has Shared lock on that table.

    So SQL Server chose our Process as victim and hence terminated

     

    Questions:

    1)       Is our understanding on the Log correct??

    2)       If correct then shouldn’t be process which has shared lock be in “waiting” mode rather than be killed by SQL Server?? ( We simulated on another environment where the query is blocked till Write process got completed). Is there any threshold on wait time??

    3)       What could be the cause of this behavior on a particular environment?? What is the suggested Fix??

     

     

     

    Environment Details:

    QA

    Microsoft SQL Server  2000 - 8.00.997 (Intel X86)

    Dec 17 2004 22:15:27

    Copyright (c) 1988-2003 Microsoft Corporation

    Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

     

    OS: Windows 2003- SP1 IPAK NT5.25 (V3.0) and SP2

    SQL2000: SP3 [IPAKSQL_8010(RTM 8.00.997.SP3a v1.3)

     

    Machine: 8 Processor

    SQLServer Processor Setting: All 8 Processors enabled for Usage

                                     Parallelism Option: Use all available Processors

                                   

     

    UAT

    Microsoft SQL Server  2000 - 8.00.997 (Intel X86)

    Dec 17 2004 22:15:27

    Copyright (c) 1988-2003 Microsoft Corporation

    Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

     

    OS: Windows 2003- SP1 IPAK NT5.25 (V3.0) and SP2

    SQL2000: SP3 [IPAKSQL_8010(RTM 8.00.997.SP3a v1.3)

     

    Machine:8 Processor

    SQLServer Processor Setting: All 8 Processors enabled for Usage

                                     Parallelism Option: Use all available Processors

     

     

    Prod

    Microsoft SQL Server  2000 - 8.00.2040 (Intel X86)

    May 13 2005 18:33:17

    Copyright (c) 1988-2003 Microsoft Corporation

    Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 1)

     

    OS: Windows 2003- SP1 IPAK NT5.25 (V3.0) and SP2

    SQL2000: SP4 [IPAKSQL_8011(RTM 8.00.2040.SP4 v1.1)]

     

    Machine:8 Processor

    SQLServer Processor Setting: All 8 Processors enabled for Usage

                                     Parallelism Option: Use all available Processors

     

     

    Any suggestions would be of great Help. Thank You.

     

     

     Regards,

    Pavan Keerthi

    SIT-Report Authoring Test

    Microsoft India  

    Work: +91-040-55113008

    Mail:v-pavake@microsoft.com

     

     

     


    Kindest Regards,

    Keerthi Pavan Kumar

  • This was removed by the editor as SPAM

  • This might be kind of a "hokey" suggestion but what the heck.  For the process doing the read, would it be possible to do a WITH(NOLOCK) on the table to avoid any locking what so ever?

    Just a thought. 

    Mark

  • I am having th e same issue. But its our subscriber where deadlock is happening. Any clue why and how to fix it?

    :crazy: :alien:

    Umar Iqbal

  • Hi,

    You have two choices:

    1. Use the WITH (NOLOCK) clause mentioned above, this case you'll read out uncommitted transactions as well.

    2. Use the WITH (READPAST) clause, this case you'll skip the locked rows. You mentioned that there are insert/delete operations on the table, so this second solution could be better for you since you'll read data which is surely in the database, like this import wouldn't exist.

    For Umar: probably the first solution is the better. Your problem is that there's a shared lock normally on selected rows and the distribution agent waits for some rows while already locked others... With replication, you should always be aware of what kind of queries you run against the data while distribution agents run.

    HTH,

    Erik

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply