Wait type LATCH_EX [NESTING_TRANSACTION_FULL] - and blocking.

  • We have this ad hoc query that is a recurring block leader, blocking dozens of other SPs and adhoc query. This code has not changed in years. The query had never been a massive blocker before a few days ago, and never caused issues. now it does. It causes latencies, users are waiting and timing out at the classic asp screen where this ad hoc is being assembled and then submitted.

    Underlying tables have not changed either.

    sp_whoisactive also reports this wait type associated with the blocker: LATCH_EX [NESTING_TRANSACTION_FULL].

    the blockEEs usually show these kinds of waits: (3ms)PAGELATCH_UP:tempdb:12(PFS)

    or sometimes (1916ms)CXCONSUMER.

    Code rewrite is the least desired option, unless it is minor. What do you suggest that we do? (I wish I could attach query plan...)

    all tables have 10+ million rows.

    DROP TABLE IF EXISTS #TempAltMemberIDs

    SELECT m.Mem_HHMemberRefID
    ,md.MemD_Data AS AltMemberID
    INTO #TempAltMemberIDs
    FROM HHWorking.dbo.Member m WITH (NOLOCK)
    INNER JOIN HHWorking.dbo.MemberData md(NOLOCK) ON m.Mem_HHMemberRefID = md.MemD_HHMemberRefID
    WHERE m.Mem_PayorCode = 'abcd'
    AND md.MemD_Label = 'Axxzd'
    AND md.MemD_Data = 'K777C11'

    DROP TABLE IF EXISTS #tmpTrans
    DROP TABLE IF EXISTS #tmpT2

    SELECT TranID
    INTO #tmpTrans
    FROM CRIS.dbo.tblTrans tmp WITH (NOLOCK)
    WHERE (
    (Member_Number = 'abcdefg')
    OR (
    MemberID IN (
    SELECT Mem_HHMemberRefID
    FROM #TempAltMemberIDs
    )
    )
    )

    SELECT TranID = As_ReferralTransactionId
    ,IsCurrentlyAssigned = CASE
    WHEN As_AssigneeUserId = - 1999999998
    THEN 1
    ELSE 0
    END
    INTO #tmpT2
    FROM #tmpTrans tmp
    INNER JOIN Consult.dbo.Assessment asm WITH (NOLOCK) ON As_ReferralTransactionId = tmp.TranID
    INNER JOIN (
    SELECT MinAssmLogId = MIN(AL_AssessmentLogID)
    ,DateAssigned = MIN(AL_DateModified)
    ,TranID = AL_TransID
    FROM #tmpTrans tmp
    INNER JOIN Consult.dbo.AssessmentLog WITH (NOLOCK) ON al_TransID = tmp.tranId
    WHERE Al_AssigneeUserID = - 1999999998
    GROUP BY AL_TransID
    ) AS vt ON vt.TranID = asm.As_ReferralTransactionId

    SELECT tblTrans.TranID
    ,tblTrans.Van_ID
    ,tblTrans.DateCreated
    ,TranDate
    ,TranTime
    ,LastUser
    ,As_AssigneeUserID
    ,POS
    ,Member_Number
    ,MemberCity
    ,MemberState
    ,tblTrans.Payor
    ,tblTrans.PhysicianID
    ,CreateUser
    ,Pay_PayerName
    ,Tier2Bypass = CASE
    WHEN t2b.tranId IS NOT NULL
    THEN 1
    ELSE 0
    END
    ,IsCurrentlyAssigned = ISNULL(IsCurrentlyAssigned, 0)
    ,ISNULL(TotalNumProcedures, 1) AS TotalNumProcedures
    ,ISNULL(ProcedureCount, 1) AS ProcedureCount
    ,ProcCode
    ,CASE
    WHEN Proc_ShortDesc <> ''
    THEN Proc_ShortDesc
    ELSE UCB_ProcCodeDesc
    END AS Proc_ShortDesc
    ,DxCode
    ,Dx_ShortDesc
    ,ph.FirstName
    ,ph.LastName
    ,tblTrans.RealAuthNum
    ,tblTrans.ScheduleType
    ,tblTrans.MemberPlanCode
    ,tblPlanCode.Description
    ,tblTrans.FirstName AS MemberFirst
    ,tblTrans.LastName AS MemberLast
    ,tblTrans.ReferralStatus
    ,tblTrans.ReferralSubStatus
    ,tblTrans.Appt_Date
    ,tblTrans.Appt_Time
    ,tblTrans.SourceType
    ,tblTrans.CallerZipCode
    ,tblTrans.MemberZip
    ,CASE
    WHEN (
    (
    POSID IS NULL
    AND POS = '~Not Listed'
    )
    OR OrderingFacilityID = - 1
    OR tbltrans.PhysicianID = 5302
    )
    THEN 1
    ELSE 0
    END AS TranHasTempData
    ,(
    SELECT COUNT(*)
    FROM HHMaster.dbo.ListValue WITH (NOLOCK)
    INNER JOIN HHMaster.dbo.PayerData WITH (NOLOCK) ON List_Flag = PayD_PayerCode
    WHERE PayD_PayerCode = PAYOR
    AND List_Type = 'ACD_SPLIT'
    AND List_NumVal IN (
    1
    ,2
    )
    AND (
    PayD_Label = 'USE_RADXXXXXXX'
    AND PayD_Data = '1'
    )
    ) AS IsDotNETPayor
    ,TL_UserID
    ,TL_TimeStamp
    ,CASE
    WHEN IsNull(TL_UserID, - 1) = - 1
    THEN 0
    ELSE 1
    END AS IsLocked
    ,CASE
    WHEN COALESCE(UM_TAT_ClosedDate, As_DateClosed, tblTrans.DateClosed) IS NULL
    THEN 0
    ELSE 1
    END AS IsTATClosed
    ,CurrentDateTime = GETDATE()
    ,MinutesForCase = DATEDIFF(MINUTE, CONVERT(DATETIME, (TranDate + ' ' + TranTime)), GETDATE())
    ,StandardToStat_DateFormatted = FORMAT(tat_StandardToStatDate, 'MM/dd/yyyy h:mm:00 tt')
    ,tat_ruleId
    ,tat_clockStartDate = FORMAT(tat_clockStartDate, 'MM/dd/yyyy h:mm:00 tt')
    ,tat_clockDueDate = FORMAT(tat_clockDueDate, 'MM/dd/yyyy h:mm:00 tt')
    ,tatRef_triggerType
    ,(
    SELECT TOP 1 AltMemberID
    FROM #TempAltMemberIDs
    WHERE AltMemberID = 'KG30116R'
    ) AS AltMemberID
    FROM #tmpTrans tmp
    INNER JOIN CRIS.dbo.tblTrans tblTrans WITH (NOLOCK) ON tmp.TranID = tblTrans.TranID
    INNER JOIN CRIS.dbo.tblTransTAT tblTat WITH (NOLOCK) ON tblTrans.tranId = tblTat.tat_tranId
    LEFT JOIN HHMaster.dbo.ProcedureCode pr(NOLOCK) ON pr.Proc_Code = tblTrans.ProcCode
    LEFT JOIN HHMaster.dbo.DiagnosisCode dx(NOLOCK) ON dx.Dx_Code = tblTrans.DxCode
    LEFT JOIN HHWorking.dbo.tblPhysician ph(NOLOCK) ON ph.PhysicianID = tblTrans.PhysicianID
    LEFT JOIN Consult.dbo.Assessment(NOLOCK) ON as_ReferralTransactionID = tblTrans.TranID
    LEFT JOIN Contract.dbo.tblPlanCode(NOLOCK) tblPlanCode ON tblTrans.MemberPlanCode = tblPlanCode.PlanCode
    LEFT JOIN HHMaster.dbo.Payer(NOLOCK) ON tblTrans.Payor

    Likes to play Chess

  • Without execution plans, I'm guessing here.

    Changes in statistics caused a change in execution plans. Changes in data caused a change in behavior (as data changes, row counts change, which causes execution plan changes, and yeah, it can be very sudden and out of the blue, cross a threshold and you get a scan instead of a seek, a hash join instead of a merge, who knows). Those are my best bets with no other information.

    The waits & locks suggest that the problem is probably at the point where you're loading the temp tables (so painting it all with NOLOCK may not be helping). So that could be the issue, but it's probably the SELECT slowing down because of the reasons I suggested above.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Thank you very much for the valuable feedback. So does my suggestion to add more data files to TempDb sound reasonable then?

    Likes to play Chess

  • How many files do you have now?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Ya know... the very least you could do is run that code through a decent formatter to get proper indentation the help with readability.

    While I'm sure it's not the whole of your problem, the code uses several calls to the FORMAT() function and that's about 43 times slower for every instance than convert.  Anything that slows this code down is going to do nothing but allow and extend periods of blocking.  Remember that WITH(NOLOCK) only aids in THIS code not being blocked.  It doesn't prevent this code from blocking,

    You don't just happen to see a non-correlated sub-query that does a COUNT aggregation, do you?  I don't know if that will cause a problem (possible Cartesian Product?) in this case so YOU have look at the execution plan plan and find out.

    I'm sure there are other issues but there are so many places with non aliased table names and column names (and you SHOULD fix that at the very least) that the best we'd be doing is making other to SWAGs.  If the company isn't going to allow you to post the execution plan, then I recommend you hire someone that will be allowed to see it.  Especially since you can't post the execution plan, this is well beyond what you should expect to solve for free on a forum.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • VoldemarG wrote:

    Thank you very much for the valuable feedback. So does my suggestion to add more data files to TempDb sound reasonable then?

    To me... no.  Even if you do that and it seems to fix things, all you're doing is kicking the can on down the road.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    VoldemarG wrote:

    Thank you very much for the valuable feedback. So does my suggestion to add more data files to TempDb sound reasonable then?

    To me... no.  Even if you do that and it seems to fix things, all you're doing is kicking the can on down the road.

    To be fair, if they're sitting on a single file right now, adding some (assuming more than one processor), will absolutely help. However, yep. Code. Fix it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Create all temp tables separately from loading them.  Bonus: you could then also properly cluster the temp tables before loading them.

    The direct load of table with SELECT ... INTO #temp ... causes lots of blocking (esp. in tempdb, since so many loads go on there at the same time).

    Here's the changes needed for the first temp table.  Change all temp table creates/loads to match this pattern.

    Of course it's likely other tasks using SELECT ... INTO ... to load tables that are blocking this on, but you need to start making corrections somewhere.  But correct other tasks loading tables that way too.  That is, create the table separately first, add a clus index if needed, and only then finally load the table itself.  If you do the actual load with SELECT ... INTO ..., some (all?) metadata locks are held for the entire load of the table, which causes certain metadata blocking for other tasks.

    DROP TABLE IF EXISTS #TempAltMemberIDs

    /* create the table WITHOUT loading it */
    SELECT TOP (0) --<<--
    m.Mem_HHMemberRefID
    ,md.MemD_Data AS AltMemberID
    INTO #TempAltMemberIDs
    FROM HHWorking.dbo.Member m WITH (NOLOCK)
    INNER JOIN HHWorking.dbo.MemberData md(NOLOCK) ON m.Mem_HHMemberRefID = md.MemD_HHMemberRefID
    WHERE m.Mem_PayorCode = 'abcd'
    AND md.MemD_Label = 'Axxzd'
    AND md.MemD_Data = 'K777C11'

    /* BEFORE loading, create a clus index that matches the lookups done in the code on that table */
    CREATE CLUSTERED INDEX CL ON #TempAltMemberIDs ( AltMemberID ) WITH ( FILLFACTOR = 100 );

    /* finally, fully load the table itself */
    INSERT INTO #TempAltMemberIDs --<<--
    SELECT
    m.Mem_HHMemberRefID
    ,md.MemD_Data AS AltMemberID
    FROM HHWorking.dbo.Member m WITH (NOLOCK)
    INNER JOIN HHWorking.dbo.MemberData md(NOLOCK) ON m.Mem_HHMemberRefID = md.MemD_HHMemberRefID
    WHERE m.Mem_PayorCode = 'abcd'
    AND md.MemD_Label = 'Axxzd'
    AND md.MemD_Data = 'K777C11'

     

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • We have 12 files for tempdb.

    I am also thinking to reduce MAXDOP for that query only from 8 (server wide setting) to 4 or 2, too. Considering that LATCH contention of Nested Transaction Full may be related to excessive parallelism?

     

    Likes to play Chess

  • and how many cpu/cores/numa nodes does that machine have?

  • I agree that a "Best Practice" to prevent recompiles would be to move the creation of the temp tables to the very beginning but that's not likely to make a huge dent in the performance issue that's actually causing the blocking.  Someone at the company that actually knows how to read and interpret the execution plan needs to get involved and the rewrite parts of the code and maybe even apply a little bit of highly appropriate Divide'n'Conquer methodology.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • physical cpu = 24, logical 48.

    Likes to play Chess

  • I can read plans, to an extent. I just dont know how to attach them here :).

    code rewrite is the very last resort (asp pages,....re-testing.. deployment/release...--noone wants do all that because of 1 query only).

    Likes to play Chess

  • Reducing MaxDop helped, actually. elapsed time reduced from seconds to milliseconds now.

    Pre-creating temp tables did not change metrics but i understand what you all said regarding that it is the right way to go. thank you.

    Likes to play Chess

  • VoldemarG wrote:

    Reducing MaxDop helped, actually. elapsed time reduced from seconds to milliseconds now. Pre-creating temp tables did not change metrics but i understand what you all said regarding that it is the right way to go. thank you.

    Heh... are you sure it wasn't because of the fact that changing MaxDop forced a recompile?

    I'm glad you didn't take the additional Temp Table file path.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply