some rows go missing once in a while when querying for a batch

  • Hello,

    Not sure if this should be posted here or maybe the administration part... but i will start here i guess...

    so we have a process that gets top 200, but gets first the last Max record from the last batch, example:

    batch 1 the max record ID that it had last is 19, so the new batch 2, will get batch ids 20-29, then go back and say, cool, max batch from batch 2 is 29, now with batch 3, get 30-39 etc. etc. (hope this makes sense)... the problem we are seeing is...

    when batch 2, which is suppose to be 20-29, its maybe missing id 24... but the other ID's 20-29 are there, and only id 24 is missing, nothing special about it, so when i query it using the same store procedure, the 24 finally shows up... when it should have showed up with the batch 2...

    ok so now some architecture...

    This is on Always On SQL server 2014 Sp3, CU4 enterprise, 4 nodes, and 2 (primary and 1 secondary) are sync, the other 2 that go to other data centers are Async.

    the other thing is, the store procedure is querying 2 seperate databases and joining them... also to make things more complicated... a trigger from database A, upon insert, creates the record in database B that the store procedure looks at both.

    the process is currently pointing to Primary, it was originally on the secondary... but for the sake of elimination, we decided to change to primary...

    we raised suspicion that maybe row 24 (using example) is still being insert/update etc.? but also maybe looking into snapshot isolation... but worried about temp tables/space. any one encountered something like this?

     

    thanks in advance

  • Without seeing your data or table structures, my best guess is one of the following:

    1 - your data is unordered - that is you are not putting in an "ORDER BY" with your TOP statement thus the data coming back can be in ANY order

    2 - you are using NOLOCK and something has modified the page you are reading

    But I am sure there could be other reasons, those are just the first two that come to my mind.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Hey Brian, big thanks for responding, and yes, also i forgot to mention, we are doing an Order by, and yes, we did just remove that Nolock, good catch on that, we did see an improvement, a waaay bigger improvement which you definitely identified thank you, but now, just missing 1 or 2 every maybe... 5-7 hours... which is great, but not sadly perfect... so still looking around... we have the compatibility level set for 2014... but not sure what else to check or do... any thoughts?

  • This all seems like it''s "just" a process timing issue and you haven't figured out the actual cause, yet.  Is the process the creates the row actually finishing on time (for example)?  You might want to create a log that traces what each step towards creating the "records" is along with when they entered a step and when they completed it to help.

    Otherwise, If the batches must have sequential rows in them, shouldn't you just wait until they do?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If this is a "timing issue", then snapshot isolation is not going to solve the issue.  And, in a synchronous commit availability mode, the data is committed to the secondary first.

    Here's my question, using your example.  If your second batch leaves out record #24, then why doesn't it pick it up with a subsequent batch?

    How are you generating the record ID's?  Are they an identity or a sequence?  If so, there may never be a record id #24 in the table.

    If a trigger on table A is inserting a record in table B using data from table A, and your query joins both of them, can the proc be re-written to just use table A?

    Is it possible to post the code?  And, let me guess.  The trigger contains a cursor?

     

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Hey Michael good point, which now putting it all together, i think I might see a problem here... and the keyword is trigger, which from table 1 on database A, when an insert happens (uniqueness is by 3 columns, example would be customer, order and date), gets inserted into table 1 on database B, which table 1 database B is the table that the query is based on, AND has identity sequence... and no cursor lol 😉

    but now wondering your opinions/experience, since there is 2 databases, that are on same AG, that would definitely or possibly make one of the rows go missing either due to in process of commit or maybe still update/insert?

  • Siten0308 wrote:

    Hey Michael good point, which now putting it all together, i think I might see a problem here... and the keyword is trigger, which from table 1 on database A, when an insert happens (uniqueness is by 3 columns, example would be customer, order and date), gets inserted into table 1 on database B, which table 1 database B is the table that the query is based on, AND has identity sequence... and no cursor lol 😉

    but now wondering your opinions/experience, since there is 2 databases, that are on same AG, that would definitely or possibly make one of the rows go missing either due to in process of commit or maybe still update/insert?

    Is the row missing or just "late"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply