Think LSNs Are Unique? Think Again - Preventing Data Loss in CDC ETL

  • Comments posted to this topic are about the item Think LSNs Are Unique? Think Again - Preventing Data Loss in CDC ETL

  • I'm not sure a check for >= @start_lsn would work when there are multiple parallel inserts, each within its own session.

    E.g.

    • session 1 uses 0x001, is very slow, and inserts only its first 50 rows
    • session 2 uses 0x002 and inserts 100 rows
    • session 2 is done
    • the ETL job gets all data from the table (150 rows) and writes 0x002 to its marker table
    • session 1 inserts its second 50 rows (= 100 total)
    • session 1 is done
    • the second ETL job starts and filters for LSN >= 0x002
    • you would again miss the late 50 rows from session 1

    PS: unlikely for 50 rows, but imagine 500k or 5 million rows, or just a locking problem at the source that slows session 1 down.
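    The timeline above can be sketched as a small simulation (Python rather than T-SQL, purely to make the race reproducible; the LSN values and row counts are the ones from the example, and the watermark logic is an assumption about how the ETL described in the article behaves):

    ```python
    # Simulate the race: session 1 gets the lower LSN (0x001) but finishes
    # committing AFTER session 2 (0x002). An ETL that records the highest
    # LSN it has seen as its watermark, then filters >= watermark on the
    # next run, never picks up session 1's late rows.

    visible_rows = []  # rows already committed and visible to the ETL

    # session 2 commits all 100 of its rows under LSN 0x002
    visible_rows += [{"lsn": 0x002, "id": i} for i in range(100)]
    # session 1 is slow: so far only its first 50 rows under LSN 0x001
    visible_rows += [{"lsn": 0x001, "id": i} for i in range(50)]

    # first ETL run: reads everything, stores the max LSN as the watermark
    first_batch = list(visible_rows)                 # 150 rows
    watermark = max(r["lsn"] for r in first_batch)   # 0x002

    # session 1 now finishes its remaining 50 rows, still under LSN 0x001
    visible_rows += [{"lsn": 0x001, "id": i} for i in range(50, 100)]

    # second ETL run filters LSN >= watermark: only the 0x002 rows qualify,
    # so session 1's late 50 rows are silently lost
    second_batch = [r for r in visible_rows if r["lsn"] >= watermark]
    missed = [r for r in visible_rows
              if r["lsn"] == 0x001 and r["id"] >= 50]

    print(len(first_batch))   # 150
    print(len(second_batch))  # 100 (all with LSN 0x002)
    print(len(missed))        # 50 rows never extracted
    ```

    Note that filtering with a strict > watermark would not help either: it would also exclude the late 0x001 rows, since their LSN is below the watermark regardless of the comparison operator.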

    God is real, unless declared integer.
