Audit trail of merging duplicates (long)

  • Using SQL Server 7.0 (SP3) on NT 4.0 (SP6)

    I've seen great code here and elsewhere showing and deleting duplicates, but none that

    include an audit trail. I had this working, then many things changed, and

    it doesn't work now.

    We receive error-packed data in text files. I upload them to SQL Server. A child might visit one doctor one month and another the next month, providing different address, etc., depending on whether dad or grandma brought the child in.

    I am trying to

    1. archive duplicate patient records based on certain, not all, fields.

    2. archive via

    a. an entry in a parent (ONE) table for each group of archives

    (patient_ID of record kept, date & time archived, user, method [since

    users will be de-duplicating based on their assessment of whether the

    records represent the same entity])

    b. an entry in a child (MANY) table for each address record archived (with the details in case we later determine the archiving was an error

    3. Ideally, the code would add info from the archived records to the kept record when the kept record has blank or null fields (sometimes one record is missing one field and another record has that, but is missing something else).

    /* 1. add to MANY table (pat_combos)

    2. create temp MANY table

    3. append temp MANY table records to permanent MANY table (pat_combo_items)

    4. reset linked table (shotrecs)

    5. delete duplicates

    6. mark MANY table records as done

    so they won't be checked again for RESET and DELETE steps

    7. drop temp ONE table (temp_pat_combo_items) (move this to last step when it's all working) */

    insert into pat_combos(kept_id, fname, lname, dob, address_id, query_used, merged_at, username)

    select min(patient_id), fname, lname, dob, address_id, query_used='dedup_query',merged_at=getdate(), username='rwmacfad'

    from patients

    group by fname, lname, dob, address_id

    having count(*)>1

    go

    select a.*, combo_id=(select t.combo_id from pat_combos t where t.delmark<>'T' and t.lname=a.lname and t.fname=a.fname and t.dob=a.dob and t.address_id=a.address_id)

    into temp_pat_combo_items

    from patients a join pat_combos t2

    on t2.lname=a.lname and t2.fname=a.fname and t2.dob=a.dob and t2.address_id=a.address_id

    where a.patient_id not in (select kept_id from pat_combos where delmark <> 'T')

    go

    insert into pat_combo_items(combo_id, patient_id, fname, mname1, lname, DOB, sex, address_id, hphone, wphone, mphone, simult_birth, pat_type, rec_type, sysEditBy, sysEditDate, entered_by, rec_creation_date, birth_country, birth_state, birth_mother_id, birth_father_id, birth_cert, death_cert, guardian_id, had_chickenpox, opt_out, Medicaid_ID, no_live, no_Hep, hep_reason, notes)

    select combo_id, patient_id, fname, mname1, lname, DOB, sex, address_id, hphone, wphone, mphone, simult_birth, pat_type, rec_type, sysEditBy, sysEditDate, entered_by, rec_creation_date, birth_country, birth_state, birth_mother_id, birth_father_id, birth_cert, death_cert, guardian_id, had_chickenpox, opt_out, Medicaid_ID, no_live, no_Hep, hep_reason, notes

    from temp_pat_combo_items

    go

    update shotrecs

    set shotrecs.patient_id = c.kept_id

    from pat_combo_items i, pat_combos c, shotrecs s

    where c.delmark <>'T' and c.combo_id=i.combo_id and s.patient_id = i.patient_id

    go

    delete patients

    from patients p, pat_combos c, pat_combo_items i

    where c.delmark <>'T' and c.combo_id = i.combo_id and p.patient_id=i.patient_id

    go

    /* mark pat_combos as done so they're not re-checked next time */

    update pat_combos

    set delmark='T'

    where delmark <>'T'

    go

    drop table temp_pat_combo_items

    /* any other advice, such as removing the GOs, is welcome */


    Standalone Server, SQL 7.0 (SP3), NT 4 (SP6), Access 2000, MDAC 2.6

  • I forgot to mention the error I was receiving:

    "Cannot insert the value NULL into column 'combo_id', table 'pat_combos'; column does not allow nulls. INSERT fails."

    I thought I was inserting the most recent combo_id as established in the nested SELECT clause in the 2nd block of code. That is the part that I am messing up, or vice versa.


    Standalone Server, SQL 7.0 (SP3), NT 4 (SP6), Access 2000, MDAC 2.6

Viewing 2 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply