INSERTED DUPLICATE RECORDS

  • I'm running a vb program that will insert data to my database from a text file for report generation. It will insert if the record does not exist. It has a validation that if you try to upload two or more same records, the program will not allow it. It runs weekly after extraction of records from another system. The problem is, when the report was generated for the month of October, each record has its duplicate. I checked the text file where the record came from, but it does not have any single duplicates. I test again the vb program, but the validation is working and does not allowed the records to be uploaded to the database. What would be the probable cause of the duplication of the records?

  • A couple of questions:

    (1) How are you enforcing the "no duplicates" rule? In the database via a constraint (e.g. unique index) or in your VB application?

    (2) Are you joining the data inserted to another table in a query? Any chance you've got a duplicate in a joined table?

    Joe

  • I enforce the "no duplicates" via my vb program and it is working.

    It's not joined, the data are directly inserted in the table from a text file via my vb program.

  • Hi,

    why you dont use a DTS/SISS or BCP to insert the data from your text file to the database.

    http://www.databasejournal.com/features/mssql/article.php/3391761

    http://www.verio.com/support/documents/view_article.cfm?doc_id=3846

    http://www.sqlis.com/

    Regards,

    Ahmed

  • I can't, because the vb program is automatically running to lessen the effort, meaning, it is assigned as a job that will run every Monday after the extracted text file has been imported to its source folder instead of manually doing the DTS.

  • hi,

    you can also schedule the dts to run automatically each monday.

  • Who said anything about a "manual process"? Schedule it! BCP, a proc that uses Bulk Insert, and DTS can all be scheduled.

    Does your table have a valid, non-surrogate Primary Key? If not, you need to figure out how to make one.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • All true.

    To answer his question, though, if:

    A. Data is only transfered to the database via the program and

    B. Logic of the program is bullet proof and

    C. There are no duplicates in the database (Rules out an extra call to the program with the same file OR the same data appearing in a different file) and

    D. The report WAS showing duplicates

    Then:

    A. The design of the report is bad (Unlikely to see duplicates) or

    B. The query the report is based off of is bad (More likely. Double-check your joins.) or

    C. Design flaw in the reporting tool (Get another set of eyes to look things over, if possible. If it's reproducible, you may want to send a bug report and work around it.)

  • The vb program will split the strings and remove unnecessary values such as 0's, and converts strings to datetime.

  • There are no joins in my expressions, there is a unique index(ID).

  • I wasn't sure if assumption C was correct and that you had checked the database itself and there are no duplicates there but I went back and added a second possibility there if that wasn't the case. The program may only be looking for duplicates within the single file OR if you don't truncate/destroy the table before the import, the report may be including data from more than one batch, somehow.

  • The vb program will split the strings and remove unnecessary values such as 0's, and converts strings to datetime.

    Yep... you're right... database sure can't do any of that 😉 And, I'm sure that the VB program does it with much more speed... 😀

    I enforce the "no duplicates" via my vb program and it is working

    Ummm... so you checked the table for duplicates, eh? And the table has a Primary Key, or at least a unique index to double check for you, huh?

    Dave is correct... either you have duplicates or the report code is wrong. You need to check both.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The thing is all of the extracted files that are inserted into the database have no duplicates, except for the one file that is generated last Nov 5, where our database server was reported stopped responding.

  • Does your program check the database itself for rows that may have been inserted previously?

    If it "can't happen", I am guessing not.

    That being said, what triggers the VB code to be executed? It may help us to know.

    Is it constantly running and checking for the appearance of a file to trigger the import? If so, what causes the file to appear in the target folder? How often does the program check and when does the file get deleted or moved? Could it have been sent to you a second time? Does the program produce logs you can check? Could it happen that you have/had two instances of your VB program in memory? Does the program produce logs you can check? Oh right... asked that.

  • Yep... you're right... database sure can't do any of that And, I'm sure that the VB program does it with much more speed...

    Everyone does realize that my statement above was absolutely dripping with sarcasm, correct?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply