Bulk Insert batch error

  • Hello Team,

    I am getting following error when I use bulk insert command.

    Msg 4864, Level 16, State 1, Line 1

    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 477, column 1 (Examid).

    Command used:

    BULK INSERT FileImportDB..Exam FROM

    'F:\70-431 Material\Practice Files\Chapter 10\ExamImportFile.txt' --WITH (TABLOCK )

    with (FIRSTROW = 1

    ,DATAFILETYPE ='char'

    ,ROWS_PER_BATCH = 5

    )

    There are only 500 rows in the file. Amazingly one one row, where the error occurred, is not inserted. According to BOL the batch should be rolled back and later batches should not be executed.

    BOL: "Each batch of rows is inserted as a separate transaction. If, for any reason, the bulk-import operation terminates before completion, only the current transaction is rolled back. For instance, if a data file has 1000 rows, and a batch size of 100 is used, Microsoft SQL Server logs the operation as 10 separate transactions; each transaction inserts 100 rows into the destination table. If the bulk-import operation terminates while loading row 750, only the previous 49 rows are removed as SQL Server rolls back the current transaction. The destination table still contains the first 700 rows.

    "

    -MJ
    Please do not print mails and docx unless it is absolutely necessary. Spread environmental awareness.

  • Michael,

    ROWS_PER_BATCH is not the same as BATCHSIZE (check out BOL), and is only used for optimizing, not for transactional purposes. That may explain the behaviour you experience.

  • According to my understanding ROWS_PER_BATCH is same as BATCHSIZE.

    The sentence in my previous post is from BOL only and when this describes 100 as batch size then it is same as rows per batch

    -MJ
    Please do not print mails and docx unless it is absolutely necessary. Spread environmental awareness.

  • BOL 2000 says:

    Using ROWS_PER_BATCH

    If the -b switch or BATCHSIZE clause is not used, the entire file is sent to SQL Server and the bulk copy operation is treated as a single transaction. In this case, the ROWS_PER_BATCH hint or ROWS_PER_BATCH clause can be used to give an estimate of the number of rows. SQL Server optimizes the load automatically, according to the batch size value, which may result in better performance.

    BOL 2005 says

    ROWS_PER_BATCH = rows_per_batch

    Indicates the approximate number of rows of data in the data file.

    By default, all the data in the data file is sent to the server as a single transaction, and the number of rows in the batch is unknown to the query optimizer. If you specify ROWS_PER_BATCH (with a value > 0) the server uses this value to optimize the bulk-import operation. The value specified for ROWS_PER_BATCH should approximately the same as the actual number of rows.

  • This is what I don't understand:

    "ROWS_PER_BATCH should approximately the same as the actual number of rows"

    It means that all the rows would be in the same batch. And we are good if we give a hint of batch size rather than row size.

    The word 'ROWS_PER_BATCH' gives an impression that these many rows will be sent to sql server per batch instead of sending all the rows at the same time. But the sentence contradicts this:

    ""ROWS_PER_BATCH should approximately the same as the actual number of rows"

    -MJ
    Please do not print mails and docx unless it is absolutely necessary. Spread environmental awareness.

  • It may look somewhat confusing. Check out the explanation in BOL 2005 on 'Managing Batches for Bulk Import'. ROWS_PER_BATCH is a performance parameter, not a parameter that manages the batch size. BATCHSIZE takes precedence when both are specified.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply