Insert Performance

  • Hi,

    Which will perform faster/better when inserting a million rows into a table: running each INSERT on its own, or putting all the inserts in a single transaction and then committing?

    Does it make a considerable difference?

    Example

    insert into ...

    insert into ...

    insert into ...

    insert into ...

    insert into ...

    or

    begin transaction

    insert into ...

    insert into ...

    insert into ...

    insert into ...

    insert into ...

    commit transaction;

    Thanks!

  • A single INSERT INTO with UNION ALL SELECTs will be waaaay faster.
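
    Something like this, just as a sketch (the table and values are made up):

    -- hypothetical target table and rows, for illustration only
    INSERT INTO dbo.Target (id, name)
    SELECT 1, 'first'  UNION ALL
    SELECT 2, 'second' UNION ALL
    SELECT 3, 'third';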

    _____________
    Code for TallyGenerator

  • Sergiy (3/12/2013)


    A single INSERT INTO with UNION ALL SELECTs will be waaaay faster.

    UNION ALL a million rows? I don't think so!

    What is the source? Where is the data coming from?

    BTW, a million separate inserts in one transaction will be faster than without a transaction, if you have enough space in your transaction log to cover it without growing.
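
    If you go that way, check the log space first and pre-grow the log file so the load doesn't trigger autogrow. A rough sketch (the database and logical file names are made up):

    -- check current log usage per database
    DBCC SQLPERF (LOGSPACE);

    -- pre-grow the log so the big transaction doesn't have to autogrow it
    ALTER DATABASE MyDb
    MODIFY FILE (NAME = MyDb_log, SIZE = 4GB);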

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • Actually, building such a query for a million rows will produce about a GB of script.

    It will take ages just to parse it.

    For million-row uploads there are the BULK INSERT and BCP commands.

    Everything else is so bad that there is no point even discussing which is better.
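
    For example (the file path, table, and options are made up):

    -- load a million rows from a flat file in large, minimally logged batches
    BULK INSERT dbo.Target
    FROM 'C:\load\target.dat'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);

    -- the same load via the bcp utility from the command line:
    -- bcp MyDb.dbo.Target in C:\load\target.dat -c -t, -T -S MyServer -b 100000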

    _____________
    Code for TallyGenerator

  • Sergiy (3/13/2013)


    Actually, building such a query for a million rows will produce about a GB of script.

    It will take ages just to parse it.

    For million-row uploads there are the BULK INSERT and BCP commands.

    Everything else is so bad that there is no point even discussing which is better.

    That is why I am asking the OP what the source is.

    It may be that BULK INSERT, BCP, or SSIS should be used, or, if the source is just another table in the same database, then a simple INSERT INTO ... SELECT can be used (or SELECT INTO ...).
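
    For the same-database case, something like this (the table names are made up):

    -- set-based copy into an existing table
    INSERT INTO dbo.Target (id, name)
    SELECT id, name
    FROM dbo.Source;

    -- or create the target table on the fly
    SELECT id, name
    INTO dbo.Target2
    FROM dbo.Source;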

    There are not enough details provided to make any meaningful suggestion...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help

  • From how it looks, it's about scripting DB deployment as part of an installation package.

    That's how common tools build those scripts.

    Not sure what kind of million-row table could be part of such a package though...

    But anyway, saving the data in files and scripting a BULK INSERT upload would be the most effective approach.

    _____________
    Code for TallyGenerator

  • A million individual insert statements is a horrible approach. Period. Doing it within a single transaction will be painful because you're going to have to have a huge transaction log to support it. If the log has to grow while you're running the transaction, it will slow things down even more. You need to find a more batch-oriented approach. If you're pulling data from another location, see if you can pass it as a table variable or XML and then write a single insert statement that moves all one million rows (see the sketch below). That's the better approach.

    But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.
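
    Just as a rough sketch of that set-based route, using a table-valued parameter (the type, procedure, and table names are made up):

    -- a made-up table type matching the target table
    CREATE TYPE dbo.TargetRows AS TABLE (id int, name varchar(50));
    GO
    -- the client fills @rows and the server does one set-based insert
    CREATE PROCEDURE dbo.LoadTarget @rows dbo.TargetRows READONLY
    AS
    BEGIN
        INSERT INTO dbo.Target (id, name)
        SELECT id, name FROM @rows;
    END;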

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I agree with Grant.

  • Yeah, hard to argue.

    Except, probably, for this one:

    Grant Fritchey (3/13/2013)


    But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.

    Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.

    And still, a single load always outperforms a set of partial ones (in my experience).

    _____________
    Code for TallyGenerator

  • Sergiy (3/13/2013)


    Yeah, hard to argue.

    Except, probably, for this one:

    Grant Fritchey (3/13/2013)


    But, if you have to do individual inserts (and I'd be curious why), then small sets of transactions, say 10,000 at a time, would probably be better than all one million or one at a time.

    Upload via an ODBC driver is "one row at a time" because of the nature of ODBC drivers.

    And still, a single load always outperforms a set of partial ones (in my experience).

    I guess it really depends on how you do the test. I haven't done precise measurements on these for several years. But a few years ago, I did do a series of tests, not from ODBC, but through ADO.NET, and we saw serious performance degradation when using single inserts stacked. But I can't supply the numbers because it was from a little while back.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (3/13/2013)


    But a few years ago, I did do a series of tests, not from ODBC, but through ADO.NET, and we saw serious performance degradation when using single inserts stacked. But I can't supply the numbers because it was from a little while back.

    Were those single inserts wrapped into a single transaction?

    Because I was talking about comparing a single transaction for 1mil inserts and 100 transactions for 10k inserts each.

    Multiple inserts within a single transaction is what's happening when you load data from, say, a linked server or OPENROWSET.
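
    For reference, the 10k-per-batch variant I'm comparing against looks something like this (the staging and target tables are made up):

    -- commit in 10,000-row batches instead of one huge transaction
    DECLARE @done bit = 0;

    WHILE @done = 0
    BEGIN
        BEGIN TRANSACTION;

        -- move up to 10,000 rows per transaction out of a staging table
        DELETE TOP (10000) FROM dbo.Staging
        OUTPUT deleted.id, deleted.name INTO dbo.Target (id, name);

        IF @@ROWCOUNT < 10000 SET @done = 1;

        COMMIT TRANSACTION;
    END;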

    _____________
    Code for TallyGenerator

  • Thanks for the insights.

    I don't really have an implementation for this. My boss just asked me that sort of question, and I was wondering if some of you might have real-life experience.

  • Sergiy (3/13/2013)


    Grant Fritchey (3/13/2013)


    But a few years ago, I did do a series of tests, not from ODBC, but through ADO.NET, and we saw serious performance degradation when using single inserts stacked. But I can't supply the numbers because it was from a little while back.

    Were those single inserts wrapped into a single transaction?

    Because I was talking about comparing a single transaction for 1mil inserts and 100 transactions for 10k inserts each.

    Multiple inserts within a single transaction is what's happening when you load data from, say, a linked server or OPENROWSET.

    We did a series of tests going from single transactions with single inserts, to single transactions with multiple single inserts, to single transactions with batches of inserts. The performance order was the opposite of what I just listed: batch was best, followed by groups of inserts within a transaction, followed by single-insert transactions.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The log buffer flushes from the individual inserts without a wrapping begin/commit tran will absolutely crush your throughput.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
