Parallel insert in a table

  • Hi All,

    Is there a way to perform parallel inserts in a table ? I have any huge table in one database and I want a big chunk of its data in my database for some processing on weekly basis. Because the data is large, it takes lots of time to copy data. If we can do data loads in parallel, it will be faster. As per my understanding one INSERT statement will lock only the inserting records and not the entire table.

  • sqlenthu 89358 (8/14/2016)


    Hi All,

    Is there a way to perform parallel inserts in a table ? I have any huge table in one database and I want a big chunk of its data in my database for some processing on weekly basis. Because the data is large, it takes lots of time to copy data. If we can do data loads in parallel, it will be faster. As per my understanding one INSERT statement will lock only the inserting records and not the entire table.

    Unless you can guarantee that the parts of the table you want to load into are on separate physical spindles each having their own separate read/write head, doing a "parallel load" will only slow the process down. You just can't change physics.

    You're best bet would to be to setup for minimal logging and go serial.

    If the data is on a separate box than what you want to copy to, BCP the data to a couple of files, bulk load it into your local table using minimal logging.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Part of table into different spindles means partitioned tables ?

  • sqlenthu 89358 (8/15/2016)


    Part of table into different spindles means partitioned tables ?

    That would be necessary but you can partition the heck out of a table using multiple files and yet they may all live on just one spindle or "spindle set" (as is the case with RAID drives).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff, is there any article or link which gives insight of the parallel inserts in a table ? I am curious to test and implement it.

  • Can we achieve it using service broker ? sp_executesql works in sequential order only.

  • There are plenty of methods for supposed "parallel loads" but none of those methods change physics. If the table exists only on one spindle, there's only one read/write head. That's the bottleneck and the more you cause it to move, the less time it has to write.

    Again, even with table partitioning, if you can't guarantee that you're writing to different parts of the table and those different parts are on separate physical spindles, you're just going to slow the process down. The best thing to do is to use whatever "minimal logging" techniques you can find because that actually writes less data to log files.

    Here's a link on how someone loaded 1TB in an hour a couple of years ago.

    https://blogs.msdn.microsoft.com/sqlcat/2006/05/19/load-1tb-in-less-than-1-hour/

    I also provide a direct quote from that article...

    The files were on the same SAN as the database, but on different drive arrays. There were 12 files on a single drive array and it took 5 separate disk arrays to handle all 60 input files. The database has two filegroups and 96 files. The table used for inserts was LINEITEM which resided on its own filegroup and is placed on the outer tracks of the disk.

    It would have been much simpler for them to run 5 inputs rather than 60 and they probably would have done better because of limited head moving instead of thrashing to different spots on the disk to service different files.

    Shifting gears, how much data do YOU need to load and how often?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I'd suggest to start from a basic "table partitioning" which was available even back in SQL2000 ages.

    Make sure the clustered index and non-clustered index(es) are on different physical drives.

    And none of those physical drives holds the log file.

    By "different physical drives" I mean different arrays, different controllers.

    _____________
    Code for TallyGenerator

  • Hi,

    I was going through the concept of service broker but little confused. Can I make insert into table using following code:

    DECLARE @SBDialog uniqueidentifier

    DECLARE @Message NVARCHAR(128)

    BEGIN DIALOG CONVERSATION @SBDialog

    FROM SERVICE SBSendService

    TO SERVICE 'SBReceiveService'

    ON CONTRACT SBContract

    WITH ENCRYPTION = OFF

    -- Send messages on Dialog

    SET @Message = N'Insert into table1(col1, col2), select col1, col2 from table2 where param = 1234';

    SEND ON CONVERSATION @SBDialog

    MESSAGE TYPE SBMessage (@Message)

  • sqlenthu 89358 (8/21/2016)


    sp_executesql works in sequential order only.

    I wonder - where does it come from?

    _____________
    Code for TallyGenerator

  • Sergiy (8/23/2016)


    sqlenthu 89358 (8/21/2016)


    sp_executesql works in sequential order only.

    I wonder - where does it come from?

    Does it work parallel as well ? I mean if I write it as follows:

    sp_executesql <some proc1>

    sp_executesql <some proc2>

    then will it just fire proc1 first and then before it's completion fire proc2 ? No, I believe.

  • Sergiy (8/23/2016)


    sqlenthu 89358 (8/21/2016)


    sp_executesql works in sequential order only.

    I wonder - where does it come from?

    Does it work parallel as well ? I mean if I write it as follows:

    sp_executesql <some proc1>

    sp_executesql <some proc2>

    then will it just fire proc1 first and then before it's completion fire proc2 ? No, I believe.

  • sqlenthu 89358 (8/24/2016)


    Sergiy (8/23/2016)


    sqlenthu 89358 (8/21/2016)


    sp_executesql works in sequential order only.

    I wonder - where does it come from?

    Does it work parallel as well ? I mean if I write it as follows:

    sp_executesql <some proc1>

    sp_executesql <some proc2>

    then will it just fire proc1 first and then before it's completion fire proc2 ? No, I believe.

    It has nothing to do with sp_executesql.

    In your script statements are executed sequentially one after another.

    Any kind of statements.

    And I believe it's true for any kind of script, not only T-SQL.

    If you want Proc1 and Proc2 to be executed at the same time launch them from independent command shells:

    - different SSMS windows;

    - different command prompt sessions;

    - different SQL Agent jobs;

    - different SSIS tasks;

    - etc.

    But - you probably do not need any of those options.

    SQL Server has its own parallelism.

    It will perform your single query in several parallel threads if it decides that performance will benefit from parallel execution.

    _____________
    Code for TallyGenerator

  • sqlenthu 89358 (8/23/2016)


    Hi,

    I was going through the concept of service broker but little confused. Can I make insert into table using following code:

    DECLARE @SBDialog uniqueidentifier

    DECLARE @Message NVARCHAR(128)

    BEGIN DIALOG CONVERSATION @SBDialog

    FROM SERVICE SBSendService

    TO SERVICE 'SBReceiveService'

    ON CONTRACT SBContract

    WITH ENCRYPTION = OFF

    -- Send messages on Dialog

    SET @Message = N'Insert into table1(col1, col2), select col1, col2 from table2 where param = 1234';

    SEND ON CONVERSATION @SBDialog

    MESSAGE TYPE SBMessage (@Message)

    Does this makes sense ?

  • sqlenthu 89358 (8/24/2016)


    sqlenthu 89358 (8/23/2016)


    Hi,

    I was going through the concept of service broker but little confused. Can I make insert into table using following code:

    DECLARE @SBDialog uniqueidentifier

    DECLARE @Message NVARCHAR(128)

    BEGIN DIALOG CONVERSATION @SBDialog

    FROM SERVICE SBSendService

    TO SERVICE 'SBReceiveService'

    ON CONTRACT SBContract

    WITH ENCRYPTION = OFF

    -- Send messages on Dialog

    SET @Message = N'Insert into table1(col1, col2), select col1, col2 from table2 where param = 1234';

    SEND ON CONVERSATION @SBDialog

    MESSAGE TYPE SBMessage (@Message)

    Does this makes sense ?

    I'm not sure if it would be any more parallel than simply running

    Insert into table1(col1, col2), select col1, col2 from table2 where param = 1234

    _____________
    Code for TallyGenerator

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply