Stairway to Columnstore Indexes Level 5: Adding New Data To Columnstore Indexes

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    Comments posted to this topic are about the item Stairway to Columnstore Indexes Level 5: Adding New Data To Columnstore Indexes


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • TheComedian

    SSCrazy

    Points: 2928

    "if you have between 100,000 and 120,000 new rows per day, a single daily load would result in either lots of rows in the delta store, or lots of small rowgroups"

    Good tip/remark. I thought of the same thing when I first read up on Delta Store and the min/max values for row groups.

  • curious_sqldba

    SSC-Dedicated

    Points: 36283

    Excellent article @Hugo. THANKYOU.

    Question : You menationed "The standard T-SQL INSERT ... SELECT statement will also use the bulk load method if the total number of rows inserted is large enough, as shown by the results of listing 5-2. However, the T-SQL MERGE statement will always use the trickle insert method, regardless of the number of rows to be inserted.". Do you know how many records should it insert to use bulk load method? I couldn't find any documentation related to this. Is this new in 2016?

  • Hugo Kornelis

    SSC Guru

    Points: 64645

    curious_sqldba - Friday, July 6, 2018 10:50 AM

    Excellent article @Hugo. THANKYOU.

    Question : You menationed "The standard T-SQL INSERT ... SELECT statement will also use the bulk load method if the total number of rows inserted is large enough, as shown by the results of listing 5-2. However, the T-SQL MERGE statement will always use the trickle insert method, regardless of the number of rows to be inserted.". Do you know how many records should it insert to use bulk load method? I couldn't find any documentation related to this. Is this new in 2016?

    Thanks for the kind words! And thanks for your question, as it gives me the opportunity to clarify what I clearly had not described clear enough.

    Two methods of inserting data exist:
    * Trickle load: Add rows to open delta store rowgroup, using row mode storage. (Will eventually be compressed into columnstore storage when the rowgroup is full, e.g. has 1,048,576 rows)
    * Bulk load: Added in a new rowgroup that is immediately compressed into columnstore storage after the insert finishes.

    Bulk load is used when the following conditions apply:
    1: Insert has at least 102,400 rows; AND
    2: Insert is done by either any of the tools that use the Bulk Insert API (as listed in the article), or done by an INSERT INTO ... SELECT ... statement.

    Trickle insert is used when the following conditions apply:
    1. Insert has 101,399 rows; OR
    2: Insert is done through a tool that does not suport Bulk Insert API; OR
    3: Rows are added by a MERGE instead of an INSERT statemment (regardless of number of rows); OR
    4: Rows are added as new due to an UPDATE statement (see http://www.sqlservercentral.com/articles/Stairway+Series/136174/), also regardless of number of rows.

    So the short answer to your question is: "102,400" :satisfied:


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • curious_sqldba

    SSC-Dedicated

    Points: 36283

    Hugo Kornelis - Friday, July 6, 2018 12:51 PM

    Hugo Kornelis - Friday, July 6, 2018 12:51 PM

    curious_sqldba - Friday, July 6, 2018 10:50 AM

    Excellent article @Hugo. THANKYOU.

    Question : You menationed "The standard T-SQL INSERT ... SELECT statement will also use the bulk load method if the total number of rows inserted is large enough, as shown by the results of listing 5-2. However, the T-SQL MERGE statement will always use the trickle insert method, regardless of the number of rows to be inserted.". Do you know how many records should it insert to use bulk load method? I couldn't find any documentation related to this. Is this new in 2016?

    Thanks for the kind words! And thanks for your question, as it gives me the opportunity to clarify what I clearly had not described clear enough.

    Two methods of inserting data exist:
    * Trickle load: Add rows to open delta store rowgroup, using row mode storage. (Will eventually be compressed into columnstore storage when the rowgroup is full, e.g. has 1,048,576 rows)
    * Bulk load: Added in a new rowgroup that is immediately compressed into columnstore storage after the insert finishes.

    Bulk load is used when the following conditions apply:
    1: Insert has at least 102,400 rows; AND
    2: Insert is done by either any of the tools that use the Bulk Insert API (as listed in the article), or done by an INSERT INTO ... SELECT ... statement.

    Trickle insert is used when the following conditions apply:
    1. Insert has 101,399 rows; OR
    2: Insert is done through a tool that does not suport Bulk Insert API; OR
    3: Rows are added by a MERGE instead of an INSERT statemment (regardless of number of rows); OR
    4: Rows are added as new due to an UPDATE statement (see http://www.sqlservercentral.com/articles/Stairway+Series/136174/), also regardless of number of rows.

    So the short answer to your question is: "102,400" :satisfied:


    Thanks for the kind words! And thanks for your question, as it gives me the opportunity to clarify what I clearly had not described clear enough.

    Two methods of inserting data exist:
    * Trickle load: Add rows to open delta store rowgroup, using row mode storage. (Will eventually be compressed into columnstore storage when the rowgroup is full, e.g. has 1,048,576 rows)
    * Bulk load: Added in a new rowgroup that is immediately compressed into columnstore storage after the insert finishes.

    Bulk load is used when the following conditions apply:
    1: Insert has at least 102,400 rows; AND
    2: Insert is done by either any of the tools that use the Bulk Insert API (as listed in the article), or done by an INSERT INTO ... SELECT ... statement.

    Trickle insert is used when the following conditions apply:
    1. Insert has 101,399 rows; OR
    2: Insert is done through a tool that does not suport Bulk Insert API; OR
    3: Rows are added by a MERGE instead of an INSERT statemment (regardless of number of rows); OR
    4: Rows are added as new due to an UPDATE statement (see http://www.sqlservercentral.com/articles/Stairway+Series/136174/), also regardless of number of rows.

    So the short answer to your question is: "102,400" :satisfied:

    Appreciate for detail explanation. I have noticed when i load data using SSIS fast data load option it is way faster than my t-sql. Below is my sample t-sql.  I would like to execute below t-sql just like how would execute through SSIS.  I am not sure how can i confirm if below sql is actually inserting data using bulk load method. TIA

    INSERT INTO Archive.dbo.PT ( col1 ,
                col2 ,
                col3 )
        SELECT TOP 1000000 a.col1 ,
               a.col2 ,
               a.col3
        FROM Edb.dbo.PT a
           JOIN #T1 b ON a.col1 = b.col1
               AND a.col2 = b.col2;

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply