Blog Post

Bulk Insert – Insights That Will Make You Drool

,

Bulk insert can be an extremely helpful tool to help ingest data into SQL Server efficiently. Sometimes, it is necessary to capture metrics in regards to the bulk insert in order to understand the who, when, and potentially why related to the insert process.

Many moons ago, I worked with a client issue in regards to performance issues stemming around linked servers. From that work, I shared an article about monitoring linked servers.

Also stemming from that work was a clear pattern that there were some issues also relating to bulk inserts. This experience helped me to generate a means to monitor and trap useful data while troubleshooting. From this monitoring solution, we were able to isolate specific processes for the client to help improve performance.

In this article, I will share the solution that I created through the power of Extended Events.

Bulk Insert Monitoring

When you are looking to trace for bulk inserts via Extended Events (XEvents), it is crucial to remember that there are multiple events. In addition, there are multiple means to perform a bulk insert. I show three methods to perform a bulk insert in my demo script.

The bulk insert methods I will explore are BCP, BULK INSERT, and OPENROWSET(BULK). Each has advantages and disadvantages, none of which I will explore in this article. Suffice it to say that each has slightly different uses and it is worth testing them to see which works best for your needs!

Bulk Events

Extended Events gets more and more powerful with each update to SQL Server. Sometimes this is demonstrated through the addition of certain events. Other times, it is measured through the improvement of existing events. With the release of SQL Server 2017, we saw the improvement as it relates to bulk insert operations. SQL Server 2017 added two more events.

As it now stands, the events in SQL Server that I would use for bulk insert operations consists of the following 7 events (SQL Server 2017 and 2019).

  1. databases_bulk_copy_rows
  2. databases_bulk_copy_throughput
  3. bulk_insert_usage
  4. openrowset_bulk_usage
  5. databases_bulk_insert_rows
  6. databases_bulk_insert_throughput
  7. bulk_operation_page_check_failed

If you happen to be in SQL Server 2016, then remove items 3 and 4 from the preceding list.

Bulk Setup and Demo

The XEvent script is fairly large, so in the sake of brevity, you can just download it from here. Once the session is installed and running, we are ready to try some easy tests. I will just share some snippets of my demo script used to test this session.

First, let’s take a look at the bcp method to bulk insert (or bulk copy) data. The command is fairly straightforward.

EXECUTE master..xp_cmdshell 'BCP AdventureWorks2019.Sales.SalesOrderDetail OUT C:DatabaseDemoBCPOutDemo.txt -S .MyServer -T -c'
EXECUTE master..xp_cmdshell 'BCP AdventureWorks2019.Sales.SalesOrderDetail format nul -c -f C:DatabaseDemoBCPOutDemo.fmt -S .MyServer -T'

These two preceding commands allow us to both create an export file and secondly to create a format file. The format file is not entirely necessary until the last insert to be done. Once the data is setup via the preceding script, we can proceed to attempting a bcp insert.

As a part of performing that bcp insert, I want to make sure I have the proper table structure in place.

Bulk Insert Demo 1

IF NOT EXISTS (SELECT * FROM sys.databases d
WHERE d.name = 'BCPDemo')
BEGIN
CREATE DATABASE BCPDemo;
END;
GO
USE [BCPDemo]
GO
IF SCHEMA_ID('Demo') IS NULL
BEGIN
    EXECUTE sp_executesql N'CREATE SCHEMA [Demo] AUTHORIZATION [dbo]';
END;
GO
IF EXISTS (SELECT * FROM sys.objects o
WHERE o.name = 'SalesOrderDetail'
AND SCHEMA_NAME(o.schema_id) = 'Demo')
BEGIN
DROP TABLE Demo.SalesOrderDetail
END
CREATE TABLE [Demo].[SalesOrderDetail]
(
[SalesOrderID] [INT] NOT NULL
  , [SalesOrderDetailID] [INT] IDENTITY(1, 1) NOT NULL
  , [CarrierTrackingNumber] [NVARCHAR](25) NULL
  , [OrderQty] [SMALLINT] NOT NULL
  , [ProductID] [INT] NOT NULL
  , [SpecialOfferID] [INT] NOT NULL
  , [UnitPrice] [MONEY] NOT NULL
  , [UnitPriceDiscount] [MONEY] NOT NULL
  , [LineTotal] AS (ISNULL(([UnitPrice] * ((1.0) - [UnitPriceDiscount])) * [OrderQty], (0.0)))
  , [rowguid] [UNIQUEIDENTIFIER] ROWGUIDCOL NOT NULL
  , [ModifiedDate] [DATETIME] NOT NULL
  , CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]
PRIMARY KEY CLUSTERED (
[SalesOrderID] ASC
  , [SalesOrderDetailID] ASC
)
);
GO
EXECUTE master..xp_cmdshell 'BCP BCPDemo.Demo.SalesOrderDetail IN C:DatabaseDemoBCPOutDemo.txt -S .MyServer -T -c'
GO

So far so good. There is nothing super secret sauce about this example yet. Suffice it to say the demo succeeds and data is inserted as desired.

Bulk Insert Demo 2

I am jumping straight to the demo script. The point of these demos is really just to show data in the XEvent session anyway, so let’s not delay.

BULK INSERT BCPDemo.[Demo].[SalesOrderDetail]
   FROM 'C:DatabaseDemoBCPOutDemo.txt';
GO

In my opinion, this is by far the easiest of the three and it works so well.

Bulk Insert Demo 3

This demo is slightly more complicated. To make this demo work, we must use the format file that was created from the first script. There are some other nuances as well such as accounting for the identity field and the computed field. Let’s take a look.

IF EXISTS (SELECT * FROM sys.objects o
WHERE o.name = 'SalesOrderDetail'
AND SCHEMA_NAME(o.schema_id) = 'Demo')
BEGIN
DROP TABLE Demo.SalesOrderDetail
END
CREATE TABLE [Demo].[SalesOrderDetail]
(
[SalesOrderID] [INT] NOT NULL
  , [SalesOrderDetailID] [INT] IDENTITY(1, 1) NOT NULL
  , [CarrierTrackingNumber] [NVARCHAR](25) NULL
  , [OrderQty] [SMALLINT] NOT NULL
  , [ProductID] [INT] NOT NULL
  , [SpecialOfferID] [INT] NOT NULL
  , [UnitPrice] [MONEY] NOT NULL
  , [UnitPriceDiscount] [MONEY] NOT NULL
  , [LineTotal] AS (ISNULL(([UnitPrice] * ((1.0) - [UnitPriceDiscount])) * [OrderQty], (0.0)))
  , [rowguid] [UNIQUEIDENTIFIER] ROWGUIDCOL NOT NULL
  , [ModifiedDate] [DATETIME] NOT NULL
  , CONSTRAINT [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID]
PRIMARY KEY CLUSTERED (
[SalesOrderID] ASC
  , [SalesOrderDetailID] ASC
)
);
GO
SET IDENTITY_INSERT Demo.SalesOrderDetail ON
GO
INSERT INTO [Demo].[SalesOrderDetail]
(SalesOrderID
  , SalesOrderDetailID --identity column
  , [CarrierTrackingNumber]
  , [OrderQty]
  , [ProductID]
  , [SpecialOfferID]
  , [UnitPrice]
  , [UnitPriceDiscount]
  --, [LineTotal] --computed column
  , [rowguid]
  , [ModifiedDate]
)
SELECTSalesOrderID
  , SalesOrderDetailID
  , [CarrierTrackingNumber]
  , [OrderQty]
  , [ProductID]
  , [SpecialOfferID]
  , [UnitPrice]
  , [UnitPriceDiscount]
  --, [LineTotal]
  , [rowguid]
  , [ModifiedDate]
FROM
OPENROWSET(BULK N'C:DatabaseDemoBCPOutDemo.txt', FORMATFILE = 'C:DatabaseDemoBCPOutDemo.fmt')
AS Document;
GO
SET IDENTITY_INSERT Demo.SalesOrderDetail OFF
GO

Look closely at the script. I have added notes to the specific problem fields. In addition, note that I am dropping the table and recreating it for the purposes of the demo.

Reviewing the Bulk XE Session

This is where the real juicy stuff is. Now that three successful bulk inserts have been performed, let’s look at the session and see what was captured.

bulk copy

Shown here, you can see I have two of the events highlighted. Both of these are related to the  bcp type of bulk inserts (or bulk copies). Consequently, these events were triggered by the first demo when the bcp was performed. Take note that the throughput event highlighted in red becomes very noisy. For a mere 57 events triggered for the rows copy (highlighted in green), you get 3844 events triggered for throughput.

I will not dive into the throughput, but there is some useful data to be gleaned from it. Thus, I have included it in the XEvent session.

Bulk Insert Review

Next up is to review data from the second demo, which is the bulk insert demo. There are a few more events lumped together in this review over the previous one. Let’s take a gander.

Much the same as the previous section, I have highlighted different events of interest. In the red section, we see the insert throughput event. This time, the associated throughput event is much less noisy. It only produced 56 events (which happen to be for both openrowset and regular bulk insert).

Highlighted in blue is the rpc_starting event. Only one event was triggered across all demos for this event and it is related to the traditional bulk insert. In purple, we see the bulk_insert_usage event. Once again this event is triggered by the traditional bulk insert. This event provides some additional info around data source, codepage and format file.

Lastly, for the second demo, you will see the bulk_insert_rows event that triggered a fair number of times. Mind you, this event triggers for all types of bulk insert and that means it also triggered for the openrowset demo, which is to be discussed next.

Bulk openrowset Review

This section of the XEvent session data is very similar to the previous session. Both perform bulk inserts and both trigger some of the similar events. That said, this section should be fairly quick and easy.

I am only going to discuss the two highlighted events. First, we have the red section which illustrates the openrowset_bulk_usage event. This event is very similar in nature to the bulk_insert_usage event – except that it is just for the openrowset flavor of bulk inserts.

In the green section, highlighted above, you will see the insert rows event. As was discussed in the preceding section, this event is also fired for the openrowset inserts. We can distinguish the difference via the sql_text action that was attached to the event payload. Look closely, and you should recognize it immediately. Speaking of actions…

Loads of Actions

Wait, the results displayed in the previous section didn’t show all of the actions that were attached to each of the events. What happened to all of those actions? Truth be told, I simply did not want to show them.

The reason behind that decision is quite simple. Some of the actions attached to the payload revealed user names. And some of the actions require a lengthier discussion (e.g. the frame or callstack related actions). All of the attached actions do return results in this session. Your homework is to go and discover more from this session and all of the actions included. Enjoy!

Put a Bow on It

In this article, I have demonstrated a useful and powerful means to capture data for various bulk operations (insert and copy). The session is available for download via the provided script link. While, I did not include the entire demo script, there is adequate there to hopefully reproduce a demo for your lab environment.

For more uses of Extended Events, I recommend my series of articles designed to help you learn XE little by little.

Interested in seeing the power of XE over Profiler? Check this one out!

This is eleventh post in the 2021 “12 Days of Christmas” series. All articles in the series (for all years) is available via this page.

The post Bulk Insert – Insights That Will Make You Drool first appeared on SQL RNNR.

Original post (opens in new tab)
View comments in original post (opens in new tab)

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating