August 15, 2013 at 2:01 am
Note, I am not talking about a clustered index against a non-cluster index, I mean a clustered index comparing to no index at all for inserting performance.
I saw lots of links as below said that, a clustered index has better performance against no index even for insert operation, due to IAM/PFS/bitmap/..., but with my testing, seems no index is faster than cluster index, what's the problem?
http://support.microsoft.com/kb/297861
my test scripts:
-------------------------------------------
--prepare table with clustered index
CREATE TABLE [dbo].[BigTable_CI](
[id] [int] IDENTITY(1,1) NOT NULL,
[BigChar] [char](4100) NOT NULL
)
GO
CREATE CLUSTERED INDEX CIX_BigTable_CI
ON BigTable_CI(id)
GO
ALTER TABLE [dbo].[BigTable_CI]
ADD CONSTRAINT [DF_BigTable_BigChar_CI] DEFAULT ('a') FOR [BigChar]
GO
CREATE PROCEDURE [dbo].[AddDataToBigTable_CI]
(@NumberOfRows bigint)
AS
SET NOCOUNT ON;
DECLARE @Counter int = 0;
DECLARE @Start datetime = GETDATE();
DECLARE @End datetime;
DECLARE @ElapsedTime int = 0;
DECLARE @RowsPerSecond int = 0;
WHILE (@Counter < @NumberOfRows)
BEGIN
INSERT INTO dbo.BigTable_CI DEFAULT VALUES;
SELECT @Counter += 1;
END;
-- Calculate elapsed time and rows/second
SET @End = GETDATE();
SET @ElapsedTime = CONVERT(INTEGER, DATEDIFF (second, @Start, @End));
SET @RowsPerSecond = @NumberOfRows/@ElapsedTime;
-- Record results in local table
INSERT INTO dbo.Results
(StartTime, EndTime, ElapsedTime, NumberOfRows, RowsPerSecond)
VALUES (@Start, @End, @ElapsedTime, @NumberOfRows, @RowsPerSecond);
RETURN;
-------------------------------------------
--prepare table without any index at all.
CREATE TABLE [dbo].[BigTable_NI](
[id] [int] IDENTITY(1,1) NOT NULL,
[BigChar] [char](4100) NOT NULL
)
GO
ALTER TABLE [dbo].[BigTable_NI]
ADD CONSTRAINT [DF_BigTable_BigChar_NI] DEFAULT ('a') FOR [BigChar]
GO
CREATE PROCEDURE [dbo].[AddDataToBigTable_NI]
(@NumberOfRows bigint)
AS
SET NOCOUNT ON;
DECLARE @Counter int = 0;
DECLARE @Start datetime = GETDATE();
DECLARE @End datetime;
DECLARE @ElapsedTime int = 0;
DECLARE @RowsPerSecond int = 0;
WHILE (@Counter < @NumberOfRows)
BEGIN
INSERT INTO dbo.BigTable_NI DEFAULT VALUES;
SELECT @Counter += 1;
END;
-- Calculate elapsed time and rows/second
SET @End = GETDATE();
SET @ElapsedTime = CONVERT(INTEGER, DATEDIFF (second, @Start, @End));
SET @RowsPerSecond = @NumberOfRows/@ElapsedTime;
-- Record results in local table
INSERT INTO dbo.Results
(StartTime, EndTime, ElapsedTime, NumberOfRows, RowsPerSecond)
VALUES (@Start, @End, @ElapsedTime, @NumberOfRows, @RowsPerSecond);
RETURN;
-------------------------------------------
--prepare the results table
create table dbo.Results
(
StartTime datetime,
EndTime datetime,
ElapsedTime int,
NumberOfRows int,
RowsPerSecond int
)
-------------------------------------------
--run scripts:
exec [dbo].[AddDataToBigTable_NI] 1000000
exec [dbo].[AddDataToBigTable_CI] 1000000
August 15, 2013 at 6:46 am
Think about it: there is going to be more overhead inserting into a table with indexes Vs one with no indexes.
The probability of survival is inversely proportional to the angle of arrival.
August 15, 2013 at 7:43 am
This is a classic, it depends problem. Depending on how you structure the indexes and how your data is being loaded, yes, clustered indexes actually speed up insert operations. But, depending on how you structure the indexes and how your data is being loaded, clustered indexes actually slow down insert operations. There was a study published a while back, and I can't find the link right now I looked, that showed just how much better clustered inserts were over non-clustered. But, it was dependent on how that cluster was defined. Non-unique clustered indexes are going to be worse. Random or wildly distributed clusters, such as with a GUID, are going to be worse. But if the data is relatively well structured such that the loads are following a pattern that will not require massive page splits and rearrangements, clustered is going to radically outperform a heap. Brad discusses a lot of the internals details of why these things work this way here[/url].
I did another search. I think this article from Tibor is the one I was thinking of, but maybe not. Anyway, hopefully it adds to the discussion.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
August 15, 2013 at 8:16 am
If the heap and the clustered index both start out completely empty with no data page allocations the heap inserts will be faster. Beyond that, as Grant stated, there are any number of variables that can obfuscate any performance measurements.
There are so many variables to consider before deciding on an indexing scheme for doing lots of inserts. For my tables that are batch loaded at intervals I drop all indexes, truncate the tables, bulk insert then create the indexes (clustered first). For transaction tables you will have other considerations.
best recommendations if best performance is critical run some test with several methodologies and choose what works best for your data and concurrent environment.
The probability of survival is inversely proportional to the angle of arrival.
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy