Better insert performance on a table only with a clustered index or a table without any index on SqlServer 2008?

  • Note, I am not talking about a clustered index against a non-cluster index, I mean a clustered index comparing to no index at all for inserting performance.

    I saw lots of links as below said that, a clustered index has better performance against no index even for insert operation, due to IAM/PFS/bitmap/..., but with my testing, seems no index is faster than cluster index, what's the problem?

    http://stackoverflow.com/questions/7264820/removing-a-primary-key-clustered-index-to-increase-insert-performance

    http://support.microsoft.com/kb/297861

    my test scripts:

    -------------------------------------------

    --prepare table with clustered index

    CREATE TABLE [dbo].[BigTable_CI](

    [id] [int] IDENTITY(1,1) NOT NULL,

    [BigChar] [char](4100) NOT NULL

    )

    GO

    CREATE CLUSTERED INDEX CIX_BigTable_CI

    ON BigTable_CI(id)

    GO

    ALTER TABLE [dbo].[BigTable_CI]

    ADD CONSTRAINT [DF_BigTable_BigChar_CI] DEFAULT ('a') FOR [BigChar]

    GO

    CREATE PROCEDURE [dbo].[AddDataToBigTable_CI]

    (@NumberOfRows bigint)

    AS

    SET NOCOUNT ON;

    DECLARE @Counter int = 0;

    DECLARE @Start datetime = GETDATE();

    DECLARE @End datetime;

    DECLARE @ElapsedTime int = 0;

    DECLARE @RowsPerSecond int = 0;

    WHILE (@Counter < @NumberOfRows)

    BEGIN

    INSERT INTO dbo.BigTable_CI DEFAULT VALUES;

    SELECT @Counter += 1;

    END;

    -- Calculate elapsed time and rows/second

    SET @End = GETDATE();

    SET @ElapsedTime = CONVERT(INTEGER, DATEDIFF (second, @Start, @End));

    SET @RowsPerSecond = @NumberOfRows/@ElapsedTime;

    -- Record results in local table

    INSERT INTO dbo.Results

    (StartTime, EndTime, ElapsedTime, NumberOfRows, RowsPerSecond)

    VALUES (@Start, @End, @ElapsedTime, @NumberOfRows, @RowsPerSecond);

    RETURN;

    -------------------------------------------

    --prepare table without any index at all.

    CREATE TABLE [dbo].[BigTable_NI](

    [id] [int] IDENTITY(1,1) NOT NULL,

    [BigChar] [char](4100) NOT NULL

    )

    GO

    ALTER TABLE [dbo].[BigTable_NI]

    ADD CONSTRAINT [DF_BigTable_BigChar_NI] DEFAULT ('a') FOR [BigChar]

    GO

    CREATE PROCEDURE [dbo].[AddDataToBigTable_NI]

    (@NumberOfRows bigint)

    AS

    SET NOCOUNT ON;

    DECLARE @Counter int = 0;

    DECLARE @Start datetime = GETDATE();

    DECLARE @End datetime;

    DECLARE @ElapsedTime int = 0;

    DECLARE @RowsPerSecond int = 0;

    WHILE (@Counter < @NumberOfRows)

    BEGIN

    INSERT INTO dbo.BigTable_NI DEFAULT VALUES;

    SELECT @Counter += 1;

    END;

    -- Calculate elapsed time and rows/second

    SET @End = GETDATE();

    SET @ElapsedTime = CONVERT(INTEGER, DATEDIFF (second, @Start, @End));

    SET @RowsPerSecond = @NumberOfRows/@ElapsedTime;

    -- Record results in local table

    INSERT INTO dbo.Results

    (StartTime, EndTime, ElapsedTime, NumberOfRows, RowsPerSecond)

    VALUES (@Start, @End, @ElapsedTime, @NumberOfRows, @RowsPerSecond);

    RETURN;

    -------------------------------------------

    --prepare the results table

    create table dbo.Results

    (

    StartTime datetime,

    EndTime datetime,

    ElapsedTime int,

    NumberOfRows int,

    RowsPerSecond int

    )

    -------------------------------------------

    --run scripts:

    exec [dbo].[AddDataToBigTable_NI] 1000000

    exec [dbo].[AddDataToBigTable_CI] 1000000

  • Think about it: there is going to be more overhead inserting into a table with indexes Vs one with no indexes.

    The probability of survival is inversely proportional to the angle of arrival.

  • This is a classic, it depends problem. Depending on how you structure the indexes and how your data is being loaded, yes, clustered indexes actually speed up insert operations. But, depending on how you structure the indexes and how your data is being loaded, clustered indexes actually slow down insert operations. There was a study published a while back, and I can't find the link right now I looked, that showed just how much better clustered inserts were over non-clustered. But, it was dependent on how that cluster was defined. Non-unique clustered indexes are going to be worse. Random or wildly distributed clusters, such as with a GUID, are going to be worse. But if the data is relatively well structured such that the loads are following a pattern that will not require massive page splits and rearrangements, clustered is going to radically outperform a heap. Brad discusses a lot of the internals details of why these things work this way here[/url].

    I did another search. I think this article from Tibor is the one I was thinking of, but maybe not. Anyway, hopefully it adds to the discussion.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • If the heap and the clustered index both start out completely empty with no data page allocations the heap inserts will be faster. Beyond that, as Grant stated, there are any number of variables that can obfuscate any performance measurements.

    There are so many variables to consider before deciding on an indexing scheme for doing lots of inserts. For my tables that are batch loaded at intervals I drop all indexes, truncate the tables, bulk insert then create the indexes (clustered first). For transaction tables you will have other considerations.

    best recommendations if best performance is critical run some test with several methodologies and choose what works best for your data and concurrent environment.

    The probability of survival is inversely proportional to the angle of arrival.

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply