Index Size Difference

  • Hi Experts,

    For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.

    Why and how the Non-clustered index grew?

    Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?

    Thanks in Advance

  • Let's understand some basics about what CLUSTERED and NONCLUSTERED indexes are:

    1.) When you create a CLUSTERED index, that index is actually the table itself, and the table is then kept in sorted order by the primary key.   As you cannot "INCLUDE" columns in a CLUSTERED index because such an index ALWAYS includes ALL columns, the size of such an index is always the size of the table.
    2.) When the key of a CLUSTERED index is changed, the table increases in size to accommodate changes in the leaf-level values that need to be kept.   Increasing the number of fields in the key is a way to increase the size, while removing a column from the key would decrease the size.
    3.) When you create a NONCLUSTERED index, it's leaf-levels have to include the leaf-levels of any existing CLUSTERED index, so an increase in the number of columns composing the key of the CLUSTERED index means that the NONCLUSTERED indexes will increase in size to accommodate the larger leaf-level values in the CLUSTERED index.
    4.) Similarly to number 3, if the key of the CLUSTERED index has a field removed, it's leaf-levels will be smaller, and concomitantly, the NONCLUSTERED indexes' leaf-level values will be reduced in size.

    I'm not sure of the more arcane details of those processes, and I may have some of this incorrect, but I'm pretty sure that if I have this wrong, someone will be swift to correct my misunderstanding.

    Steve (aka sgmunson) πŸ™‚ πŸ™‚ πŸ™‚
    Rent Servers for Income (picks and shovels strategy)

  • Rebuild your nonclustered index and report the size back.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Are you doing any inserts or merges?

  • VastSQL - Tuesday, March 7, 2017 6:29 AM

    Hi Experts,

    For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.

    Why and how the Non-clustered index grew?

    Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?

    Thanks in Advance

    If you show your script, others will be better equipped to answer your questions. It could be something as simple as the choice of metric for index size.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • sgmunson - Tuesday, March 7, 2017 7:38 AM

    Let's understand some basics about what CLUSTERED and NONCLUSTERED indexes are:

    1.) When you create a CLUSTERED index, that index is actually the table itself, and the table is then kept in sorted order by the primary key.   As you cannot "INCLUDE" columns in a CLUSTERED index because such an index ALWAYS includes ALL columns, the size of such an index is always the size of the table.
    2.) When the key of a CLUSTERED index is changed, the table increases in size to accommodate changes in the leaf-level values that need to be kept.   Increasing the number of fields in the key is a way to increase the size, while removing a column from the key would decrease the size.
    3.) When you create a NONCLUSTERED index, it's leaf-levels have to include the leaf-levels of any existing CLUSTERED index, so an increase in the number of columns composing the key of the CLUSTERED index means that the NONCLUSTERED indexes will increase in size to accommodate the larger leaf-level values in the CLUSTERED index.
    4.) Similarly to number 3, if the key of the CLUSTERED index has a field removed, it's leaf-levels will be smaller, and concomitantly, the NONCLUSTERED indexes' leaf-level values will be reduced in size.

    I'm not sure of the more arcane details of those processes, and I may have some of this incorrect, but I'm pretty sure that if I have this wrong, someone will be swift to correct my misunderstanding.

    I agree with you and that is why  was surprised when  saw growth in NC index Size when reduced the clustered index columns.

  • TheSQLGuru - Tuesday, March 7, 2017 7:52 AM

    Rebuild your nonclustered index and report the size back.

    Hi Kevin,

    I tried rebuild NC index ,drop and recreate NC index but that didnt make any change in NC index value.

  • ChrisM@Work - Tuesday, March 7, 2017 8:36 AM

    VastSQL - Tuesday, March 7, 2017 6:29 AM

    Hi Experts,

    For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.

    Why and how the Non-clustered index grew?

    Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?

    Thanks in Advance

    If you show your script, others will be better equipped to answer your questions. It could be something as simple as the choice of metric for index size.

    Here go the scripts

    CREATE TABLE [dbo].[Technologies](
        [TechnologyId] [int] IDENTITY(1,1) NOT NULL,
        [Technology] [nvarchar](max) NOT NULL,
        [Active] [bit] NULL CONSTRAINT [DF_Technologies_Active] DEFAULT ((1)),
        [CreatedOn] [datetime] NULL CONSTRAINT [DF_Technologies_CreatedOn] DEFAULT (getdate()),
        [ModifiedOn] [datetime] NULL,
        [CreatedUser] [bigint] NOT NULL,
        [ModifiedUser] [bigint] NULL
    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    GO

    CREATE CLUSTERED INDEX [ClusteredIndex-20170307-165653] ON [dbo].[Technologies]
    (
        [Active] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
    GO

    CREATE NONCLUSTERED INDEX [NonClusteredIndex-20170307-170122] ON [dbo].[Technologies]
    (
        [Active] ASC,
        [TechnologyId] ASC,
        [CreatedOn] ASC,
        [ModifiedOn] ASC,
        [CreatedUser] ASC,
        [ModifiedUser] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
    GO

    Initially the Clustered Index had same columns as that of NC index.

  • You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.

    To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.

    As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey - Tuesday, March 7, 2017 10:57 PM

    You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.

    To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.

    As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).

    Thanks Grant.

    As I mentioned earlier this was created as a test and you are right the bit column is not at all a candidate for clustered index.  I have made the TechnologyID as only key column and this time the clustered index size reduced and NC index remained the same. Cannot agree with you on the uniquefier adding space since  the clustered index was not created as unique so it had uniquefier before and after the change.

  • VastSQL - Wednesday, March 8, 2017 11:23 PM

    Grant Fritchey - Tuesday, March 7, 2017 10:57 PM

    You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.

    To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.

    As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).

    Thanks Grant.

    As I mentioned earlier this was created as a test and you are right the bit column is not at all a candidate for clustered index.  I have made the TechnologyID as only key column and this time the clustered index size reduced and NC index remained the same. Cannot agree with you on the uniquefier adding space since  the clustered index was not created as unique so it had uniquefier before and after the change.

    You sure about that? How many collisions would there be in the old 5 column key versus how many with a bit column? A few, a very few values with a uniequifier added to break the ties, versus, all of them.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • However, I'm a little hung on why this is so vitally important? What's the end goal? What are we trying to do? Is it just about learning index internals?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • To understand internals and find why that increase in size.

    Thanks all for the help . Learned something .

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply