Index Size Difference

Question

Index Size Difference

VastSQL

SSC-Dedicated

Points: 36069
More actions
March 7, 2017 at 6:29 am

#401745

Hi Experts,
For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.
Why and how the Non-clustered index grew?
Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?
Thanks in Advance

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

sgmunson SSC Guru Points: 110639 More actions · Answer 1

Let's understand some basics about what CLUSTERED and NONCLUSTERED indexes are:

1.) When you create a CLUSTERED index, that index is actually the table itself, and the table is then kept in sorted order by the primary key. As you cannot "INCLUDE" columns in a CLUSTERED index because such an index ALWAYS includes ALL columns, the size of such an index is always the size of the table.
2.) When the key of a CLUSTERED index is changed, the table increases in size to accommodate changes in the leaf-level values that need to be kept. Increasing the number of fields in the key is a way to increase the size, while removing a column from the key would decrease the size.
3.) When you create a NONCLUSTERED index, it's leaf-levels have to include the leaf-levels of any existing CLUSTERED index, so an increase in the number of columns composing the key of the CLUSTERED index means that the NONCLUSTERED indexes will increase in size to accommodate the larger leaf-level values in the CLUSTERED index.
4.) Similarly to number 3, if the key of the CLUSTERED index has a field removed, it's leaf-levels will be smaller, and concomitantly, the NONCLUSTERED indexes' leaf-level values will be reduced in size.

I'm not sure of the more arcane details of those processes, and I may have some of this incorrect, but I'm pretty sure that if I have this wrong, someone will be swift to correct my misunderstanding.

Steve (aka sgmunson) 🙂 🙂 🙂
Rent Servers for Income (picks and shovels strategy)

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

Rebuild your nonclustered index and report the size back.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

nsadams87xx Old Hand Points: 386 More actions · Answer 3

nsadams87xx

Old Hand

Points: 386

March 7, 2017 at 8:36 am

#1932056

Are you doing any inserts or merges?

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

VastSQL - Tuesday, March 7, 2017 6:29 AM
Hi Experts,
For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.
Why and how the Non-clustered index grew?
Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?
Thanks in Advance

If you show your script, others will be better equipped to answer your questions. It could be something as simple as the choice of metric for index size.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

VastSQL SSC-Dedicated Points: 36069 More actions · Answer 5

sgmunson - Tuesday, March 7, 2017 7:38 AM
Let's understand some basics about what CLUSTERED and NONCLUSTERED indexes are:
1.) When you create a CLUSTERED index, that index is actually the table itself, and the table is then kept in sorted order by the primary key. As you cannot "INCLUDE" columns in a CLUSTERED index because such an index ALWAYS includes ALL columns, the size of such an index is always the size of the table.
2.) When the key of a CLUSTERED index is changed, the table increases in size to accommodate changes in the leaf-level values that need to be kept. Increasing the number of fields in the key is a way to increase the size, while removing a column from the key would decrease the size.
3.) When you create a NONCLUSTERED index, it's leaf-levels have to include the leaf-levels of any existing CLUSTERED index, so an increase in the number of columns composing the key of the CLUSTERED index means that the NONCLUSTERED indexes will increase in size to accommodate the larger leaf-level values in the CLUSTERED index.
4.) Similarly to number 3, if the key of the CLUSTERED index has a field removed, it's leaf-levels will be smaller, and concomitantly, the NONCLUSTERED indexes' leaf-level values will be reduced in size.
I'm not sure of the more arcane details of those processes, and I may have some of this incorrect, but I'm pretty sure that if I have this wrong, someone will be swift to correct my misunderstanding.

I agree with you and that is why was surprised when saw growth in NC index Size when reduced the clustered index columns.

VastSQL SSC-Dedicated Points: 36069 More actions · Answer 6

TheSQLGuru - Tuesday, March 7, 2017 7:52 AM
Rebuild your nonclustered index and report the size back.

Hi Kevin,

I tried rebuild NC index ,drop and recreate NC index but that didnt make any change in NC index value.

VastSQL SSC-Dedicated Points: 36069 More actions · Answer 7

ChrisM@Work - Tuesday, March 7, 2017 8:36 AM
VastSQL - Tuesday, March 7, 2017 6:29 AM
Hi Experts,
For testing I have created a table with a clustered index selecting all columns and inserted some data . The size of clustered index was shown as 56.6MB. I created a non-clustered index again selecting all columns (ignoring varchar(max) column) and the size of NC index was shown as 4.3MB. I recreated the Clustered index by keeping only one value and the size of clustered index changed to 56MB but what surprised me is that when size of the Non-Clustered index grew from 4.3MB to 5.1MB.
Why and how the Non-clustered index grew?
Is it because earlier it was same as clustered index and now when the column is removed from Clustered Index the non-clustered index has to actually add the column to it?
Thanks in Advance
If you show your script, others will be better equipped to answer your questions. It could be something as simple as the choice of metric for index size.

Here go the scripts

CREATE TABLE [dbo].[Technologies](
    [TechnologyId] [int] IDENTITY(1,1) NOT NULL,
    [Technology] [nvarchar](max) NOT NULL,
    [Active] [bit] NULL CONSTRAINT [DF_Technologies_Active] DEFAULT ((1)),
    [CreatedOn] [datetime] NULL CONSTRAINT [DF_Technologies_CreatedOn] DEFAULT (getdate()),
    [ModifiedOn] [datetime] NULL,
    [CreatedUser] [bigint] NOT NULL,
    [ModifiedUser] [bigint] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

CREATE CLUSTERED INDEX [ClusteredIndex-20170307-165653] ON [dbo].[Technologies]
(
[Active] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

CREATE NONCLUSTERED INDEX [NonClusteredIndex-20170307-170122] ON [dbo].[Technologies]
(
    [Active] ASC,
    [TechnologyId] ASC,
    [CreatedOn] ASC,
    [ModifiedOn] ASC,
    [CreatedUser] ASC,
    [ModifiedUser] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO

Initially the Clustered Index had same columns as that of NC index.

Grant Fritchey SSC Guru Points: 398694 More actions · Answer 8

You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.

To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.

As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

VastSQL SSC-Dedicated Points: 36069 More actions · Answer 9

Grant Fritchey - Tuesday, March 7, 2017 10:57 PM
You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.
To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.
As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).

Thanks Grant.

As I mentioned earlier this was created as a test and you are right the bit column is not at all a candidate for clustered index. I have made the TechnologyID as only key column and this time the clustered index size reduced and NC index remained the same. Cannot agree with you on the uniquefier adding space since the clustered index was not created as unique so it had uniquefier before and after the change.

Grant Fritchey SSC Guru Points: 398694 More actions · Answer 10

VastSQL - Wednesday, March 8, 2017 11:23 PM
Grant Fritchey - Tuesday, March 7, 2017 10:57 PM
You put the clustered index on a bit column? Very odd choice since there are only two possible values there. It's not a very good index, clustered or nonclustered, and is going to be utterly unuseful to the optimizer.
To answer your question, what you're seeing is probably the need for the engine to create a uniquifier for that clustered index. Since the clustered index, as explained earlier in this thread, defines table storage, where each row goes, when you have a non-unique clustered index, additional columns are added internally to make the clustered index unique. Now, let's jump over to the non-clustered index for a moment. Each non-clustered index also has the clustered index keys inside of it. It's how the rows are identified from the non-clustered index since the non-clustered index doesn't store the data (unlike the clustered index). So what you're seeing is the non-clustered index is bigger than it was because it has to store the uniquifier as well as the keys that you've defined in your indexes. Without a doubt, that's what's making it bigger.
As to the rest, you really need to spend some time with fundamentals on how to construct tables, clustered indexes, nonclustered indexes, etc. because it really looks like you're going off into left field with these experiments (largely basing this on the clustered key of a bit column).
Thanks Grant.
As I mentioned earlier this was created as a test and you are right the bit column is not at all a candidate for clustered index. I have made the TechnologyID as only key column and this time the clustered index size reduced and NC index remained the same. Cannot agree with you on the uniquefier adding space since the clustered index was not created as unique so it had uniquefier before and after the change.

You sure about that? How many collisions would there be in the old 5 column key versus how many with a bit column? A few, a very few values with a uniequifier added to break the ties, versus, all of them.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 398694 More actions · Answer 11

However, I'm a little hung on why this is so vitally important? What's the end goal? What are we trying to do? Is it just about learning index internals?

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

VastSQL SSC-Dedicated Points: 36069 More actions · Answer 12

To understand internals and find why that increase in size.

Thanks all for the help . Learned something .