Index Types and Primary Keys versus Space Used

Question

Index Types and Primary Keys versus Space Used

DB_Newbie2007

SSCarpal Tunnel

Points: 4446
More actions
February 20, 2009 at 11:03 am

#134931

I recently received a script from a developer where there were comments through the object creations with comments such as "A CLUSTERED INDEX ON A TABLE WITH JUST ONE RECORD WILL TAKE UP MORE SPACE AND IS NOT NECESSARY" and "TABLE IS NOT USED IN QUERIES, AND THE VALUES IN THE COLUMNS ARE NOT SEQUENTIAL" (maybe not for the developer or user, but needed for tech support!!!).
Anyway, I decided to create three examples to see what the impact on space was:
CREATE TABLE [dbo].[FilesImportedNonClust](
[FileName] [varchar](100) NOT NULL,
[ImportDate] [datetime] NULL,
[FileFormat] [varchar](10) NOT NULL,
CONSTRAINT [PK_FilesImportedNonClust] PRIMARY KEY NONCLUSTERED
(
[FileName] ASC,
[FileFormat] ASC
)WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY]
CREATE TABLE [dbo].[FilesImportedClust](
[FileName] [varchar](100) NOT NULL,
[ImportDate] [datetime] NULL,
[FileFormat] [varchar](10) NOT NULL,
CONSTRAINT [PK_FilesImportedNonClust] PRIMARY KEY CLUSTERED
(
[FileName] ASC,
[FileFormat] ASC
)WITH (IGNORE_DUP_KEY = ON) ON [PRIMARY]
CREATE TABLE [dbo].[FilesImportedNoKey](
[FileName] [varchar](100) NOT NULL,
[ImportDate] [datetime] NULL,
[FileFormat] [varchar](10) NOT NULL
) ON [PRIMARY]
I then put one record into each table:
INSERT INTO dbo.FilesImportedNoKey VALUES ('MyFile', GetDate(), 'MyFormat')
INSERT INTO dbo.FilesImportedNonClust VALUES ('MyFile', GetDate(), 'MyFormat')
INSERT INTO dbo.FilesImportedClust VALUES ('MyFile', GetDate(), 'MyFormat')
Then ran a modified version of the query from Eder Dias script, http://www.sqlservercentral.com/scripts/Administration/61183/:
Select
obj.name as TableName,
(
Select Sum(((used * 8192.00)/1024))
from
sys.sysindexes
where
id in (select objChild.id from sys.sysobjects objChild where objChild.name = obj.name)
and indid in(0,1) --> Only Table Size
) TableSizeKb,
(
Select
Sum(((used * 8192.00)/1024))
from
sys.sysindexes
where
id in (select objChild.id from sys.sysobjects objChild where objChild.name = obj.name)
and indid not in (255,0,1) --> Only indexes
) IndexSizeKb
From
sys.sysobjects obj
Where obj.type = 'U' --> only user tables
AND obj.name LIKE 'FilesImport%'
Order By
TableSizeKb desc;
The results are as follows:
TableName TableSizeKbIndexSizeKb
FilesImportedNonClust3216
FilesImportedNoKey16NULL
FilesImportedClust 16NULL
I then put an additional 300 records into each table, with the results as follows:
TableNameTableSizeKbIndexSizeKb
FilesImportedNonClust5632
FilesImportedClust 32NULL
FilesImportedNoKey24NULL
My questions:
1) If a Clustered index is not specified, doesn't SQL Server create its own clustered index behind the scenes?
2) Why is the NonClusterered table (with a primary key) versus the Clustered table (with a primary key) about 2x the size?
3) Why does the table without the Primary Key take up less space than the table that is Clustered?
4) Can anyone point me to online resource (or something I missed in BOL) to read up on this?
Thanks! 🙂
Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Steve Jones - SSC Editor SSC Guru Points: 734449 More actions · Answer 1

Very interesting.

SQL does mark the physical records in a heap, AFAIK, with something to be sure they are unique. I have heard before that the optimizer likes clustered indexes, so they are recommended.

A table with one (or few rows), might take up more space with a clustered index, but the time it takes to discuss this isn't worth the space used. Just create a clustered index. It isn't enough space to worry about.

The nonclustered index for the PK means you create a heap, and then another index for the PK. A clustered PK is just the heap converted to clustered, no extra index.

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 2

Add a hearty "me too" to everything Steve said.

This was just posted on Brad Magehee's blog today:

http://technet.microsoft.com/en-us/library/cc917672.aspx

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

Steve Jones - Editor (2/20/2009)
SQL does mark the physical records in a heap, AFAIK, with something to be sure they are unique.

RID (Row Identifier). An 8 byte combination of File ID, Page ID and Slot Index. It's stored in nonclusters as the row's address, not in the heap itself.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

DB_Newbie2007 SSCarpal Tunnel Points: 4446 More actions · Answer 4

I did finally find some information in BOL (Creating Nonclustered Indexes) which helps explain things for me. For the benefit of people who do not know what a "heap" is (or are programmers new to SQL Server): simply (not getting into Extends, Pages, B-Trees, etc) a heap is just an "un-ordered" storage of data on the hard drive, versus a clustered index storing the data in a specified order.

[FROM BOL]: Nonclustered indexes are implemented in the following ways:

PRIMARY KEY and UNIQUE constraints

When you create a PRIMARY KEY constraint, a unique clustered index on the column or columns is automatically created if a clustered index on the table does not already exist and you do not specify a unique nonclustered index. The primary key column cannot allow NULL values.

When you create a UNIQUE constraint, a unique nonclustered index is created to enforce a UNIQUE constraint by default. You can specify a unique clustered index if a clustered index on the table does not already exist. For more information, see PRIMARY KEY Constraints and UNIQUE Constraints.

Index independent of a constraint

By default, a nonclustered index is created if clustered is not specified. The maximum number of nonclustered indexes that can be created per table is 249. This includes any indexes created by PRIMARY KEY or UNIQUE constraints, but does not include XML indexes.

Nonclustered index on an indexed view

After a unique clustered index has been created on a view, nonclustered indexes can be created. For more information, see Creating Indexed Views.

Thanks for all of the inputs everyone.... it helped me do more search(s) online for data to explain this! 🙂 🙂

Argue for your limitations, and sure enough they're yours (Richard Bach, Illusions)

prabhu.st SSCrazy Points: 2552 More actions · Answer 5

Your Question 1):
If a Clustered index is not specified, doesn't SQL Server create its own clustered index behind the scenes?

answer:

if you miss to specify the Index Type, there is no implicit speicifation from Mircosoft to create a Clustered/NonClustered Index behind the screen as it is purely depends on the user and requirement of the table creation, so simply it will be a "Heap" as you told.

2) Why is the NonClusterered table (with a primary key) versus the Clustered table (with a primary key) about 2x the size?

answer:

as like "Gail" & "Steve Jones" told you, Non Clustered Index would occupy more space as it holds one Index and one Row Identifier

whereas the clustered index dont have that structure so it will occupy less space and faster too..

--- XXX ---

I believe I have just given my opinion and not more anything better than the three experts above (Steve,Grant,Gail)

Regards,

Prabhu