clustered or nonclustered index for UserId?

Question

Post reply

clustered or nonclustered index for UserId?

adnani

Old Hand

Points: 323
More actions
August 17, 2010 at 9:47 am

#239037

We have a Table with this structure
CREATE TABLE tblMessages (
[Id] [bigint] IDENTITY (1, 1) NOT NULL ,
[UserId] [int] ,
[Message] [ntext] ,
[MsgTime] [datetime]
)
Table tblMessages has one milion rows and 80,000 new rows per day , 20,000 delete rows per day.
every user can have 200 ~10,000 rows in this table.
Id is Unique and UserId is not.
many of requests are( more 80%)
select * from tblMessages Where UserId=@UserId Order By Id
also this request for delete or update
delete/update ... where UserId=@UserId and Id=@Id
Which type of index is better for UserId? ( clustered or nonclustered ) and Which type of index is better for Id?
I tested it before , clustered index on userId is fastest for select . but i think clustered index on userid maybe increase I/O when INSERT/DELETE and fragmentation. but I am not sure.
Id is IDENTITY (1,2,3,4,5,...) but UserID is not ( 1,982,452,1,1,4,982,982,4,4,4,4,55,4,6,....)
Which type of index is better for UserId?

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

crazy4sql SSCoach Points: 19590 More actions · Answer 1

from performance perspective, it should be clustered index with appropriate fill factor.

Also make sure the index is being updated regularly for threshold limit like :-

if fregmentation is <30 then reorganise else rebuild with online = on

----------
Ashish

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

I would start with a clustered index on ID, then create a nonclustered index on (UserID, ID) and see how SQL uses them.

How many rows will a match on UserID return?

Why are you using ntext? It's deprecated and will be removed in a future version. Use nvarchar(max) instead.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 3

ashish.kuriyal (8/17/2010)
from performance perspective, it should be clustered index with appropriate fill factor.
Also make sure the index is being updated regularly for threshold limit like :-
if fregmentation is <30 then reorganise else rebuild with online = on

I wouldn't put the clustered index on the UserID... it'll just make a whole lot of fragmentation and pages splits even if you do have a respectable fill factor without much of a speed advantage in the long run. I'd do like Gail suggests on this one.

You don't actually need to make the nonclustered index as (UserID,ID) because ID (the clustered index) will automatically be added to the UserID but it does make it painfully obvious to the casual user that doesn't know that.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Steve Jones - SSC Editor SSC Guru Points: 734434 More actions · Answer 4

I agree with Jeff and Gail, and also, change ntext to varchar(max)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 5

Jeff Moden (8/17/2010)
You don't actually need to make the nonclustered index as (UserID,ID) because ID (the clustered index) will automatically be added to the UserID but it does make it painfully obvious to the casual user that doesn't know that.

If I'm adding the clustering key to an index, I prefer to be explicit about putting it there. That way, if someone adds another column to the index it doesn't change the order of columns that I intended. Also makes it clearer for others and prevents amusing fun of the clustered index gets changed at a later date.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

adnani Old Hand Points: 323 More actions · Answer 6

thanks.

this is ntext because this database upgraded from SQL 2000.

but text or varchar (max) is not problem.

I have some tables in othar databases that use varchar(1000) or varchar(2000). the question is I must clusterd index on UserId or clustered index on ID.

many of requests are

select * from tblMessages Where UserId=@UserId Order by Id desc

** average of return rows is 100~200 rows.

also we have one 400,000 users .

when clustered index on UserId, all of rows that related with one user under each other. so reading of them is fast. but as I said I think writing of them increases I/O usage and fragmention.

I use clustered index on userid ,sometimes I check Index fragmention, and I think this table is fragmented very soon.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 7

I stand by the index suggestions I made above.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003857 More actions · Answer 8

GilaMonster (8/17/2010)
Jeff Moden (8/17/2010)
You don't actually need to make the nonclustered index as (UserID,ID) because ID (the clustered index) will automatically be added to the UserID but it does make it painfully obvious to the casual user that doesn't know that.
If I'm adding the clustering key to an index, I prefer to be explicit about putting it there. That way, if someone adds another column to the index it doesn't change the order of columns that I intended. Also makes it clearer for others and prevents amusing fun of the clustered index gets changed at a later date.

Agreed on the clarity thing. That's why I brought it up... was trying to head off all the folks that know that ID would be added automatically. Good point on the prevention of change on the order of columns, as well.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)