Index selection + C or NC ?

Question

Index selection + C or NC ?

karthik M

One Orange Chip

Points: 29139
More actions
February 20, 2009 at 9:03 am

#73461

Hi All,
I need your people guidance to create the indexes for an existing table which is used in production. Actually it is an exception table, if any exception happened, and then the corresponding error message with the details will be insered into this table.
Table Structure:
create table exception
(
v_fnd_id int,
fnd_id int,
symbol,
symbol_type char(2),
feed_dt datetime,
process_dt datetime,
feed_num int,
err_cd int
)
insert into exception
select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',10,22
union
select 1,100,'AR','SS','20/jan/2009','02/feb/2009',10,22
union
select 1,100,'AS','MM','20/jan/2009','02/feb/2009',10,22
union
select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',11,22
union
select 1,100,'AR','SS','20/jan/2009','02/feb/2009',11,22
union
select 1,100,'AS','MM','20/jan/2009','02/feb/2009',11,22
union
select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',12,22
union
select 1,100,'AR','SS','20/jan/2009','02/feb/2009',12,22
union
select 1,100,'AS','MM','20/jan/2009','02/feb/2009',12,22
union
select 1,100,'AQ','LL','20/jan/2009','04/feb/2009',13,22
union
select 1,100,'AR','SS','20/jan/2009','04/feb/2009',13,22
union
select 1,100,'AS','MM','20/jan/2009','04/feb/2009',13,22
union
select 1,100,'AQ','LL','20/jan/2009','06/feb/2009',15,22
union
select 1,100,'AR','SS','20/jan/2009','06/feb/2009',15,22
union
select 1,100,'AS','MM','20/jan/2009','06/feb/2009',15,22
union
select 1,200,'BS','MM','20/jan/2009','02/feb/2009',12,22
union
select 1,200,'BQ','LL','20/jan/2009','04/feb/2009',13,22
union
select 1,200,'BR','SS','20/jan/2009','04/feb/2009',13,22
union
select 1,200,'BS','MM','20/jan/2009','04/feb/2009',13,22
union
select 1,200,'BQ','LL','20/jan/2009','06/feb/2009',10,22
union
select 1,200,'BR','SS','20/jan/2009','06/feb/2009',10,22
union
select 1,200,'BS','MM','20/jan/2009','06/feb/2009',10,22
union
select 2,600,'CQ','LL','20/jan/2009','03/feb/2009',10,22
union
select 2,600,'CR','SS','20/jan/2009','03/feb/2009',10,22
union
select 2,600,'CS','MM','20/jan/2009','03/feb/2009',10,22
The thing is, the table contain 1000000 records.
which index should we create to this table? As of my knowledge we need to create non clustered index. Am i correct?
If I am correct, which are all columns should I have include in the index.
Shall we create more than one composite index? if yes, which combination should be best?
Please let me know some good books name to know about 'INDEX SELECTION'
Inputs are Highly Appreciable!
karthik

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply

steveb. SSC-Forever Points: 46733 More actions · Answer 1

I would be inclined to put the clustered index on the date. But this really depends on how you will be querying the table.

Are you able to post the queries that you will be using to extract data.

GSquared SSC Guru Points: 260824 More actions · Answer 2

It really depends on what you'll be using the data for, and how you'll be querying it. Since that's what indexes are for, it's necessary to know that before offering any advice on them.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

karthik M One Orange Chip Points: 29139 More actions · Answer 3

Sure.

create proc p1_t

as

begin

--#1)

select a.sec_id, b.symbol, b.prtf_id, c.p_id,a.flag into #symb_map

from funds a, Mapping b, p_identifier c

where a.sec_id = b.symbol

and symbol_type = c.id_typ_cd

and a.sec_id = c.symbol

and c.id_typ_cd = 'MS'

and b.prtf_id = c.prtf_id

--#2)

select count(distinct f.sec_id)

from #symb_map f

left join exception e on e.symbol = f.symbol

and err_cd in (9,22,23,24,34)

and feed_num = 23

where e.symbol is null

end

This table is used by 7 stored procedurs, all of them used

symbol column to join

err_cd column to check the err_code

feed_num column to check the given input parameter value.

If you look at the above query #2, all the procedures follow the same logic but with some differences, say for example , some of them will check err_cd in (22,24) or err_cd in (22,23,24) something like that there should be some differences.

create noncluctered index id1 on exception(symbol,err_cd,feed_num)

create nonclustered index id2 on exception(err_cd)

will it be a right choice? Please correct me if i am wrong.

Note:

1) Table will get inserted on daily basis. The maximum record count would be 2500 and minimum record would be 100.

2) No deletion will happen i mean DELETE command are not passed to this table.

3) But They will truncate the table once for a month and the archieve records will go into a history table.

4) No update operation happen in this table.

karthik

karthik M One Orange Chip Points: 29139 More actions · Answer 4

I will give you some more example.

--------------------------------------------------

create proc p2_t

(

@feed_num int

)

as

begin

select a.*

from exception a ,err_lkp b

where a.err_cd=b.err_cd

and a.err_cd <> 36

and a.feed_num = @feed_num

select symbol, feed_dt = convert(varchar(10),'')

into #t1

from identifier

update #t1

set feed_dt=a.feed_dt

from exception a

where a.feed_num = @feed_num

Select * from #t1

end

----------------------------------------------------

karthik

GSquared SSC Guru Points: 260824 More actions · Answer 5

Judging by these, I'd make feed_num the first column in the index, then err_cd, then symbol. That should satisfy all of the queries you've posted so far. One index, in that sequence. Try that, see how it works.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

karthik M One Orange Chip Points: 29139 More actions · Answer 6

I tested the below indexes

create nonclustered index sk_exception on exception(feed_num,err_cd,symbol)

create nonclustered index sk_exception on exception(symbol,err_cd,feed_num)

I didn't see any difference between them. But which combination will give good performance in the long run ?

karthik

karthik M One Orange Chip Points: 29139 More actions · Answer 7

can you tell me how should i know the cost of both indexes ?

karthik

karthik M One Orange Chip Points: 29139 More actions · Answer 8

for #1, I have noticed backward scan

for #2,I have noticed forward scan

which one is good? i got confused.....

karthik

karthik M One Orange Chip Points: 29139 More actions · Answer 9

for #1, I have noticed backward scan

Total Logical Read: 65818

Details:

Tables Logical Read

summary1

exception2

funds6466

Mapping4309

identifier134

#symb_m010070100033911052112

#symb_m010070100033911057

Worktable12146

#symb_m010070100033911057

asset_alloc4232

Worktable16314

#symb_m010070100033911057

asset_alloc4216

Worktable12156

#symb_m010070100033911057

exception8392

Worktable11904

#symb_m010070100033911052295

feed_output474

exception4

Worktable2250

Worktable1269

#symb_m010070100033911050

feed_output0

exception2

Worktable27

Worktable17

#symb_m010070100033911059

feed_output716

Worktable1376

#symb_m010070100033911052295

feed_output36

exception4

Worktable232

Worktable1269

#symb_m010070100033911050

feed_output0

exception2

Worktable27

Worktable17

#symb_m010070100033911059

Worktable1376

#symb_m010070100033911057146

feed_output17

Worktable11962

#symb_m010070100033911052295

feed_output710

exception4

Worktable2233

Worktable1269

#symb_m010070100033911050

feed_output0

exception2

Worktable27

Worktable17

#symb_m010070100033911052295

feed_output474

exception4

Worktable2250

Worktable1269

#symb_m010070100033911050

feed_output0

exception2

Worktable27

Worktable17

for #2,I have noticed forward scan

Total Logical Read: 62684

Details:

Tables Logical Read

summary1

exception113

funds6,466

Mapping4309

identifier134

#symb_m010070100033911052112

#symb_m010070100033911057

Worktable12146

#symb_m010070100033911057

asset_alloc4,232

Worktable16314

#symb_m010070100033911057

asset_alloc4,216

Worktable12156

#symb_m010070100033911057

exception4,267

Worktable11,904

#symb_m010070100033911052295

feed_output474

exception113

Worktable2250

Worktable1269

#symb_m010070100033911050

feed_output0

exception113

Worktable27

Worktable17

#symb_m010070100033911059

feed_output716

Worktable1376

#symb_m010070100033911052295

feed_output36

exception113

Worktable232

Worktable1269

#symb_m010070100033911050

feed_output0

exception113

Worktable27

Worktable17

#symb_m010070100033911059

Worktable1376

#symb_m010070100033911057146

feed_output17

Worktable11,962

#symb_m010070100033911052295

feed_output710

exception113

Worktable2233

Worktable1269

#symb_m010070100033911050

feed_output0

exception113

Worktable27

Worktable17

#symb_m010070100033911052295

feed_output474

exception113

Worktable2250

Worktable1269

#symb_m010070100033911050

feed_output0

exception113

Worktable27

Worktable17

I am very much confused to take the conclusion which method is good and faster?

karthik

karthik M One Orange Chip Points: 29139 More actions · Answer 10

I did some more research about FORWARD & BACKWARD scan.

Test #1:

select name,id from sp_help sysobjects

order by id

Query Plan:

|--Clustered Index Scan(OBJECT:([IPStatic].[dbo].[sysobjects].[sysobjects]), ORDERED FORWARD)

Test #2:

select name,id from sp_help sysobjects

order by id DESC

Query Plan:

|--Clustered Index Scan(OBJECT:([IPStatic].[dbo].[sysobjects].[sysobjects]), ORDERED BACKWARD)

My Questions:

1) How BACKWARD SCAN work internally?

2) Which scan method is best when we look performance?

3) Which scan method will require more memory space?

4) When we go BACKWARD scan ? I mean which situation we need to use this scan type.

karthik

GSquared SSC Guru Points: 260824 More actions · Answer 11

Test your queries, use "set statistics time on" and "set statistics io on", and see which index works best for you.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

karthik M One Orange Chip Points: 29139 More actions · Answer 12

I tested both the indexes. Both of them took 2 seconds to complete the execution.

But it is varying in IO. I have already posted the logical read details. Can you see my earlier reply?

karthik

GSquared SSC Guru Points: 260824 More actions · Answer 13

I must be misunderstanding something in your post. You list two different queries, and you seem to be listing the reads for both queries. They aren't the same query, so why would it matter if they have different reads? At least, that's how the posts seem to read to me. What am I missing?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

karthik M One Orange Chip Points: 29139 More actions · Answer 14

I have created a non clustered index to exception table.

But i used two different column order.

1)create nonclustered index sk_exception on exception(feed_num,err_cd,symbol)

I created the above index first.

then i executed the procedure.

exec p1_t

I have noticed the query plan.

query plan:

query optimizer used backward scan method to scan the exception table.

Total Logical Read: 65818
Details:
Tables Logical Read
summary 1
exception 2
funds 6466
Mapping 4309
identifier 134
#symb_m01007010003391105 2112
#symb_m01007010003391105 7
Worktable1 2146
#symb_m01007010003391105 7
asset_alloc 4232
Worktable1 6314
#symb_m01007010003391105 7
asset_alloc 4216
Worktable1 2156
#symb_m01007010003391105 7
exception 8392
Worktable1 1904
#symb_m01007010003391105 2295
feed_output 474
exception 4
Worktable2 250
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 2
Worktable2 7
Worktable1 7
#symb_m01007010003391105 9
feed_output 716
Worktable1 376
#symb_m01007010003391105 2295
feed_output 36
exception 4
Worktable2 32
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 2
Worktable2 7
Worktable1 7
#symb_m01007010003391105 9
Worktable1 376
#symb_m01007010003391105 7146
feed_output 17
Worktable1 1962
#symb_m01007010003391105 2295
feed_output 710
exception 4
Worktable2 233
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 2
Worktable2 7
Worktable1 7
#symb_m01007010003391105 2295
feed_output 474
exception 4
Worktable2 250
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 2
Worktable2 7
Worktable1 7

I dropped the above index and created the below index.

2)create nonclustered index sk_exception on exception(symbol,err_cd,feed_num)

you can the difference in the column order.

I have executed the procedure p1_t once again and noticed the query plan.

query plan:

query optimizer used forward scan method to scan the exception table.

Total Logical Read: 62684
Details:
Tables Logical Read
summary 1
exception 113
funds 6,466
Mapping 4309
identifier 134
#symb_m01007010003391105 2112
#symb_m01007010003391105 7
Worktable1 2146
#symb_m01007010003391105 7
asset_alloc 4,232
Worktable1 6314
#symb_m01007010003391105 7
asset_alloc 4,216
Worktable1 2156
#symb_m01007010003391105 7
exception 4,267
Worktable1 1,904
#symb_m01007010003391105 2295
feed_output 474
exception 113
Worktable2 250
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 113
Worktable2 7
Worktable1 7
#symb_m01007010003391105 9
feed_output 716
Worktable1 376
#symb_m01007010003391105 2295
feed_output 36
exception 113
Worktable2 32
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 113
Worktable2 7
Worktable1 7
#symb_m01007010003391105 9
Worktable1 376
#symb_m01007010003391105 7146
feed_output 17
Worktable1 1,962
#symb_m01007010003391105 2295
feed_output 710
exception 113
Worktable2 233
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 113
Worktable2 7
Worktable1 7
#symb_m01007010003391105 2295
feed_output 474
exception 113
Worktable2 250
Worktable1 269
#symb_m01007010003391105 0
feed_output 0
exception 113
Worktable2 7
Worktable1 7

I was confused to decide which index is good. I hope i have explained clearly.

karthik