Index selection + C or NC ?

  • Hi All,

    I need your people guidance to create the indexes for an existing table which is used in production. Actually it is an exception table, if any exception happened, and then the corresponding error message with the details will be insered into this table.

    Table Structure:

    create table exception

    (

    v_fnd_id int,

    fnd_id int,

    symbol,

    symbol_type char(2),

    feed_dt datetime,

    process_dt datetime,

    feed_num int,

    err_cd int

    )

    insert into exception

    select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',10,22

    union

    select 1,100,'AR','SS','20/jan/2009','02/feb/2009',10,22

    union

    select 1,100,'AS','MM','20/jan/2009','02/feb/2009',10,22

    union

    select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',11,22

    union

    select 1,100,'AR','SS','20/jan/2009','02/feb/2009',11,22

    union

    select 1,100,'AS','MM','20/jan/2009','02/feb/2009',11,22

    union

    select 1,100,'AQ','LL','20/jan/2009','02/feb/2009',12,22

    union

    select 1,100,'AR','SS','20/jan/2009','02/feb/2009',12,22

    union

    select 1,100,'AS','MM','20/jan/2009','02/feb/2009',12,22

    union

    select 1,100,'AQ','LL','20/jan/2009','04/feb/2009',13,22

    union

    select 1,100,'AR','SS','20/jan/2009','04/feb/2009',13,22

    union

    select 1,100,'AS','MM','20/jan/2009','04/feb/2009',13,22

    union

    select 1,100,'AQ','LL','20/jan/2009','06/feb/2009',15,22

    union

    select 1,100,'AR','SS','20/jan/2009','06/feb/2009',15,22

    union

    select 1,100,'AS','MM','20/jan/2009','06/feb/2009',15,22

    union

    select 1,200,'BS','MM','20/jan/2009','02/feb/2009',12,22

    union

    select 1,200,'BQ','LL','20/jan/2009','04/feb/2009',13,22

    union

    select 1,200,'BR','SS','20/jan/2009','04/feb/2009',13,22

    union

    select 1,200,'BS','MM','20/jan/2009','04/feb/2009',13,22

    union

    select 1,200,'BQ','LL','20/jan/2009','06/feb/2009',10,22

    union

    select 1,200,'BR','SS','20/jan/2009','06/feb/2009',10,22

    union

    select 1,200,'BS','MM','20/jan/2009','06/feb/2009',10,22

    union

    select 2,600,'CQ','LL','20/jan/2009','03/feb/2009',10,22

    union

    select 2,600,'CR','SS','20/jan/2009','03/feb/2009',10,22

    union

    select 2,600,'CS','MM','20/jan/2009','03/feb/2009',10,22

    The thing is, the table contain 1000000 records.

    which index should we create to this table? As of my knowledge we need to create non clustered index. Am i correct?

    If I am correct, which are all columns should I have include in the index.

    Shall we create more than one composite index? if yes, which combination should be best?

    Please let me know some good books name to know about 'INDEX SELECTION'

    Inputs are Highly Appreciable!

    karthik

  • I would be inclined to put the clustered index on the date. But this really depends on how you will be querying the table.

    Are you able to post the queries that you will be using to extract data.

  • It really depends on what you'll be using the data for, and how you'll be querying it. Since that's what indexes are for, it's necessary to know that before offering any advice on them.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Sure.

    create proc p1_t

    as

    begin

    --#1)

    select a.sec_id, b.symbol, b.prtf_id, c.p_id,a.flag into #symb_map

    from funds a, Mapping b, p_identifier c

    where a.sec_id = b.symbol

    and symbol_type = c.id_typ_cd

    and a.sec_id = c.symbol

    and c.id_typ_cd = 'MS'

    and b.prtf_id = c.prtf_id

    --#2)

    select count(distinct f.sec_id)

    from #symb_map f

    left join exception e on e.symbol = f.symbol

    and err_cd in (9,22,23,24,34)

    and feed_num = 23

    where e.symbol is null

    end

    This table is used by 7 stored procedurs, all of them used

    symbol column to join

    err_cd column to check the err_code

    feed_num column to check the given input parameter value.

    If you look at the above query #2, all the procedures follow the same logic but with some differences, say for example , some of them will check err_cd in (22,24) or err_cd in (22,23,24) something like that there should be some differences.

    create noncluctered index id1 on exception(symbol,err_cd,feed_num)

    create nonclustered index id2 on exception(err_cd)

    will it be a right choice? Please correct me if i am wrong.

    Note:

    1) Table will get inserted on daily basis. The maximum record count would be 2500 and minimum record would be 100.

    2) No deletion will happen i mean DELETE command are not passed to this table.

    3) But They will truncate the table once for a month and the archieve records will go into a history table.

    4) No update operation happen in this table.

    karthik

  • I will give you some more example.

    --------------------------------------------------

    create proc p2_t

    (

    @feed_num int

    )

    as

    begin

    select a.*

    from exception a ,err_lkp b

    where a.err_cd=b.err_cd

    and a.err_cd <> 36

    and a.feed_num = @feed_num

    select symbol, feed_dt = convert(varchar(10),'')

    into #t1

    from identifier

    update #t1

    set feed_dt=a.feed_dt

    from exception a

    where a.feed_num = @feed_num

    Select * from #t1

    end

    ----------------------------------------------------

    karthik

  • Judging by these, I'd make feed_num the first column in the index, then err_cd, then symbol. That should satisfy all of the queries you've posted so far. One index, in that sequence. Try that, see how it works.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I tested the below indexes

    create nonclustered index sk_exception on exception(feed_num,err_cd,symbol)

    create nonclustered index sk_exception on exception(symbol,err_cd,feed_num)

    I didn't see any difference between them. But which combination will give good performance in the long run ?

    karthik

  • can you tell me how should i know the cost of both indexes ?

    karthik

  • for #1, I have noticed backward scan

    for #2,I have noticed forward scan

    which one is good? i got confused.....

    karthik

  • for #1, I have noticed backward scan

    Total Logical Read: 65818

    Details:

    Tables Logical Read

    summary1

    exception2

    funds6466

    Mapping4309

    identifier134

    #symb_m010070100033911052112

    #symb_m010070100033911057

    Worktable12146

    #symb_m010070100033911057

    asset_alloc4232

    Worktable16314

    #symb_m010070100033911057

    asset_alloc4216

    Worktable12156

    #symb_m010070100033911057

    exception8392

    Worktable11904

    #symb_m010070100033911052295

    feed_output474

    exception4

    Worktable2250

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception2

    Worktable27

    Worktable17

    #symb_m010070100033911059

    feed_output716

    Worktable1376

    #symb_m010070100033911052295

    feed_output36

    exception4

    Worktable232

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception2

    Worktable27

    Worktable17

    #symb_m010070100033911059

    Worktable1376

    #symb_m010070100033911057146

    feed_output17

    Worktable11962

    #symb_m010070100033911052295

    feed_output710

    exception4

    Worktable2233

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception2

    Worktable27

    Worktable17

    #symb_m010070100033911052295

    feed_output474

    exception4

    Worktable2250

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception2

    Worktable27

    Worktable17

    for #2,I have noticed forward scan

    Total Logical Read: 62684

    Details:

    Tables Logical Read

    summary1

    exception113

    funds6,466

    Mapping4309

    identifier134

    #symb_m010070100033911052112

    #symb_m010070100033911057

    Worktable12146

    #symb_m010070100033911057

    asset_alloc4,232

    Worktable16314

    #symb_m010070100033911057

    asset_alloc4,216

    Worktable12156

    #symb_m010070100033911057

    exception4,267

    Worktable11,904

    #symb_m010070100033911052295

    feed_output474

    exception113

    Worktable2250

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception113

    Worktable27

    Worktable17

    #symb_m010070100033911059

    feed_output716

    Worktable1376

    #symb_m010070100033911052295

    feed_output36

    exception113

    Worktable232

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception113

    Worktable27

    Worktable17

    #symb_m010070100033911059

    Worktable1376

    #symb_m010070100033911057146

    feed_output17

    Worktable11,962

    #symb_m010070100033911052295

    feed_output710

    exception113

    Worktable2233

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception113

    Worktable27

    Worktable17

    #symb_m010070100033911052295

    feed_output474

    exception113

    Worktable2250

    Worktable1269

    #symb_m010070100033911050

    feed_output0

    exception113

    Worktable27

    Worktable17

    I am very much confused to take the conclusion which method is good and faster?

    karthik

  • I did some more research about FORWARD & BACKWARD scan.

    Test #1:

    select name,id from sp_help sysobjects

    order by id

    Query Plan:

    |--Clustered Index Scan(OBJECT:([IPStatic].[dbo].[sysobjects].[sysobjects]), ORDERED FORWARD)

    Test #2:

    select name,id from sp_help sysobjects

    order by id DESC

    Query Plan:

    |--Clustered Index Scan(OBJECT:([IPStatic].[dbo].[sysobjects].[sysobjects]), ORDERED BACKWARD)

    My Questions:

    1) How BACKWARD SCAN work internally?

    2) Which scan method is best when we look performance?

    3) Which scan method will require more memory space?

    4) When we go BACKWARD scan ? I mean which situation we need to use this scan type.

    karthik

  • Test your queries, use "set statistics time on" and "set statistics io on", and see which index works best for you.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I tested both the indexes. Both of them took 2 seconds to complete the execution.

    But it is varying in IO. I have already posted the logical read details. Can you see my earlier reply?

    karthik

  • I must be misunderstanding something in your post. You list two different queries, and you seem to be listing the reads for both queries. They aren't the same query, so why would it matter if they have different reads? At least, that's how the posts seem to read to me. What am I missing?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I have created a non clustered index to exception table.

    But i used two different column order.

    1)create nonclustered index sk_exception on exception(feed_num,err_cd,symbol)

    I created the above index first.

    then i executed the procedure.

    exec p1_t

    I have noticed the query plan.

    query plan:

    query optimizer used backward scan method to scan the exception table.

    Total Logical Read: 65818

    Details:

    Tables Logical Read

    summary 1

    exception 2

    funds 6466

    Mapping 4309

    identifier 134

    #symb_m01007010003391105 2112

    #symb_m01007010003391105 7

    Worktable1 2146

    #symb_m01007010003391105 7

    asset_alloc 4232

    Worktable1 6314

    #symb_m01007010003391105 7

    asset_alloc 4216

    Worktable1 2156

    #symb_m01007010003391105 7

    exception 8392

    Worktable1 1904

    #symb_m01007010003391105 2295

    feed_output 474

    exception 4

    Worktable2 250

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 2

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 9

    feed_output 716

    Worktable1 376

    #symb_m01007010003391105 2295

    feed_output 36

    exception 4

    Worktable2 32

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 2

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 9

    Worktable1 376

    #symb_m01007010003391105 7146

    feed_output 17

    Worktable1 1962

    #symb_m01007010003391105 2295

    feed_output 710

    exception 4

    Worktable2 233

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 2

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 2295

    feed_output 474

    exception 4

    Worktable2 250

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 2

    Worktable2 7

    Worktable1 7

    I dropped the above index and created the below index.

    2)create nonclustered index sk_exception on exception(symbol,err_cd,feed_num)

    you can the difference in the column order.

    I have executed the procedure p1_t once again and noticed the query plan.

    query plan:

    query optimizer used forward scan method to scan the exception table.

    Total Logical Read: 62684

    Details:

    Tables Logical Read

    summary 1

    exception 113

    funds 6,466

    Mapping 4309

    identifier 134

    #symb_m01007010003391105 2112

    #symb_m01007010003391105 7

    Worktable1 2146

    #symb_m01007010003391105 7

    asset_alloc 4,232

    Worktable1 6314

    #symb_m01007010003391105 7

    asset_alloc 4,216

    Worktable1 2156

    #symb_m01007010003391105 7

    exception 4,267

    Worktable1 1,904

    #symb_m01007010003391105 2295

    feed_output 474

    exception 113

    Worktable2 250

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 113

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 9

    feed_output 716

    Worktable1 376

    #symb_m01007010003391105 2295

    feed_output 36

    exception 113

    Worktable2 32

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 113

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 9

    Worktable1 376

    #symb_m01007010003391105 7146

    feed_output 17

    Worktable1 1,962

    #symb_m01007010003391105 2295

    feed_output 710

    exception 113

    Worktable2 233

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 113

    Worktable2 7

    Worktable1 7

    #symb_m01007010003391105 2295

    feed_output 474

    exception 113

    Worktable2 250

    Worktable1 269

    #symb_m01007010003391105 0

    feed_output 0

    exception 113

    Worktable2 7

    Worktable1 7

    I was confused to decide which index is good. I hope i have explained clearly.

    karthik

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply