Question On Indexing

  • Hi All,

    I have a table that contains 77 million rows with 26 columns. I have a clustered index that's scripted as follows:

    CREATE CLUSTERED INDEX [ix_join] ON [dbo].[BO_FIN_HISTORY]

    (

    [BOFH_CLIENT_NO] ASC,

    [BOFH_CHECK_DIG] ASC,

    [BOFH_SUFFIX] ASC,

    [Account] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    GO

    When I execute the following query the data is returned in 0 seconds:

    --Query1

    select *

    from bo_fin_history

    where BOFH_CLIENT_NO = 8000002

    and BOFH_CHECK_DIG = 4

    and BOFH_SUFFIX = 221

    The query below executes in 7 seconds:

    --Query 2

    select *

    from bo_fin_history

    where account = '00000000008000002422101'

    Both queries have a where clause that use columns from the same clustered index but the one query takes longer than the other. Why is this? I've included the execution plans for both queries. There is a big difference between the 2 and I'm hoping someone can tell me why. Btw, the first time I ran "Query 2" it took 53 seconds, it was only when I included the execution plan and ran it again that it took 7 seconds, why would this happen?

    Hope that y'all can give me a hand on this one.

    Cheers 🙂

    ----------------------------------------------------------------------------------------------
    Struggling to find the question?
    http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspx

  • If you look at the 2nd query if see a key lookup. That happens when the server uses an index to find the correct data to match the where clause then goes back into the clustered index to find the required columns.

    That's why you get a missing index warning (I'm using 2008).

    Now without the real query, it's pretty much impossible to optimize this because we can't offer covering indexes which is most likely what you need.

  • --Query 2

    select *

    from bo_fin_history

    where account = '00000000008000002422101'

    Hi,

    Ur second query ll not take the C.I index "Ix_join"

    See the execution plan the optimizer took the index "Ix_post" that y the key lookup happened.

    Can u post non C.I index "Ix_post" structure.

    Also the predicate was missing "account"

    Edit:The Actual no is very low only 93 rows

    Muthukkumaran Kaliyamoorthy
    https://www.sqlserverblogforum.com/

  • Simply put, the first query uses the leading edge of the index you posted. The first column in the index is part of the Where clause. That's optimal use of the index.

    The second query would have to dig several levels deep in the index, which means it's probably more efficient to use a different index and a bookmark lookup, so that's what SQL Server does.

    On the 53 seconds vs 7 seconds, that's just caching. The first run was uncached, while the second was cached because of the first run. It's very normal for subsequent runs of a query to be much faster than the first run.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • WilliamBendall,

    Please read the Grant Fritchey's article about execution Plan [/url]

    Muthukkumaran Kaliyamoorthy
    https://www.sqlserverblogforum.com/

  • SQL can only do an index seek when the where clause references a left-based subset of the index key. In your first query, that's exactly what it's doing, so that's a simple index seek.

    The second query references a column that's in the cluster, but it's not the left-most column. Hence that cannot use an index seek on the clustered index. It can seek on a different index (if there is one that has that column as the left-most column of the key), but that may include key lookups (which are slow)

    Have a look at this series on indexes. http://www.sqlservercentral.com/articles/Indexing/68439/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thank you all for your input.

    I'm getting a much better picture on how indexing works 😉 For the table I used in my original post it sounds like it would be more beneficial for me to then remove the account field from the clustered index. And then create a non clustered index with just the account field.

    One more question, if I query the table as such:

    SELECT *

    FROM BO_FIN_HISTORY

    WHERE Account = '00000000008000002422101' AND POST_DATE >= '2010/01/01'

    Is it more beneficial to add 2 non clustered indexes (one with Account and one with POST_DATE) or just one non clustered index containing both columns when 90% of the time I query Account in conjunction with POST_DATE? Or, is it just a matter of testing to see which of the 2 options is going to be better?

    Once again thanks for all your advice.

    ----------------------------------------------------------------------------------------------
    Struggling to find the question?
    http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspx

  • How often do you filter just by account? How often do you filter by just postdate?

    Oh, and do you really need every single column in the table? If not, lose the SELECT *. Even if you do, lost the SELECT * and specify the columns.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • One index with both columns will almost certainly be better. The order for them in the index will depend on how often you just use one or the other of those columns.

    Also, if you have specific columns that you want to have in the Select clause, which aren't in the Where clause, take a look at the Include option for indexes. That can speed the whole thing up tremendously.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks I'll try out the included columns. Appreciate the advice.

    ----------------------------------------------------------------------------------------------
    Struggling to find the question?
    http://weblogs.sqlteam.com/jeffs/archive/2008/05/13/question-needed-not-answer.aspx

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply