Why is my SELECT query slower when the table is indexed?

  • Hello,

    I am new to SQL and am slowly llearning, so please forgive me if I ask a dumb question.

    I was reading through a couple of articles on the web relating to indexes and where they should be used and why. OK, so I get the idea.

    However, upon conducting some tests using a simple single column query (I can provide the query and underlying table structure if required) on a single table I was perplexed as to why the query ran faster when the table was completely unindexed.

    Here's some stats to illustrate:

    * Unindexed query = 23 seconds

    * With primary key (not on SELECTed column though) took 44 seconds

    * With index on SELECTed column took 32 seconds

    Perhaps I'm missing something here, so would appreciate any help.

  • Indexing tables is more of an art than a formula sometimes 🙂 Off the top of my head I'm thinking that the index has not been used enough to allow the SQL query optimizer to create the best execution plan for the query engine yet, and/or not enough statistics have been collected yet for optimum use of the index. I believe the number of columns, number of rows, and the selectivity of the indexed column's data also have a lot to do with it. If you have a very small table with few columns, and/or the data in the newly indexed column was already highly selectable (very unique) then you could see a degrading rather than an improvement.

    *edit: Those are my immediate thought, anyway. Someone, please correct me if I have misspoken.

    _________________________________
    seth delconte
    http://sqlkeys.com

  • raotor (11/3/2011)


    Hello,

    I am new to SQL and am slowly llearning, so please forgive me if I ask a dumb question.

    I was reading through a couple of articles on the web relating to indexes and where they should be used and why. OK, so I get the idea.

    However, upon conducting some tests using a simple single column query (I can provide the query and underlying table structure if required) on a single table I was perplexed as to why the query ran faster when the table was completely unindexed.

    Here's some stats to illustrate:

    * Unindexed query = 23 seconds

    * With primary key (not on SELECTed column though) took 44 seconds

    * With index on SELECTed column took 32 seconds

    Perhaps I'm missing something here, so would appreciate any help.

    Several things we would need to really answer your question. First the query, DDL for the tables, sample data (all in a readily consumable format; i.e. cut, paste, run). The actual execution plans from your runs of the code against the tables unindexed and indexed.

    Save the actual execution plans as .sqlplan files.

  • Please see second article in lynn's signature. I've seen many times experts are asking to provide detail in that format but most of the people ask questions without much detail. I've already experienced this and providing the data in that format helps even me to understand few bits about my data and can get the best advice from this forum. It looks bit of work for you but at a same time you are asking for help and expected to provide sufficient and relevant information.

  • Lynn Pettis (11/3/2011)Save the actual execution plans as .sqlplan files.

    OK, Here's the query:

    SELECTCOUNT(Price)

    FROMTempSales

    GROUP BY Price

    ORDER BY Price;

    Here's the table:

    CREATE TABLE TempSales

    (

    SalesID INT NOT NULL,

    SalesDate DATE,

    SalesFlag BIT,

    Price FLOAT,

    PriceUSD FLOAT

    )

    I'm afraid as I'm so green with this forum you'll have to tell me the best way to attach the execution plans and sample data you wanted.

  • Read the first two articles I reference below in my signature block.

  • You didn't post your index(es). What are they?

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • In terms of data types, I would not use a FLOAT to define a monetary field. FLOAT is an approximate numeric. You should be using NUMERIC or DECIMAL (You can use MONEY as well, but I prefer to use NUMERIC)

    Money: http://msdn.microsoft.com/en-us/library/ms179882.aspx

    Numeric/Decimal: http://msdn.microsoft.com/en-us/library/ms187746.aspx

    Float/Real: http://msdn.microsoft.com/en-us/library/ms173773.aspx

  • cliffb (11/4/2011)


    In terms of data types, I would not use a FLOAT to define a monetary field. FLOAT is an approximate numeric.

    You'd be amazed at how many businesses want that approximation. Banks, Waste Management companies, etc. I've actually worked with a few of them that say that NUMERIC and MONEY just don't work for them.

    That was back in my early DBA days.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • raotor (11/3/2011)


    OK, Here's the query:

    SELECTCOUNT(Price)

    FROMTempSales

    GROUP BY Price

    ORDER BY Price;

    I'm afraid as I'm so green with this forum you'll have to tell me the best way to attach the execution plans and sample data you wanted.

    Your query also seem a bit strange. Why the "group by" when you are only returning an aggregate?

  • ...........

    I'm afraid as I'm so green with this forum you'll have to tell me the best way to attach the execution plans and sample data you wanted.

    Here's a test rig you may care to play around with...

    --PLEASE NOTE THAT THIS WILL DELETE ANY EXG TABLES WITH SAME NAME IN EXG TEMPDB....please amend if required

    --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    --==== takes under 10 secs to run on PC

    --==== start in safe place!!!

    USE [tempdb]

    GO

    IF Object_id('tempdb..TempSales', 'U') IS NOT NULL

    DROP TABLE tempdb..TempSales;

    SELECT TOP 1000000 ---- NOTE 1 MILLION rows .....

    SalesID = IDENTITY(INT, 1, 1),

    SalesDate = Dateadd(dd, Datediff(dd, 0, ( CAST(Rand(Checksum(Newid())) * 1096 + 40177 AS DATETIME) )), 0),

    SalesFlag = CAST(1 AS BIT),

    --Price = CAST(Rand(Checksum(Newid())) * 9 + 1 AS DECIMAL(8, 4))

    Price = CAST(CAST(Rand(Checksum(Newid())) * 9 + 1 AS DECIMAL(8, 4)) AS FLOAT )

    INTO TempSales

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    ALTER TABLE [dbo].[TempSales] ADD CONSTRAINT [PK_Tempsales] PRIMARY KEY CLUSTERED ([SalesID] ASC)

    GO

    PRINT '-----------UNINDEXED'

    SET STATISTICS TIME ON

    SET STATISTICS IO ON

    SELECT Price, COUNT(Price)

    FROM TempSales

    GROUP BY Price

    ORDER BY Price

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    --- create an index on the price

    CREATE NONCLUSTERED INDEX [IX_PRICE]

    ON [dbo].[TempSales] ( [Price] ASC )

    GO

    PRINT '-----------INDEX ON PRICE'

    SET STATISTICS TIME ON

    SET STATISTICS IO ON

    SELECT Price, COUNT(Price)

    FROM TempSales

    GROUP BY Price

    ORDER BY Price

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Martin Schoombee (11/4/2011)


    raotor (11/3/2011)


    OK, Here's the query:

    SELECTCOUNT(Price)

    FROMTempSales

    GROUP BY Price

    ORDER BY Price;

    I'm afraid as I'm so green with this forum you'll have to tell me the best way to attach the execution plans and sample data you wanted.

    Your query also seem a bit strange. Why the "group by" when you are only returning an aggregate?

    I noticed that, as well. Also, the ORDER BY isn't going to help either. In fact, it's a part of the problem and not the solution.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • cliffb (11/4/2011)


    In terms of data types, I would not use a FLOAT to define a monetary field. FLOAT is an approximate numeric. You should be using NUMERIC or DECIMAL (You can use MONEY as well, but I prefer to use NUMERIC)

    Money: http://msdn.microsoft.com/en-us/library/ms179882.aspx

    Numeric/Decimal: http://msdn.microsoft.com/en-us/library/ms187746.aspx

    Float/Real: http://msdn.microsoft.com/en-us/library/ms173773.aspx

    Thanks for the tip.

    However, although this is slightly off-topic, I decided to use FLOAT instead of MONEY or DECIMAL because of an earlier thread I stumbled upon from this site as I recall that explained through various examples how FLOAT was the only type that didn't lose accuracy when doing things like calculate tax additives then remove tax additives via a percentage calculation. If I recall correclty one poster said that they use FLOAT for all monetary values because of rounding issues with other types.

    Perhaps I've got this wrong, but when I did the example test provided I could see how other types lost accuracy.

  • Martin Schoombee (11/4/2011)[hrYour query also seem a bit strange. Why the "group by" when you are only returning an aggregate?

    Opps! You're right! Apologies for this. I should've also included the 'Price' column as well such that the query returns the count of sales at each price point in price order.

    Revised query should read:

    SELECT COUNT(Price),

  • Martin Schoombee (11/4/2011)[hrYour query also seem a bit strange. Why the "group by" when you are only returning an aggregate?

    Opps! You're right! Apologies for this. I should've also included the 'Price' column as well such that the query returns the count of sales at each price point in price order.

    Revised query should read:

    SELECT COUNT(Price),

    Price

    FROM TempSales

    GROUP BY Price

    ORDER BY Price;

Viewing 15 posts - 1 through 15 (of 55 total)

You must be logged in to reply to this topic. Login to reply