Why is my SELECT query slower when the table is indexed?

  • Ninja's_RGR'us (11/10/2011)


    Your test env is underpowered.

    Try reducing to 100,000 rows and see if that helps. Going much under that isn't really going to be a good test.

    It runs in 10 secs on my servers as well.

    3 GB ram and 1 cpu. Nothing massively powerful as you can see.

    I will reduce the row count as you suggest.

    However, I am running a 3 GHz dual-core CPU with 4 GB RAM, so I am a little baffled as to why my PC should be so slow while running this particular script. It does appear to be quite snappy with everything else - then again, I suppose I don't have much else to compare it with.

  • raotor (11/10/2011)


    Ninja's_RGR'us (11/10/2011)


    Your test env is underpowered.

    Try reducing to 100,000 rows and see if that helps. Going much under that isn't really going to be a good test.

    It runs in 10 secs on my servers as well.

    3 GB ram and 1 cpu. Nothing massively powerful as you can see.

    I will reduce the row count as you suggest.

    However, I am running a 3 GHz dual-core CPU with 4 GB RAM, so I am a little baffled as to why my PC should be so slow while running this particular script. It does appear to be quite snappy with everything else - then again, I suppose I don't have much else to compare it with.

    Maybe your max ram setting is too low. SQL doesn't have access to what it needs to run this.

    As a PS, you could always run this on another server. I've run it on my prod server and it just flies.

  • Ninja's_RGR'us (11/10/2011)


    raotor (11/10/2011)


    Ninja's_RGR'us (11/10/2011)


    However, I am running a 3 GHz dual-core CPU with 4 GB RAM, so I am a little baffled as to why my PC should be so slow while running this particular script. It does appear to be quite snappy with everything else - then again, I suppose I don't have much else to compare it with.

    Maybe your max ram setting is too low. SQL doesn't have access to what it needs to run this.

    As a PS, you could always run this on another server. I've run it on my prod server and it just flies.

    I believe the maximum amount of RAM configured for use with SS2008R2 is already set as high as it will go (i.e. I cannot increase the approximately 2.7 GB value any further).

    Unfortunately, I do not have any other servers on which to try this as it's just me and my PC 🙂

    Oh well, not to worry. I might just let this script run for a while longer to see if it finishes any time soon - I'm sort of curious now as to how long it might take.

  • Then lower it a little (leave at least 1 GB for Windows). You can't starve the OS either; that's just as bad.
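    If you're not sure where that setting lives, something like this will show and change it (the 2800 MB figure is just an example for a 4 GB box - pick whatever leaves Windows enough headroom):

    -- Show advanced options so 'max server memory' becomes visible
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;

    -- Check the current cap
    EXEC sp_configure 'max server memory (MB)';

    -- Example value only: cap SQL Server at 2800 MB, leaving roughly 1 GB+ for the OS on a 4 GB machine
    EXEC sp_configure 'max server memory (MB)', 2800;
    RECONFIGURE;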

  • raotor (11/10/2011)


    J Livingston SQL (11/10/2011)


    take a look at my sample testrig earlier in this thread.

    I've copied your test script into SSMS and executed it. However, it appears to run extremely slowly. Thus far, the script has been running for 3.5 hours!

    I can see that something is happening as the TempDB log file is slowly growing (around the 750 MB mark after 3.5 hours of execution).

    In addition, I'm not sure that generating data with random price values would sufficiently represent the data I have. The price column has large numbers of duplicate values, which would probably affect the speed of any index using this column ... I think. Something about lack of selectivity, as I recall?

    Then again, I am new to SQL, so I could just be spouting nonsense! 🙂

    I am sorry to hear the testrig is not working for you...it certainly should not take hours!!

    as suggested elsewhere...try running it at say 10K or 100K recs to start with and post back.

    slightly amended version re: your comments about price duplicates...see comments in code

    IF Object_id('tempdb..TempSales', 'U') IS NOT NULL
        DROP TABLE tempdb..TempSales;

    --==== MyPC.... SQL 2008R2 64 Dev...Vista 64... 1.86Ghz...2GB RAM
    --==== sub 10 secs
    SELECT TOP 1000000 ---- NOTE 1 MILLION rows .....
           SalesID   = IDENTITY(INT, 1, 1),
           SalesDate = Dateadd(dd, Datediff(dd, 0, ( CAST(Rand(Checksum(Newid())) * 1096 + 40177 AS DATETIME) )), 0),
           SalesFlag = CAST(1 AS BIT),
           --==== following gives a reasonably well distributed sample of prices between 0 and 99.99
           --==== min number of duplicates is 50 and max is 137...your experience may vary
           Price     = CAST(CAST(Rand(Checksum(Newid())) * 99.99 + 0 AS DECIMAL(8, 2)) AS FLOAT) -- amended to match your sample data
    INTO   TempSales
    FROM   sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    CROSS JOIN sys.all_columns ac3;

    --==== sub 2 secs
    ALTER TABLE [dbo].[TempSales] ADD CONSTRAINT [PK_Tempsales] PRIMARY KEY CLUSTERED ([SalesID] ASC);
    GO

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • raotor (11/3/2011)


    Hello,

    I am new to SQL and am slowly learning, so please forgive me if I ask a dumb question.

    I was reading through a couple of articles on the web relating to indexes and where they should be used and why. OK, so I get the idea.

    However, upon conducting some tests using a simple single column query (I can provide the query and underlying table structure if required) on a single table I was perplexed as to why the query ran faster when the table was completely unindexed.

    Here's some stats to illustrate:

    * Unindexed query = 23 seconds

    * With primary key (not on SELECTed column though) took 44 seconds

    * With index on SELECTed column took 32 seconds

    Perhaps I'm missing something here, so would appreciate any help.

    I did the same test with the script you provided but increased the number of records to 81 million. For unindexed vs. primary key I could not find much difference in performance, so maybe there is something strange going on with your server. But keep in mind that you not only added a primary key which doesn't include the SELECTed column, you actually changed the physical structure of the table. When you create a PK, the server by default creates it as a clustered index, so performance on a heap table versus a table with a clustered index can be different.
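    (If you want to separate those two effects, you could - purely as an experiment - declare the PK as nonclustered so the table stays a heap; the constraint name below is just made up:)

    -- Experiment only: a nonclustered PK leaves the table as a heap
    ALTER TABLE dbo.TempSales ADD CONSTRAINT PK_TempSales_NC PRIMARY KEY NONCLUSTERED (SalesID);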

    When I compared unindexed vs. a non-clustered index, I found some very interesting SQL Server behavior.

    After you created the NC index, the table itself became cached in the buffer pool (because the server scans it to create the index). Then when you run the query, it uses only the NC index to produce the result, not the heap table. But if you don't have enough memory available to put this index in the cache, then each time you run the query there will be physical reads. At this point memory is already taken by the heap table in the cache, and believe it or not, the server doesn't want to release this memory even if there is memory pressure. This happens only with a heap table; it starts to work better once you create a clustered index on the table.

    Try creating an index on the Price column, run DBCC DROPCLEANBUFFERS, and then execute your query a few times; you should be able to see different results.

    I am going to investigate this more and will share the details later on.
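    Roughly what I mean is something like this (the index name and the WHERE range are just examples against the TempSales test table from earlier in the thread):

    -- Sketch only: index the Price column, then time the query from a cold cache
    CREATE NONCLUSTERED INDEX IX_TempSales_Price ON dbo.TempSales (Price);

    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;   -- clears the buffer pool; only do this on a test box

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    SELECT Price FROM dbo.TempSales WHERE Price BETWEEN 10.00 AND 10.50;  -- substitute your own query

    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;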


    Alex Suprun

  • Alexander Suprun (11/10/2011)


    raotor (11/3/2011)


    Hello,

    I am new to SQL and am slowly learning, so please forgive me if I ask a dumb question.

    I was reading through a couple of articles on the web relating to indexes and where they should be used and why. OK, so I get the idea.

    However, upon conducting some tests using a simple single column query (I can provide the query and underlying table structure if required) on a single table I was perplexed as to why the query ran faster when the table was completely unindexed.

    Here's some stats to illustrate:

    * Unindexed query = 23 seconds

    * With primary key (not on SELECTed column though) took 44 seconds

    * With index on SELECTed column took 32 seconds

    Perhaps I'm missing something here, so would appreciate any help.

    I did the same test with the script you provided but increased the number of records to 81 million. For unindexed vs. primary key I could not find much difference in performance, so maybe there is something strange going on with your server. But keep in mind that you not only added a primary key which doesn't include the SELECTed column, you actually changed the physical structure of the table. When you create a PK, the server by default creates it as a clustered index, so performance on a heap table versus a table with a clustered index can be different.

    When I compared unindexed vs. a non-clustered index, I found some very interesting SQL Server behavior.

    After you created the NC index, the table itself became cached in the buffer pool (because the server scans it to create the index). Then when you run the query, it uses only the NC index to produce the result, not the heap table. But if you don't have enough memory available to put this index in the cache, then each time you run the query there will be physical reads. At this point memory is already taken by the heap table in the cache, and believe it or not, the server doesn't want to release this memory even if there is memory pressure. This happens only with a heap table; it starts to work better once you create a clustered index on the table.

    Try creating an index on the Price column, run DBCC DROPCLEANBUFFERS, and then execute your query a few times; you should be able to see different results.

    I am going to investigate this more and will share the details later on.

    I tried to recreate this scenario with other test tables, but it looks like it doesn't work the way I described earlier. I'm totally confused. :crazy: Maybe there are some variables I didn't take into account. So for now, just ignore my explanations and simply run DBCC DROPCLEANBUFFERS before each of your tests, and please post the results.


    Alex Suprun

  • raotor (11/10/2011)


    I've copied your test script into SSMS and executed it. However, it appears to run extremely slowly. Thus far, the script has been running for 3.5 hours!

    I can see that something is happening as the TempDB log file is slowly growing (around the 750 MB mark after 3.5 hours of execution).

    That would seem to indicate that you never set up the growth pattern or initial size for TempDB. If you left the defaults, it temporarily stopped running the test-rig script approximately 750,000 times, grew the TempDB log file, and then went back to it. It would do nearly the same thing to the TempDB data file at the end of the implicit transaction.

    Before you do another thing with test scripts of this size, set up the TempDB data file "Initial Size" to 1,000 MB (1 GB) and the log file "Initial Size" to 250 or 500 MB, and set the growth for both to 100 MB. Then, either "bounce" the SQL Service or reboot your PC.

    You should do the same thing if you're testing in another database.

    Also, make sure that your virus scanner, whatever it is, is NOT checking *.MDF, *.NDF, or *.LDF files.

    Please see the articles at the following links for how to configure TempDB and why.

    http://msdn.microsoft.com/en-us/library/ms345368.aspx

    http://technet.microsoft.com/en-us/library/cc966545.aspx

    There's also an article by Paul Randal on optimizing the growth settings, but let's peel one potato at a time. ;-)
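    Something along these lines will do it; the logical file names below (tempdev and templog) are the usual defaults, so check yours first:

    -- Check the logical file names and current sizes for TempDB
    SELECT name, physical_name, size FROM sys.master_files WHERE database_id = DB_ID('tempdb');

    -- Set the initial sizes and a fixed 100 MB growth increment (adjust the names if yours differ)
    ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 1000MB, FILEGROWTH = 100MB);
    ALTER DATABASE tempdb MODIFY FILE (NAME = templog, SIZE = 500MB,  FILEGROWTH = 100MB);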

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • J Livingston SQL (11/10/2011)


    raotor (11/10/2011)


    J Livingston SQL (11/10/2011)


    I am sorry to hear the testrig is not working for you...it certainly should not take hours!!

    as suggested elsewhere...try running it at say 10K or 100K recs to start with and post back.

    slightly amended version re: your comments about price duplicates...see comments in code

    Thanks for the updated script. I will give it a go just as soon as I can stop the previous run from executing.

    Despite clicking "Cancel Execution" when initially running the original script you provided, there wasn't the usual message informing me that the query execution had been cancelled.

    After rebooting my PC and using the DBCC OPENTRAN command I get the following result:

    Transaction information for database 'tempdb'.
    Oldest active transaction:
        SPID (server process ID): 54
        UID (user ID)           : -1
        Name                    : SELECT INTO
        LSN                     : (24:84:56)
        Start time              : Nov 10 2011 1:19:49:327PM
        SID                     : 0x010500000000000515000000fd77b1569e407e14828ba628eb030000
    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    Now, I've looked into this and found a "KILL" command and judging by the books online documentation I issued the following command based on the output above:

    KILL 54

    Upon doing this the following message is returned:

    SPID 54: transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.

    Despite this, I can see that hours later the transaction is apparently still running (I can see that the TempDBLog file is slowly growing) and DBCC OPENTRAN still reports the same situation as before.

    Could you please offer any suggestions to help?

    Thanks once again.

  • The transaction will continue to run until it's completely rolled back. Rollbacks can take a while, just like the initial execution. It depends on how far the initial transaction went. This is normal, sorry to say.

    Now people will recommend stopping the service as a way of killing the transaction, which will do it. But I do not recommend doing so as it can cause you more harm than good. Be patient. Wait it out.
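    One thing you can do while you wait: run KILL with the STATUSONLY option against that SPID (54, from your DBCC OPENTRAN output). It doesn't kill anything again, it just reports how far the rollback has progressed.

    -- Reports estimated rollback completion for a session that is already rolling back
    KILL 54 WITH STATUSONLY;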

    Brandie Tarvin, MCITP Database Administrator
    LiveJournal Blog: http://brandietarvin.livejournal.com/
    On LinkedIn!, Google+, and Twitter.
    Freelance Writer: Shadowrun
    Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Brandie Tarvin (11/11/2011)


    The transaction will continue to run until it's completely rolled back. Rollbacks can take a while, just like the initial execution. It depends on how far the initial transaction went. This is normal, sorry to say.

    Now people will recommend stopping the service as a way of killing the transaction, which will do it. But I do not recommend doing so as it can cause you more harm than good. Be patient. Wait it out.

    Hmmmm ... I see. So the KILL command I executed did actually work then?

    I've rebooted my PC - so isn't that equivalent to stopping the service?

    Also, as I am a total newbie to SQL, I am somewhat confused about the fact that I cancelled the script after it had been running for approximately 54.5 hours and yet the "rollback" process has so far been running for over twice that time. I would've thought that the rollback would take around the same time to revert to the pre-transaction state as it did to reach the point at which the transaction was cancelled?

    Also, and again please forgive my ignorance here, but I am puzzled as to why the TempDB Log file is still growing beyond the size at which the transaction was cancelled. I thought a rollback would undo the transactions logged so far and not generate additional ones?

  • J Livingston SQL (11/10/2011)


    raotor (11/10/2011)


    J Livingston SQL (11/10/2011)


    I am sorry to hear the testrig is not working for you...it certainly should not take hours!!

    as suggested elsewhere...try running it at say 10K or 100K recs to start with and post back.

    slightly amended version re: your comments about price duplicates...see comments in code

    Update:

    After digesting several suggestions in this thread I have subsequently re-run your test rig script.

    This time around the script executed in 6 seconds!

    I made the following changes to my SS config.

    * Reduced the maximum amount of memory available to SQL Server to 1024 MB (1 GB).

    * Increased the KB allocated for query size from 1024 to 5128.

    * Altered the TempDB size to 1 GB with an auto-growth of 256 MB.

    * Altered the TempDB log file size to 512 MB with an auto-growth of 128 MB.

    Now, I'm not sure which (if any) of the above might have fixed the performance issue with respect to your script, but something did.

  • good to hear you have the testrig sorted now.

    so back to your original post

    However, upon conducting some tests using a simple single column query (I can provide the query and underlying table structure if required) on a single table I was perplexed as to why the query ran faster when the table was completely unindexed.

    Here's some stats to illustrate:

    * Unindexed query = 23 seconds

    * With primary key (not on SELECTed column though) took 44 seconds

    * With index on SELECTed column took 32 seconds

    what figures are you returning from the testrig?
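    if it helps, something along these lines will give comparable timings for each case (the WHERE range is only an example - substitute the query you were actually testing):

    -- Sketch: capture figures from a cold cache for each scenario (heap, clustered PK, index on Price)
    DBCC DROPCLEANBUFFERS;   -- test box only
    SET STATISTICS TIME ON;

    SELECT Price FROM dbo.TempSales WHERE Price BETWEEN 25.00 AND 25.10;  -- example predicate only

    SET STATISTICS TIME OFF;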

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • raotor (11/11/2011)


    Brandie Tarvin (11/11/2011)


    The transaction will continue to run until it's completely rolled back. Rollbacks can take a while, just like the initial execution. It depends on how far the initial transaction went. This is normal, sorry to say.

    Now people will recommend stopping the service as a way of killing the transaction, which will do it. But I do not recommend doing so as it can cause you more harm than good. Be patient. Wait it out.

    Hmmmm ... I see. So the KILL command I executed did actually work then?

    Probably.

    I've rebooted my PC - so isn't that equivalent to stopping the service?

    Yes, but that's a bad way to stop a service. In fact, rebooting your PC in the middle of anything should always be a solution of last resort. You can lose a lot more than SQL Server by doing that.

    Also, as I am a total newbie to SQL, I am somewhat confused about the fact that I cancelled the script after it had been running for approximately 54.5 hours and yet the "rollback" process has so far been running for over twice that time. I would've thought that the rollback would take around the same time to revert to the pre-transaction state as it did to reach the point at which the transaction was cancelled?

    When it comes to SQL Server, never assume. The kill command does take time to process along with all the other commands. It doesn't necessarily happen instantly.

    Also, and again please forgive my ignorance here, but I am puzzled as to why the TempDB Log file is still growing beyond the size at which the transaction was cancelled. I thought a rollback would undo the transactions logged so far and not generate additional ones?

    Nope. The transaction log has nothing to do with this. The log file is recording everything that happens. Though, given that TempDB is usually set to SIMPLE recovery mode, it's not actually logging every single transaction detail so much as it is logging the affected extents. The rollback itself, if I'm not mistaken, is happening in the data file.

    Of course, to verify that I haven't completely misled you, I'm going to tag an expert on these issues and ask her to double-check my post for errors.
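    In the meantime, if you want a better view of what that log file is doing than watching it grow on disk, DBCC SQLPERF(LOGSPACE) shows the size and percentage used of every database's log, TempDB included:

    -- Log file size and percent used for all databases
    DBCC SQLPERF(LOGSPACE);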

    Brandie Tarvin, MCITP Database Administrator
    LiveJournal Blog: http://brandietarvin.livejournal.com/
    On LinkedIn!, Google+, and Twitter.
    Freelance Writer: Shadowrun
    Latchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • raotor (11/11/2011)


    I would've thought that the rollback would take around the same time to revert to the pre-transaction state as it did to reach the point at which the transaction was cancelled?

    Rollbacks are more work than the original operation. The original operation just has to make the change. The rollback has to read the tran log, locate the original change, generate compensating operations and then log and perform those. Rollbacks almost always take longer than the original operation.

    Also, and again please forgive my ignorance here, but I am puzzled as to why the TempDB Log file is still growing beyond the size at which the transaction was cancelled. I thought a rollback would undo the transactions logged so far and not generate additional ones?

    Yes, it is undoing the transactions. And to do so, it needs to reverse the changes made to the DB. Any and all changes to the DB are logged, including the undo operations.
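    If you're curious, the undocumented fn_dblog function lets you watch those compensation records pile up - treat it as a curiosity on a test box, not something to build on:

    -- Undocumented, test box only: count log records per operation in the current database
    USE tempdb;
    SELECT [Operation], COUNT(*) AS LogRecords
    FROM fn_dblog(NULL, NULL)
    GROUP BY [Operation]
    ORDER BY LogRecords DESC;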

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
