IO performance issues

  • Hello everyone, I have a server with 4 drives. SQL Server is on C, logs on D, indexes on E, and data on F.

    For the last two days, one of my users has been getting very slow response from his procedure. He has been running this procedure for months, and it takes 3 hours on average; it has now been running for over 40 hours. There is a lot of CPU activity, but the IO is minimal. I ran Performance Monitor and watched PhysicalDisk > Avg. Disk Read Queue Length and Avg. Disk Write Queue Length for each of the drives. I see that at all times one or more drives have a very high queue (if the line is up, that means there is more in the queue, right?)

    Sometimes it is the C drive that has a high queue, sometimes the log drive (but not usually), and sometimes the index and/or data drives.

    Am I interpreting the graph correctly? What can I do to speed up IO? Do you think the IO will get better if I add more drives, spread the data around, and move the tempdb files to their own drive?

    What gets me is that one day the process ran in approximately 3 hours, and the next day the problems started and it has been running for 40+ hours. There were no major changes, just the usual data inserts and updates (about 2,000 inserts a night).

    Drive C has 11 GB free with 6 GB used. Drive D has 1 GB free with 16 GB used.

    Drive E has 6 GB free with 10 GB used.

    Drive F has 32 GB free with 36 GB used.

    Any help is very much appreciated.

  • What does the execution plan show?

    Do you update statistics or rebuild indexes?

    Are these drives single drives or raid arrays?

    Open Task Manager and add the columns for IO, page faults, etc.

    Which processes hold the top IO? Page faults? Is virtual memory being used, and by what?

    The most common causes of what you're describing are virtual memory swapping, page splits in indexes, or other processes keeping your server IO-bound. With the information from above, you should be able to identify or eliminate these as possibilities.
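    To check the page-split/fragmentation possibility, something like this works on SQL Server 7.0/2000 (a sketch; substitute your own table name — sepNOBO_AnalysisTable is the one that appears later in this thread):

    ```sql
    -- DBCC SHOWCONTIG reports scan density and logical fragmentation;
    -- low scan density suggests heavy page splitting.
    DBCC SHOWCONTIG ('sepNOBO_AnalysisTable')

    -- If fragmentation is high, rebuild the table's indexes (SQL 2000 syntax):
    DBCC DBREINDEX ('sepNOBO_AnalysisTable')
    ```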

  • Run sp_who and see what is happening on SPIDs 1-5. A while back I started getting a BACKUP LOG on tempdb and my server screeched, not quite to a halt, but it seemed like it. It ultimately turned out my tempdb was doing this because of the space remaining on the drive where it is located; it needed to grow. Fortunately, I had 30 GB I could free up of stuff I wanted to be rid of anyway, and I have not had an issue since. Sometimes other processes can be the root of your issues.
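    The checks above look roughly like this (standard SQL 2000 commands; a sketch):

    ```sql
    -- See what every connection, including system SPIDs 1-5, is doing:
    EXEC sp_who

    -- Log space used and free per database, tempdb included:
    DBCC SQLPERF (LOGSPACE)

    -- Data file sizes for tempdb (size is in 8 KB pages):
    USE tempdb
    SELECT name, size, filename FROM sysfiles
    ```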

  • Not criticising your code, but in my experience anything that takes 3 hours to run can be improved upon, and if there are improvements to be made, it generally means that the optimiser is not getting a good plan.

    What can result, because of this, is that a small change can cause the bad plan to become really, really bad, and thus take 40+ hours to run.

    Are you on 7 or 2000? With 7 there is a known issue with parallelism that can cause this.

    What does the SP do?

    Simon Sabin
    SQL Server MVP
    Co-author of SQL Server 2000 XML Distilled

    http://www.amazon.co.uk/exec/obidos/ASIN/1904347088
    http://sqlblogcasts.com/blogs/simons

  • What should I look for in SPIDs 1-5? Also, if tempdb is growing and taking up the whole drive, can't I just look at the free space remaining through Explorer? I hope that this is the problem; it would be an easy fix.

    Simon, you are right. I looked at the execution plan and found that there is a table scan. I checked the table; there is no index on it. I have created several indexes and tested the script, but the performance doesn't pick up. I created several composite and single-column non-clustered indexes, but SQL Server didn't want to use them. I put a hint in the statement to use the index, and SQL Server did use it, but then the plan showed a bookmark lookup step, so my user's query runs just as slowly as before.

    I also created clustered indexes, once using the PK column and once using the column in the SELECT statement's WHERE clause. Same performance there too.

    Here is the T-SQL; can you help? If you want, tell me which indexes to add and I will add them and post the execution plan. Also, to let you know, the plan shows that the sub-procedure does not hurt the performance; it's the SELECT statement in the cursor that is the problem.

    ---------------------------------------------------------------------
    -- GUID assignment pass 1
    ---------------------------------------------------------------------
    SET QUOTED_IDENTIFIER OFF
    GO
    SET ANSI_NULLS ON
    GO

    -- set-up
    declare @result int, @ierror int, @irowcount int

    -- counters
    declare @iAssignedUPID int, @iNewUPID int
    select @iAssignedUPID = 0, @iNewUPID = 0

    -- Iterate through all non-guid-assigned records, try to find guids, or assign new ones as needed
    declare @foundUPID int
    declare @pk int, @pFirstName varchar(40), @pLastName varchar(40),
        @N_ADDRESS1 varchar(40), @N_ADDRESS2 varchar(40), @N_HOUSENUM varchar(11),
        @N_PREDIR varchar(2), @N_STREET varchar(40), @N_STRSUF varchar(11),
        @N_POSTDIR varchar(2), @N_UNIT varchar(11), @N_UNITDES varchar(4),
        @N_CITY varchar(30), @N_STATE varchar(2),
        @N_ZIP varchar(10), @N_ZIP4 varchar(8), @N_ZIP10 varchar(18),
        @N_COUNTY varchar(44), @N_COUNTRY varchar(50), @N_MATCHCODE char(4),
        @emailaddress varchar(60),
        @LFIRM char(6), @LUSER char(10), @crSource char(3), @crChannel char(3),
        @crCellCode varchar(16), @crListCode varchar(16), @returnedUPID int

    set nocount on

    -- set up a cursor with all the records that do not have a GUID
    declare newUPID_cursor cursor for
        select PK, mf_firstName, mf_lastName, dp_addr1, dp_addr2, dp_housenum,
            dp_predir, dp_street, dp_strsuf,
            dp_postdir, dp_unit, dp_unitdes,
            dp_city, dp_state,
            dp_zip, dp_zip4, dp_zip10,
            dp_county, '', dp_mcode,
            '', ''
        from sepNOBO_AnalysisTable
        where a_UPIDonDPUL = -1
        for update of a_UPIDonDPUL

    open newUPID_cursor
    fetch next from newUPID_cursor
        into @pk, @pFirstName, @pLastName, @N_ADDRESS1, @N_ADDRESS2, @N_HOUSENUM,
            @N_PREDIR, @N_STREET, @N_STRSUF,
            @N_POSTDIR, @N_UNIT, @N_UNITDES,
            @N_CITY, @N_STATE,
            @N_ZIP, @N_ZIP4, @N_ZIP10,
            @N_COUNTY, @N_COUNTRY, @N_MATCHCODE,
            @emailaddress, @crListCode

    -- iterate through all of them and attempt to find guids, or assign new ones when needed
    while (@@fetch_status = 0)
    begin
        -- attempt to get a valid UPID from dpUL
        exec @result = dbo.findUPIDMatch2 @pFirstName, @pLastName, @N_HOUSENUM, @N_PREDIR,
            @N_STREET, @N_STRSUF, @N_POSTDIR, @N_UNIT,
            @N_ZIP, @N_ZIP4, @emailaddress, @foundUPID output

        if @foundUPID is not null
        -- if a valid UPID is found, use it; but also check to see if dupes were available
        begin
            update sepNOBO_AnalysisTable
            set a_UPIDonDPUL = @foundUPID
            where current of newUPID_cursor

            set @iAssignedUPID = @iAssignedUPID + 1
        end
        else
        -- if no valid UPID is found, simply assign upid -2 to the table
        begin
            set @crSource = ''
            set @crChannel = ''
            set @crCellCode = ''

            update sepNOBO_AnalysisTable
            set a_UPIDonDPUL = -2
            where current of newUPID_cursor

            set @iNewUPID = @iNewUPID + 1
        end

        -- get the next candidate record in the set
        fetch next from newUPID_cursor
            into @pk, @pFirstName, @pLastName, @N_ADDRESS1, @N_ADDRESS2, @N_HOUSENUM,
                @N_PREDIR, @N_STREET, @N_STRSUF,
                @N_POSTDIR, @N_UNIT, @N_UNITDES,
                @N_CITY, @N_STATE,
                @N_ZIP, @N_ZIP4, @N_ZIP10,
                @N_COUNTY, @N_COUNTRY, @N_MATCHCODE,
                @emailaddress, @crListCode
    end

    close newUPID_cursor
    deallocate newUPID_cursor
    set nocount off

    Thanks everyone, really appreciate it.
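    For reference, the kind of index I tried was along these lines (a sketch, not the exact DDL; the index name is made up), plus checking the plan with SHOWPLAN:

    ```sql
    -- Non-clustered index on the cursor's filter column; whether the
    -- optimiser uses it depends on how selective a_UPIDonDPUL = -1 is.
    CREATE NONCLUSTERED INDEX IX_sepNOBO_UPIDonDPUL
        ON sepNOBO_AnalysisTable (a_UPIDonDPUL)

    -- Show the estimated plan as text to verify whether the index is chosen:
    SET SHOWPLAN_TEXT ON
    GO
    SELECT PK FROM sepNOBO_AnalysisTable WHERE a_UPIDonDPUL = -1
    GO
    SET SHOWPLAN_TEXT OFF
    GO
    ```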

  • All I can say is CURSOR. What does findUPIDMatch2 do?

    Firstly, I hate cursors so I don't use them much. I might be wrong in these statements, but I would do this:

    Don't use a cursor; do a SELECT TOP 1 where you do the fetch.

    Better yet, try to replace it all with a set-based UPDATE statement.

    Simon Sabin
    SQL Server MVP
    http://sqlblogcasts.com/blogs/simons

  • Simon, I am confused about your instruction below:

    "Don't use cursor do a SELECT TOP 1 where you do the fetch."

    Why do a select top 1?

    Also, what is a set based update? Sorry, I am new to sql server and have not spent much time in programming yet.

    The match2 procedure tries to find out if the person already exists in our database. It truncates a staging table and tries to match the information that was passed to it against rows in our main table. If it finds some matches, it inserts them into the staging table. Then it does some more testing (only a few simple checks) to see if one or more of the rows in the staging table are for sure the same person. It then returns success or failure depending on whether it found an exact match.

    A set-based process handles all the data in a set at once.

    e.g.

    UPDATE MyTable
    SET guid = MyMainTable.NewGuid,
        othercolumn = MyMainTable.otherColumn
    FROM MyTable
    LEFT JOIN MyMainTable ON MyMainTable.name = MyTable.name
    WHERE a_UPIDonDPUL = -1

    So this updates all your records in one go (obviously it needs to be more complex).

    What I mean by using SELECT TOP 1 is that if you remove the cursor logic and put a SELECT TOP 1 statement where the fetches were, it will do exactly what you have now, without the use of a cursor. In addition, use @@ROWCOUNT rather than @@FETCH_STATUS.

    Simon Sabin
    SQL Server MVP
    http://sqlblogcasts.com/blogs/simons
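    A very rough, hypothetical sketch of what a set-based version of the cursor loop could look like. The join conditions here are invented for illustration only; the real ones would have to come from whatever dbo.findUPIDMatch2 actually does, and dpUL's column names are assumptions:

    ```sql
    -- Pass 1: assign found UPIDs in one statement. The join columns
    -- (FirstName, LastName, Zip) are placeholders for the real match logic.
    UPDATE a
    SET a_UPIDonDPUL = m.UPID
    FROM sepNOBO_AnalysisTable a
    JOIN dpUL m
        ON  m.FirstName = a.mf_firstName
        AND m.LastName  = a.mf_lastName
        AND m.Zip       = a.dp_zip
    WHERE a.a_UPIDonDPUL = -1

    -- Pass 2 (must run after pass 1): anything still unmatched gets the
    -- -2 marker, as in the cursor version.
    UPDATE sepNOBO_AnalysisTable
    SET a_UPIDonDPUL = -2
    WHERE a_UPIDonDPUL = -1
    ```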

  • great call, simonsabin.

    ALWAYS choose set-based over a cursor, if possible. In my opinion, the time to use a cursor is when NO other approach will work, as SQL Server LIKES to be a set-based creature.

    A bit of history that I am not 100% sure on: CURSOR support was only kept in SQL Server to maintain ANSI compliance. It was not the biggest priority, and thus performance is very, very poor, especially compared with Oracle, where cursors are (if I remember correctly) really quite good.

    Simon Sabin
    SQL Server MVP
    http://sqlblogcasts.com/blogs/simons

  • My understanding is that most large SQL transactions in Oracle use cursors in their processing under the hood.

  • I have had a similar issue with a stored proc. The procedure ran in under 10 seconds, and it ran fine for over 18 months. One day it stopped performing, took over the resources of the server, and ran for over an hour before we killed it.

    It took us half a day to figure out what was wrong (including having MS on-site).

    Final resolution:

    We changed one input parameter from varchar to char, ran sp_recompile, changed the parameter back to varchar, redid the sp_recompile, and BOOM, it started working perfectly again.

    However... in your case it looks like a MAJOR rewrite: get rid of the CURSOR and possibly farm some parts out to other SPs for processing.

    Good Luck,

    AJ Ahrens
    SQL DBA
    Custom Billing AT&T Labs
    webmaster@kritter.net
  • Actually, cursors were extremely bad in SQL 6.5 (I don't know about 7.0), but they perform very well in SQL 2000. In at least some of our routines, the CURSOR out-performed a SELECT TOP 1 by about 10%. That said, I personally prefer the set-based routine unless you absolutely MUST deal with a single row at a time.

    Another thing that will catch you off guard is the use of database constraints and triggers. If your updates always go through a stored procedure, performance improves greatly by having your own code verify foreign keys or column constraint values rather than leaving that work to the database engine.

    Guarddata

  • The other guys are right!

    Don't use the cursor!

    Having seen your T-SQL, I can tell that there will effectively be no index usage on sepNOBO_AnalysisTable (a_UPIDonDPUL) because of the poor selectivity; there will be a table scan.

    In my tests, running a cursor versus a set-based select/update can be at least 10 times slower. If in addition you are running a procedure on each row, then it's normal to have 3 hours of runtime.

    With a set-based approach you could finish this job within a few minutes (especially if you add a table lock hint to your update statement).

    If you give me (by email if you want) the DDL of your tables and the dbo.findUPIDMatch2 procedure, maybe we could help you find a better way to do the job.



    Bye
    Gabor
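    The table lock hint mentioned above looks like this (a sketch against the table from this thread): taking one table lock up front avoids escalating thousands of row locks during a big update.

    ```sql
    -- TABLOCK takes a single table-level lock for the whole statement:
    UPDATE sepNOBO_AnalysisTable WITH (TABLOCK)
    SET a_UPIDonDPUL = -2
    WHERE a_UPIDonDPUL = -1
    ```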

  • I share the consensus with everyone else regarding the use of cursors. The only time I really use cursors is when I am manually handling data; it's rare to see a piece of production work from me that uses cursors.

    Aside from rewriting the query, there are a few external factors you could look at. Is it possible that your server is out of memory and busy swapping? This is a killer.

    The other thing is a personal pet peeve of mine: make sure you turn off document indexing. With NT I had to deal with people installing Office on my servers, which in turn installs FindFast and sets it up to run (FindFast runs forever and does nothing but consume IOs). After eliminating this, I found out that the indexing feature included in Win2k and XP (possibly .NET Server also) is the same sort of thing as FindFast. Unless you are also using your server as a file server, MAKE SURE you have the Indexing Service disabled. You can either disable the service entirely (through Control Panel, as with other services) or through the right-click properties on each individual drive.


    -Ken
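    From a Windows 2000/XP command prompt, disabling the Indexing Service can be done like this (cisvc is its service name; run as an administrator):

    ```shell
    :: Stop the Indexing Service and prevent it from starting again:
    net stop cisvc
    sc config cisvc start= disabled
    ```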
