Alpha character in SPID? Disk I/O Bottleneck Troubleshooting.

  • Hi.

    OS is Server 2003 R2 Enterprise Service Pack 2.

    SQL is Microsoft SQL Server 2005 - 9.00.3353.00.

    I was tasked with troubleshooting the following repeating error:

    Source: spid4s

    Message

    SQL Server has encountered 30 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [C:\Program Files\Microsoft SQL Server\MSSQL.2\MSSQL\DATA\DataBase_data.mdf] in database [DataBase] (8). The OS file handle is 0x00000960. The offset of the latest long I/O is: 0x000000d80d8000

    However, I don't know how to troubleshoot "spid4s".

    What is the alpha character 's' doing in the spid?

    The spid is just the session ID, right?

    Session IDs only have numbers.

    There is a session 4, but it has been sleeping each time the error occurred.

    The error has been logged multiple times while I was watching, always attributed to 'spid4', which I can't even investigate.

    Can anyone shed some light for me?

  • System process.

    It's session_id 4, and the s marks it as a system process. So spid56s would be some system process running on session_id 56.

    p.s. Session_id 4 was the one reporting the error, not the one causing the problem.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
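
The naming rule described in the reply above (a numeric session ID, plus a trailing s when the writer is a system process) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `parse_source` and `LogSource` are hypothetical names, and the pattern covers only `spidN` / `spidNs` values, so any other value in the Source column is rejected.

```python
import re
from typing import NamedTuple


class LogSource(NamedTuple):
    """Decoded form of an error log Source value such as 'spid4s'."""
    session_id: int
    is_system: bool


# Assumed shape: 'spid', digits, then an optional trailing 's'.
_SOURCE_RE = re.compile(r"^spid(\d+)(s?)$", re.IGNORECASE)


def parse_source(source: str) -> LogSource:
    """Split a Source value into its session ID and system-process flag."""
    match = _SOURCE_RE.match(source.strip())
    if match is None:
        # Not a spid-style value; this sketch does not try to interpret it.
        raise ValueError(f"not a spid-style source: {source!r}")
    return LogSource(session_id=int(match.group(1)),
                     is_system=match.group(2) != "")


if __name__ == "__main__":
    print(parse_source("spid4s"))   # system process on session 4
    print(parse_source("spid56"))   # no 's' suffix, so not flagged as system
```

The number is still an ordinary session ID; the suffix only indicates that a system process wrote the log entry.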
  • So, SPID4 is reporting the problem, not necessarily the cause of the problem?

    SPID 4 is Lazy Writer.

    Am I correct in assuming that because of the 'S' in the SPID that SPID 12 'CHECKPOINT' would be reported as '12S'?

    Or is it the fact that it involves a Windows process that invokes the 'S'?

  • Adam_Smasher (3/7/2013)


    So, SPID4 is reporting the problem, not necessarily the cause of the problem?

    Correct.

    The slow I/O message is telling you that in the last hour there were x I/O operations that took over 15 seconds to complete. Since I/O operations should be measured in milliseconds, that's a slight problem. Nothing in the message indicates which processes had slow I/Os, just which file.

    Am I correct in assuming that because of the 'S' in the SPID that SPID 12 'CHECKPOINT' would be reported as '12S'?

    If the checkpoint had reason to write into the error log, the source column would show spid12s.

    Gail Shaw
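
As the reply above notes, the message identifies the file but not the process. A minimal Python sketch of pulling those fields out of the message text follows. It assumes the wording matches the message quoted in the first post; `parse_long_io` and the regular expression are hypothetical helpers, not part of any SQL Server tooling. The page number is derived by dividing the byte offset by the 8 KB SQL Server page size.

```python
import re

PAGE_SIZE = 8192  # SQL Server data pages are 8 KB

# Pattern written against the message text quoted in this thread (assumption:
# other versions or locales may word it differently).
_LONG_IO_RE = re.compile(
    r"encountered\s+(?P<count>\d+)\s+occurrence\(s\)\s+of\s+I/O\s+requests\s+"
    r"taking\s+longer\s+than\s+(?P<seconds>\d+)\s+seconds\s+to\s+complete\s+"
    r"on\s+file\s+\[(?P<file>[^\]]+)\]\s+"
    r"in\s+database\s+\[(?P<database>[^\]]+)\]\s+\((?P<database_id>\d+)\)\.\s+"
    r"The\s+OS\s+file\s+handle\s+is\s+(?P<handle>0x[0-9a-fA-F]+)\.\s+"
    r"The\s+offset\s+of\s+the\s+latest\s+long\s+I/O\s+is:\s+"
    r"(?P<offset>0x[0-9a-fA-F]+)"
)


def parse_long_io(message: str) -> dict:
    """Extract the count, threshold, file, database and offset from the message."""
    match = _LONG_IO_RE.search(message)
    if match is None:
        raise ValueError("message does not look like a long I/O warning")
    offset = int(match.group("offset"), 16)
    return {
        "count": int(match.group("count")),
        "seconds": int(match.group("seconds")),
        "file": match.group("file"),
        "database": match.group("database"),
        "database_id": int(match.group("database_id")),
        "offset": offset,
        # Byte offset within the file divided by the page size.
        "page_number": offset // PAGE_SIZE,
    }
```

For the message in the first post, the offset 0x000000d80d8000 works out to page 442476 of the data file. That narrows down where the latest slow I/O landed, but it still says nothing about which session issued it.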
  • Thanks for the help.

    I just wanted to let everyone know that I am looking to move the most heavily used database and the TempDB to another physical drive.

    I haven't received confirmation from the application analyst when I can do this.

    I will update this thread with the outcome when I can.

  • Make sure you get your physical hardware folks to check out the storage. A RAID controller cache battery going bad can cause the RAID card to disable its [write] cache, which can cause this. Network/FC issues, etc. for non-local storage can also have effects.
