The transient database snapshot for database 'DB1' (database ID 5) has been marked suspect due to an IO operation failure.

  • I have looked all over google regarding 'snapshots' in SQL 2008 and I have been reading up on their function and how they are created and removed as well as where you can review them from SSMS...

    I have NOT found anything of relevance regarding the actual error message. I did find a couple blog posts regarding the error message and DiskKeepper 2010 but that doesn't apply to my situation.

    We have an SAP instance that has several different kind of landscapes, on the particular server in question there is a lot of activity during the period of a DBCC check being performed.

    A lot of Disc I/O is generated...sometimes we get the error message in the subject. Additional executions of the DBCC even when nothing is running will fail.

    once the server is rebooted and the DBCC is started again then it works just fine.

    I am curious if someone has a different view point on the error listed above.

    The error is occurring on a SQL 2008 x64 enterprise edition hosted on windows 2008 enterprise. The servers are all virtual servers on an EMC san

    Any discussion is appreciated.

    Thanks,

    Lee

  • Maybe someone has some info on what a transient database snapshot is and how it is either viewed, created, or deleted...

    I cannot find any information regarding what a 'transient' db snapshot is...

    Any help is appreciated.

    Thanks,

    Lee

  • Hi Lee,

    DBCC CHECKDB uses an internal database snapshot to get a consistent view of the data and not report false errors anymore. You won't be able to see what it is through SSMS or SQLCMD, it is hidden from the user.

    Paul Randal has posted series of articles on all the steps performed by DBCC CHECKDB, they are worth reading:

    http://www.sqlskills.com/blogs/paul/post/CHECKDB-From-Every-Angle-Complete-description-of-all-CHECKDB-stages.aspx

    Hope this helps,

    David B.

    David B.

  • Hi David,

    I had figured as much...based on previous conversations with my co-workers...and I did see Paul's Blog (I visit it frequently)...

    I will read up on that link you posted and see if I can get a grasp on the 'concept'

    I will probably be posting something else as a result after reading ha!

  • Just an aside, it's hard to see a long subject. Please in future post the errors in the body of the post, not the subject.

    A transient DB snapshot is one created by CheckDB and automatically dropped once it completes. It's in the same directory as the data file(s) and it stored in an alternate NTFS stream.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Whoops...sorry about that...will make a mental note in the future.

    As for your comments about the internal snapshot, I have read through Paul's blog and do have an general understanding of how the internal snapshot is utilized...I have also found the following kb article

    http://support.microsoft.com/kb/926070

    Which details quite clearly the probable causes and effects...of our error in the SQL log.

    One thing that does not add up...at least for my situation is...when the integrity check fails...it has generally be during a period of heavy activity (i.e. data refreshes as this is a QA / Testing system)...so that scenario makes sense to me where you have potential for the error to occur...the snapshot would lose out in the resource battle...

    Even after the data refreshes are finished and the system is idle...if the integrity check is re-run, it continues to fail with the same error, there is ample free space on each drive (they are all sized the same drive size and file size) so that to me should rule the disc space concern out.

    In our troubleshooting efforts, we simply rebooted the server and re-ran the integrity check and all worked...it is almost like that 'internal' snapshot was regenerated and THEN it worked just fine...but I can't prove that...

    Easiest thing would be to move the integrity check to an off time where the loads aren't occurring...simple enough...however I would still like to know if there is a 'way' to trigger the recreation of that snapshot to see if that is indeed the case...that all is needed

    OR

    If someone has some additional thoughts on troubleshooting why a DBCC CheckDB would continue to fail during an idle time I am all ears.

    Thanks again guys for the replies!

    Lee

  • TalkToLee (4/2/2010)


    In our troubleshooting efforts, we simply rebooted the server and re-ran the integrity check and all worked...it is almost like that 'internal' snapshot was regenerated and THEN it worked just fine...but I can't prove that...

    Well, I don't have the full error, so just guessing...

    The snapshot is only created when checkDB starts, there's no way to manually create the hidden snapshot, nor can it be manually dropped. iirc you can create a normal snapshot yourself and checkDB that.

    If someone has some additional thoughts on troubleshooting why a DBCC CheckDB would continue to fail during an idle time I am all ears.

    Post the full and complete errors that checkDB returns.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Job History Error Message

    Message

    Microsoft (R) SQL Server Execute Package Utility Version 10.0.2531.0 for 64-bit Copyright (C) Microsoft Corp 1984-2005. All rights reserved. Started: 4:19:36 PM Progress: 2010-03-28 16:19:40.10 Executing query "DECLARE @GUID UNIQUEIDENTIFIER EXECUTE msdb..sp...".: 100% complete End Progress Progress: 2010-03-28 16:19:41.73 Source: Check Database Integrity Executing query "USE [master] ".: 50% complete End Progress Progress: 2010-03-28 16:19:45.25 Source: Check Database Integrity Executing query "DBCC CHECKDB(N'master') WITH NO_INFOMSGS ".: 100% complete End Progress Progress: 2010-03-28 16:19:45.26 Source: Check Database Integrity Executing query "USE [model] ".: 50% complete End Progress Progress: 2010-03-28 16:19:45.82 Source: Check Database Integrity Executing query "DBCC CHECKDB(N'model') WITH NO_INFOMSGS ".: 100% complete End Progress Progress: 2010-03-28 16:19:45.83 Source: Check Database Integrity Executing query "USE [msdb] ".: 50% complete End Progress Progress: 2010-03-28 16:19:47.29 Source: Check Database Integrity Executing query "DBCC CHECKDB(N'msdb') WITH NO_INFOMSGS ".: 100% complete End Progress Progress: 2010-03-28 16:19:47.29 Source: Check Database Integrity Executing query "USE [ET3] ".: 50% complete End Progress Error: 2010-03-28 18:19:24.37 Code: 0xC002F210 Source: Check Database Integrity Execute SQL Task Description: Executing the query "DBCC CHECKDB(N'ET3') WITH NO_INFOMSGS " failed with the following error: "Check terminated. The transient database snapshot for database 'ET3' (database ID 5) has been marked suspect due to an IO operation failure. Refer to the SQL Server error log for details. A severe error occurred on the current command. The results, if any, should be discarded.". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly. End Error Progress: 2010-03-28 18:19:24.39 Source: Check Database Integrity Executing query "USE [DB_UTILS] ".: 50% complete End Progress Progress: 2010-03-28 18:19:25.86 Source: Check Database Integrity Executing query "DBCC CHECKDB(N'DB_UTILS') WITH NO_INFOMSGS ".: 100% complete End Progress DTExec: The package execution returned DTSER_FAILURE (1). Started: 4:19:36 PM Finished: 6:19:29 PM Elapsed: 7193.16 seconds. The package execution failed. The step failed.

    SQL Log Entries (was quite lengthy...I tried to crop out dupliates)

    2010-03-20 13:55:41.15 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a990000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.15 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.15 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.18 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a990000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.18 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.18 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.18 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a990000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.57 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.57 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.57 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a992000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.57 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.57 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.57 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a992000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.57 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.57 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.57 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a992000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    ...

    ...

    ...

    2010-03-20 13:55:41.64 spid109 DBCC CHECKDB (ET3) WITH no_infomsgs executed by sa found 0 errors and repaired 0 errors. Elapsed time: 3 hours 54 minutes 47 seconds. Internal database snapshot has split point LSN = 0000212f:00113232:00b2 and first LSN = 0000212e:003be790:0001. This is an informational message only. No user action is required.

    2010-03-20 13:55:41.64 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.64 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.64 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.65 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    ...

    ...

    ...

    2010-03-20 13:55:41.98 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.98 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.99 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.99 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.99 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:41.99 spid61 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000021ac70000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.99 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:41.99 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:41.99 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.00 spid61 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.00 spid61 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.01 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:42.01 spid61 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000021ac70000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:42.01 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.01 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.01 spid61 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.01 spid61 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.01 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:42.04 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.04 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.07 spid61 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x000000c98ea000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:42.07 spid52 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x0000001a994000 in file 'L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    2010-03-20 13:55:42.07 spid52 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.07 spid52 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

    2010-03-20 13:55:42.07 spid61 Error: 17053, Severity: 16, State: 1.

    2010-03-20 13:55:42.07 spid61 L:\ET3DATA3\ET3DATA3.ndf:MSSQL_DBCC7: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.

  • Lee,

    here is an explanation in PSS blog about 665 errors and dbcc checkdb snapshot errors:

    http://blogs.msdn.com/psssql/archive/2008/07/10/sql-server-reports-operating-system-error-1450-or-1452-or-665-retries.aspx

    It matches your scenario of running CHECKDB with high IO activity. You should schedule CHECDB off-business anyway.

    David B.

    David B.

  • Interesting that the limit on sparse file size reduced from 64GB (Windows 2003) to 16GB (Windows 2008, Vista). There appears to be a QFE for this: http://support.microsoft.com/default.aspx/kb/957065

  • So how are we supposed to check consistency now?

    Closing this as "by design" is quite ridiculous.

    This needs fixing and quickly, database outages with our system literally puts peoples lives at risk, this needs a rapid redesign as it is the most critical tool we have in our box.

    This cannot wait until the net release IT NEEDS FIXING NOW...I'm f***ing livid...By design ffs

  • sql.monkey (8/1/2011)


    So how are we supposed to check consistency now?

    Restore a backup onto a secondary server and checkDB that. It's the recommended approach for large, active databases due to the impact CheckDB has on a database while running.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks...

    That is not always feasible or practical with multi terabyte databases, it requires extra hardware and on very large systems (and ours is approaching 10tb, we already have an active/passive cluster behind it, mirrored san volumes etc.) and it is also requires a huge amount of expense, this is probably not going to happen as it would cost too much and hasn't been budgeted for in this FY.

    Its pretty difficult to justify that much expense to have a system available in case.

  • sql.monkey (8/1/2011)


    That is not always feasible or practical with multi terabyte databases, it requires extra hardware and on very large systems (and ours is approaching 10tb, we already have an active/passive cluster behind it, mirrored san volumes etc.) and it is also requires a huge amount of expense, this is probably not going to happen as it would cost too much and hasn't been budgeted for in this FY.

    So where/how do you test your backups? Typically that's where you'd do checkDB, restore the backup and checkDB it, ensures that the backup is restorable and that the source DB was undamaged at the time of backup

    I've worked with multi-terabyte databases and it's exactly those scenarios where you need to be doing checkDB on a backup simply because of the time it takes and the impact it has on the live database. I couldn't even imagine trying to do a checkDB during normal activity on the banking system I used to work with.

    There are enough tools available to help with this, Redgate's virtual restore and storage compress immediately come to mind.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My 2 cents -

    I believe that the drive should have enough free disk space as Database snapshot creates internal snapshots and if that does not have enough free space, it may not be able to grow.

    Thanks.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply