Intermittent timeouts when writing to varbinary(max) columns

  • We have an application that is generating intermittent timeout errors when streaming data to a varbinary(max) column using .WRITE. The problem started to occur once we had a large number of rows in the table (approx. 20,000). The binaries are of various sizes, typically 5KB to 100KB.

    In the profiler we noticed that the statement that causes the timeout performs a very large number of reads (see attached screen shot). The typical "update .write" statements perform about 40 reads.

    We buffer 8040 * 8 bytes before writing, with the 8040 being the recommended multiple for .WRITE calls.

    The timeout occurs at random points; not necessarily the first buffer written nor the last.

    The table definition we are inserting into is as follows:

    CREATE TABLE [Hyper].[File](

    [Id] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF_File_Id] DEFAULT (newid()),

    [FileName] [nvarchar](200) NOT NULL,

    [MimeType] [varchar](100) NOT NULL,

    [Blob] [varbinary](max) NULL,

    [Rowversion] [timestamp] NOT NULL,

    [IsTemporary] [bit] NOT NULL,

    [CreatedDateTime] [datetime] NOT NULL CONSTRAINT [DF_File_CreatedDateTime] DEFAULT (getdate()),

    [UpdatedDateTime] [datetime] NULL,

    CONSTRAINT [PK_File] PRIMARY KEY CLUSTERED

    (

    [Id] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    The statements we use to first initialise the Blob column and then write are:

    exec sp_executesql N'insert into [Hyper].[File] ([FileName], [MimeType], [IsTemporary], [Id])

    values(@FileName, @MimeType, @IsTemporary, @Id)',

    N'@FileName nvarchar(8),@MimeType varchar(10),@IsTemporary bit,@Id uniqueidentifier',@FileName=N'TestFile',@MimeType='text/plain',@IsTemporary=1,@Id='9F9AD1CD-2C5A-DE11-A823-002268DF74EB'

    exec sp_executesql N'UPDATE [Hyper].[File] SET Blob = 0x WHERE Id = @guid',

    N'@guid uniqueidentifier',@guid='9F9AD1CD-2C5A-DE11-A823-002268DF74EB'

    exec sp_executesql N'UPDATE [Hyper].[File] SET Blob .WRITE (@bytes, @offset, @count) WHERE Id = @guid',

    N'@bytes varbinary(6080),@offset int,@count int,@guid uniqueidentifier',@bytes=0xAB6841B5C2C58A.....truncated for brevity.....101BC2A7D638A8DF3339161C4BBC4EA8E4650D,@offset=321600,@count=6080,@guid='9F9AD1CD-2C5A-DE11-A823-002268DF74EB'

    Note: the last statement above is actually the one that caused the timeout in the trace. Thus the offset is not zero.

    So every now and again we experience an "UPDATE .WRITE" statement that causes a very large number of reads.

    Does anyone know what might be happening to cause this?

    How can I trace it to find out?

  • I'd suspect that hte problem is the clustered index on hte GUID.

    Try changing the clustered index to a monotonic value - such as an IDENTITY column

  • Hey, interesting issue.

    Try tracing the Showplan XML Statistics Profile and Showplan Text (Unencoded) events.

    (You can save the XML plans to a separate file by specifying that on the Events Extraction Settings tab.)

    It would be great to see the text and/or XML plans for an example timeout.

    It makes sense to eliminate a simple plan-based cause before we move on to look for more complex causes.

    You will probably need to zip the plans before attaching to a post!

    Cheers,

    Paul

  • Capturing the "Showplan Text (Unencoded)" did show up something interesting. Unfortunately, the trace didn't generate a "Showplan XML Statistics Profile" trace record for the problem statement.

    The problem statement has a read count of 381,814. It appears to be doing a statistics generation on the Blob column.

    Execution Tree

    --------------

    Stream Aggregate(DEFINE: ([Expr1004]=STATMAN([Expr1006])))

    |--Sort(ORDER BY: ([Expr1003] ASC))

    |--Compute Scalar(DEFINE: ([Expr1003]=CONVERT(varbinary(200),substring([Chisholm_ISP].[Hyper].[File].[Blob],(1),(100))+substring([Chisholm_ISP].[Hyper].[File].[Blob],CONVERT_IMPLICIT(int,CASE WHEN len(CONVERT_IMPLICIT(varchar(max),[Chisholm_ISP].[Hyper].[File].[Blob],0))<=(200) THEN (101) ELSE len(CONVERT_IMPLICIT(varchar(max),[Chisholm_ISP].[Hyper].[File].[Blob],0))-(99) END,0),(100)),0), [Expr1006]=CONVERT(varbinary(200),substring([Chisholm_ISP].[Hyper].[File].[Blob],(1),(100))+substring([Chisholm_ISP].[Hyper].[File].[Blob],CONVERT_IMPLICIT(int,CASE WHEN len(CONVERT_IMPLICIT(varchar(max),[Chisholm_ISP].[Hyper].[File].[Blob],0))<=(200) THEN (101) ELSE len(CONVERT_IMPLICIT(varchar(max),[Chisholm_ISP].[Hyper].[File].[Blob],0))-(99) END,0),(100)),0)))

    |--Clustered Index Scan(OBJECT: ([Chisholm_ISP].[Hyper].[File].[PK_File]))

    I'm going to try generating statistics using "NO RECOMPUTE" to see if the problem goes away and, if so, organise for a regen of the statistics in the maintanance plan.

    I'll let you all know how it goes.

    Thanks.

    Glenn.

  • Have you considered enabling asynchronous statistics updates in the database?

    ALTER DATABASE [db name] SET AUTO_UPDATE_STATISTICS_ASYNC ON;

    Paul

  • Thanks Paul, your solution is even better. I'm wondering why this isn't the default setting. The only caveat seems to be that you have to turn it off to go to single user mode, but who uses single user mode anyway?

    Both solutions actually work and we can no longer generate timeout errors when updating this table. I have even ramped up the test harness to 30 concurrent users reading and writing random sized blobs to the table as fast as they can, which is much more than our system needs to support.

    Glenn.

  • Good stuff Glenn.

    Setting asynchronous statistics doesn't prevent a move to single user mode - but the change may be blocked until any in-progress background stats update completes. There are two primary ways to work past this:

    1. Issue ALTER DATABASE x SET SINGLE_USER WITH ROLLBACK IMMEDIATE | AFTER n [SECONDS]

    2. Use KILL STATS JOB and the sys.dm_exec_background_job_queue dynamic view.

    Method one is easier 😉

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply