Improving Text and Image Column Performance

,

Problem

When I first developed the FAQ area on

SQLServerCentral.com, it performed wonderfully. After a number of large FAQs

began to be entered in though and as the library grew to over a few hundred

entries, I noticed that the web page would some times display much slower and

often timeout. I tried everything to remedy like more aggressive caching and

query hints. None worked and the frustrated readers continued to e-mail me about

the FAQ area and the Script areas slowness..

Solution

Since we're at a hosting provider, we're limited to what we can do

hardware-wise to remedy this problem. For example, we can't move the text and

image columns to a different file group on a different drive array. So, I

decided to try the new (not so new anymore) TEXT IN ROW feature that ships with

all editions of SQL Server 2000. To understand this feature, you must first

understand how text, ntext and image columns work. Take for example the below

abridged schema:

CREATE TABLE [FAQ2] (

[FAQID] [int] IDENTITY (1, 1) NOT NULL ,

[TITLE] [varchar] (200),

[AUTHORID] [int] NULL ,

[RATING] [numeric](18, 4) NULL,

[FAQBODY] [text] NOT NULL)

Once this table is created, notice how in the Table Properties screen below

how it shows its size as 16 bytes. Text and image columns can store up to 2 GB

of data though. So, why the disparity? First you must remember the total amount

of space that can be stored on a data page in SQL Server, which is 8060 bytes.

Hardly enough room for a 2GB text field! So, instead of storing the text or

image column on the same data page, it simply stores a 16 byte pointer to where

it can be found. This is sometimes why text and image columns perform slower

than a varchar(2000). In our varchar(2000) example, SQL Server never needs to

leave the data page to find its data and perform reads much quicker than with a

text column.

TEXT IN ROW to the rescue! The TEXT IN ROW feature allows you to store

smaller amounts of text and image data onto the data page. The great thing about

this feature is it lets you have your cake and eat it to. It will store data

less than a preset amount on the data page and anything above is still done the

traditional way. It's a great way of solving the issue of an analyst coming to

you and saying they "may" someday need all those text columns, even though

they're only using 200 characters at the most.

To enable the feature, use the SP_TABLEOPTION system stored procedure and

specify the table as shown below:

SP_TABLEOPTION 'FAQ2',

'TEXT IN ROW', 'ON'

Now the feature is turned on and has defaulted to anything less than or equal

to 256 bytes will be stored on the data page. This 256 byte setting is the

optimal setting for the feature. You may want to set this higher nonetheless.

For example, I noticed by running the below query that the average column was

about 600 bytes.

SELECT AVG(DATALENGTH(FAQBODY)) FROM FAQ2

I also determined from the following query that the maximum that this column

was storing was 9,243, a little too much for a varchar field.

SELECT MAX(DATALENGTH(FAQBODY)) FROM FAQ2

So I took the average column length (600) and decided to store all columns

equal to or less than 600 bytes in the data row. You can do this by using the

SP_TABLEOPTION syntax again as shown below:

SP_TABLEOPTION 'FAQ2',

'TEXT IN ROW', '600'

As soon as you do enable, you will see no effect until new rows are inserted

into the table. Most people can't wait this long however so the workaround would

be to run a quick UPDATE statement like the one shown below to move the eligible

items back to the data page:

UPDATE FAQ2

SET FAQBODY = FAQBODY

Any new items are automatically moved if their eligible. You will also know

when you go to Table Properties (shown below), that the table shows the FAQBODY

text column as 600 bytes. Data stored in the data page operates much like a

varchar field. For example, if you try to insert 280 characters of data into the

FAQBODY column, only 280 bytes of space will be used in the row. Anything

greater than 600 bytes stores the 16 byte pointer in the row and the remainder

is stored off the data page.

When you turn the option off by using the below syntax, it may take a long

time for this to finish as it has to move all the data back off the data row.

This is a logged operation so your transaction log may fill up quickly and your

table will be locked while this is occuring.

SP_TABLEOPTION 'FAQ2',

'TEXT IN ROW', 'OFF'

Note : If you have multiple text or image columns in a table that has this

option turned on, SQL Server will try to fit as much as it can on a single data

page. Anything that can't fit will be stored off the data page. Make sure that

the combined size of all the columns does not exceed 8060 bytes to avoid any

problems with this.

Negative Effects of TEXT IN ROW

Before implementing this option in production, make sure you fully test your

application. This extensive testing would be needed to confirm that your

application does not use one of the following text and image statements:

  • READTEXT
  • UPDATETEXT
  • WRITETEXT

This syntax will not work on tables using the TEXT IN ROW option.

Conclusion

We at SQLServerCentral.com have seen a tremendous improvement in the time out

rate of our FAQ area as soon as we turned on this option. The data retrieval of

these columns are at least 50% faster from our initial tests. As a general rule

make sure that you have justified why you're using a text or image columns and

confirm that these actions can't be done through a larger varchar field. Let us

know what you think of text and image column performance by clicking on the Your

Opinion button below.

 

Rate

4.67 (3)

Share

Share

Rate

4.67 (3)