SQLServerCentral Article

Handling The Text Data Type

,

Introduction

In this article I plan to show you various ways you can manipulate the data stored in a text data type. These techniques will be particularly useful when the number of characters in the text column exceeds 8,000. They may not work when the text in row option has been set for a table. Everything I explain here will focus on manipulating columns of data type text when the text in row option is not enabled for a table. I will focus on the text data type because I am most familiar with it, although it is possible that much of what I show here will work for ntext and maybe even image data types.

This article will give you an understanding of the text data type, show you how to get the pointer for a column with data type text, how to update the value of that column using the pointer, and a few other functions that work with the text data type.

Text Data Type

Most of the functions and techniques you can use for other character data types will not work on a text data type. The data in columns of data type text are not normally stored on the same data page with the data in other columns of a table. What is stored with the other columns of a table is the pointer that tells SQL Server where it can find the characters you store in the text column. I believe this is what makes SQL Server treat the text data type differently.

You can’t concatenate characters to a text data type the way you would for a varchar column. You must first cast the text column to a varchar and then concatenate. That works fine until you have more than 8,000 characters in your text column.

You can’t use the function’s LEFT, RIGHT, or LEN on a column of data type text without casting the column as varchar(8000). Again, you will be fine unless you have more than 8,000 characters in the column. To get the length you can use the function DATALENGTH instead of LEN.

Getting A Text Pointer

The following code will get and display a pointer value:

CREATE TABLE #TestTable 
(
RowID int,
ColText text
)
INSERT INTO #TestTable (RowID,ColText)
SELECT 1,''
DECLARE @ptrColText varbinary(16)
SELECT @ptrColText = TEXTPTR(ColText)
FROM #TestTable
SELECT @ptrColText

The pointer must be binary(16) or varbinary(16). There must be a value for the column with data type text before a pointer will be established for it, hence the INSERT statement in the above code. The function TEXTPTR gets the pointer value for each row in a result set.

Example 1: Concatenation

The UPDATETEXT command can be used to correct spelling errors and to make any other modifications and/or additions to the characters in a column of data type text. It also works well when you need to concatenate a list that has more than 8,000 characters in it.

Here is an example that can be executed after executing the code from the previous section. This code will concatenate numbers to create a coma delimited list that is more than 8,000 characters long:

DECLARE @Value varchar(10),
@Counter int
SET @Counter = 1
WHILE @Counter <= 2000
BEGIN
SET @Value = LTRIM(STR(@Counter))
IF @Counter < 2000
SET @Value = @Value + ','
UPDATETEXT #TestTable.ColText @ptrColText NULL 0 @Value 
SET @Counter = @Counter + 1
END
SELECT DATALENGTH(ColText), * FROM #TestTable

Let us look at the UPDATETEXT command a moment. After the keyword UPDATETEXT you see the name of the table followed by a period and the name of the column. Next is the variable that holds the pointer. Since each text column has its own pointer this will focus on one row (and only one column) of a table no matter how many rows (and columns) are in a table.

After this you see the word NULL. This is in the position of the insert offset. When it is NULL it tells SQL Server to append the new characters to the end of all the characters currently in the text column. If the value is 0 then SQL Server will insert the new characters before all the characters currently in the text column. If you specify a number then it will count from the first character to the number specified and insert the new characters after the one found at that point.

SQL Server will not delete any characters from those in the text column unless you specify a number other than 0. In the example above I used a 0 after the NULL as the delete length to indicate I didn’t want to delete any characters from those already stored. If I had used a value other than 0 I would have gotten the following error:

Server: Msg 7135, Level 16, State 2, Line 29

Deletion length 1 is not in the range of available text, ntext, or image data.

The statement has been terminated.

Last I used a variable to supply the characters I wanted to insert into the text column.

Also you will notice that I used the function DATALENGTH in the last query to get the number of characters now stored in the column ColText. The number of characters in the column are 8,892. This number is 700 more than can be viewed in Query Analyzer since it only allows you to view the first 8,192 characters of each column. In Example 3, I will show you a way you can view those extra 700 characters.

Example 2: Misspelled Words 

The delete length parameter is useful for correcting spelling errors in a text. Here is an example that will replace a misspelled word with the word spelled correctly:

CREATE TABLE #TestTable 
(
RowID int,
ColText text
)
INSERT INTO #TestTable (RowID,ColText)
SELECT 1,'Here I have one mispellet word.'
UNION ALL
SELECT 2,'Here I have one mispellet word in the first sentance.  Another one is mispellet in the second sentance.'
DECLARE @ptrColText varbinary(16),
@RowID int,
@MisspelledWord varchar(40),
@CorrectedWord varchar(40),
@LenMisspelledWord int,
@StartingPosition int
SET @RowID = 1
SET @MisspelledWord = 'mispellet'
SET @CorrectedWord = 'misspelled'
SET @LenMisspelledWord = LEN(@MisspelledWord)
PRINT 'Rows in table before correcting spelling errors.'
SELECT * FROM #TestTable WHILE @RowID <= 2
BEGIN
SET @StartingPosition = 1
--get pointer
SELECT @ptrColText = TEXTPTR(ColText) FROM #TestTable WHERE RowID = @RowID
WHILE @StartingPosition > 0
BEGIN
--get starting position of first occurence of characters in @MisspelledWord
SELECT @StartingPosition = CHARINDEX(@MisspelledWord, ColText) - 1
FROM #TestTable
WHERE RowID = @RowID
IF @StartingPosition > 0
UPDATETEXT #TestTable.ColText @ptrColText @StartingPosition @LenMisspelledWord @CorrectedWord 
END
SET @RowID = @RowID + 1
END
PRINT 'Rows in table after correcting spelling errors.'
SELECT * FROM #TestTable
DROP TABLE #TestTable

Upon execution you will notice that the three occurrences of the word mispellet were replaced with misspelled. CHARINDEX will return a 0 when the character combination in the variable @MisspelledWord is not found in the column called ColText. Thus all occurrences will be replaced in both rows of the temp table.

Notice that in this example I used variables for the insert offset and the delete length. In the previous example these two parameters were hard coded to one value.

Example 3:  SUBSTRING

With a column of data type text it can sometimes be difficult to view the characters beyond the first 8,000 or 8,192. A varchar variable can only hold 8,000 characters. Casting the text column to varchar also will only get you the first 8,000 characters. Viewing the column in Query Analyzer will only let you see the first 8,192 characters. Enterprise Manager will display <Long Text> if there are more than 900 characters in a text field. A way to view what is beyond the first 8,192 characters can be useful. One way is to use the function SUBSTRING. Here is how this can be used on the data in the temp table from Example 1:

SELECT SUBSTRING(ColText, 8193, 700)
FROM #TestTable

The above query displays the 700 characters that you couldn’t see in Query Analyzer without using the SUBSTRING function. Using SUBSTRING you can view any section of characters stored in a column with data type text by giving the starting position number and how many characters from that point you want to see. You can use the function CHARINDEX or PATINDEX to find the desired starting point.

Conclusions

In this article you have learned how to get the pointer of a text column and how to use that pointer with the keyword UPDATETEXT to concatenate characters and replace characters in a column of data type text. You’ve also read how you can view characters stored in a column of data type text when they exceed the max normal capabilities of the tools that come with SQL Server. I have not covered every possibility. My intent in this article is to give you a starting point to develop solutions that will be helpful to you.

Rate

5 (4)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (4)

You rated this post out of 5. Change rating