Blog Post

Watch those VARCHAR(MAX) columns, and ONLINE index rebuilds

,

G’day,

I’ve noted a few instances lately of cases where VARCHAR(MAX) columns have been used unnecessarily, when a small value (for a VARCHAR) would have been fine.

Apart from the storage implications of a VARCHAR(MAX) column there is also the issue that in SQL SERVER 2008, having a column of this type (a LOB) in your table will prevent any ONLINE index rebuilds of the clustered index. And while, a non cluster index on this table will be able to be rebuilt ONLINE – as long as it does not INCLUDE the VARCHAR(MAX) column – or any other LOB column for that matter,  the inability to preform ONLINE index operations may be an unintended side effect.

I’m sure that other people have seen this too. For example, a table containing names, where nobody wanted to make a call about a simple thing like the maximum length of the last name, so a VARCHAR(MAX) column has been added to the table – totally unnecessarily!

So let’s just create the table that was mentioned in the paragraph above and have a look at some of the unintended impacts that this will have no ONLINE index operations. Please note that ONLINE operations are only available in the higher versions of SQL SERVER 2008

USE TestDB
GO
--DROP THE TABLE IF IT EXISTS
IF OBJECT_ID('dbo.TestTable', 'U') IS NOT NULL
BEGIN
PRINT 'DROPPING TABLE ';
DROP TABLE dbo.TestTable;
END
GO
--CREATE THE TABLE NOW IT'S DEFINITELY BEEN REMOVED
CREATE TABLE dbo.TestTable
(
PK_ID INT NOT NULL IDENTITY(1,1),
FIRST_NAME VARCHAR(30) NOT NULL,
LAST_NAME VARCHAR(MAX) NOT NULL,
PHONE VARCHAR(20)
);
--Lets add a few rows
INSERT INTO dbo.TestTable(FIRST_NAME, LAST_NAME, PHONE) VALUES('ALBERT' , 'EINSTEIN' , '0800 MC2');
INSERT INTO dbo.TestTable(FIRST_NAME, LAST_NAME, PHONE) VALUES('STEVE' , 'JOBS' , '0800 APPLE');
GO

Before we look at the impact on ONLINE index operstions, let’s first make a note of the impact that the above table has on our initial choice of indexes.

First off the LOB column can’t be used as a primary key (or as any index key for that matter) – but let’s try, just to see what happens.

ALTER TABLE dbo.TestTable ADD CONSTRAINT
PK_TESTABLE PRIMARY KEY CLUSTERED (LAST_NAME)
GO

whoops….

Msg 1919, Level 16, State 1, Line 1
Column ‘LAST_NAME’ in table ‘TestTable’ is of a type that is invalid for use as a key column in an index.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.
so, lets just choose PK_ID as our clustered primary key
ALTER TABLE dbo.TestTable ADD CONSTRAINT
PK_TESTABLE PRIMARY KEY CLUSTERED (PK_ID)
GO

OK, so now let’s look at the effect on ONLINE index operations, because this looks like a table that could grow real big, and in a busy database we’ll need access to this all the time – so we don’t want any downtime that will break our SLA’s caused by long running index rebuilds – so we’ll rebuild the index ONLINE.

ALTER INDEX PK_TESTABLE ON dbo.TestTable
REBUILD WITH (ONLINE=ON);

Whoops..

Msg 2725, Level 16, State 2, Line 1

An online operation cannot be performed for index ‘PK_TESTABLE’

because the index contains column ‘LAST_NAME’ of data type text,

ntext, image, varchar(max), nvarchar(max), varbinary(max), xml,

or large CLR type. For a non-clustered index, the column could

be an include column of the index. For a clustered index, the

column could be any column of the table. If DROP_EXISTING is used,

the column could be part of a new or old index.

The operation must be performed offline.

OK, so we’ll try to drop the index ONLINE

ALTER TABLE dbo.TestTable
DROP CONSTRAINT PK_TESTABLE WITH (ONLINE=ON);

Same error message again.

So, we’ll just cheat and drop the index offline (if we did this on a very large table then the index would be inaccessible while it was being dropped )

ALTER TABLE dbo.TestTable
DROP CONSTRAINT PK_TESTABLE;

OK, so now we don’t have a primary key on the table at all, we assume we want to put the original one back, only this time we’ll use an ONLINE operation to do it

ALTER TABLE dbo.TestTable
ADD CONSTRAINT PK_TESTABLE PRIMARY KEY
CLUSTERED (PK_ID) WITH (ONLINE=ON);
GO

and our old friend msg 2725 is back again.

OK, our conclusion here is that we pretty much can’t make any ONLINE alterations to the primary key when we have a LOB column in the table.

So lets see how non clustered indexes hold up.

It’d be kind of good to have a non clustered index on first name and last name, somebody might search on that, so we’ll attempt to create that as a non clustered.

CREATE NONCLUSTERED INDEX NC1_TESTTABLE
ON dbo.TestTable(FIRST_NAME , LAST_NAME);
GO

and we’re back to the error message

Msg 1919, Level 16, State 1, Line 1

Column ‘LAST_NAME’ in table ‘dbo.TestTable’

is of a type that is invalid for use as

a key column in an index.

But, what if we have a non clustered index that does not include the LOB column, like so

CREATE NONCLUSTERED INDEX NC2_TESTTABLE
ON TestTable(FIRST_NAME);
GO

Can we re-build that ONLINE

ALTER INDEX NC2_TESTTABLE
ON TestTable REBUILD WITH (ONLINE=ON);
GO

We sure can.

So, can we get our LOB column to appear in any index at all?

Well, yes, we can use an INCLUDEd INDEX, like so – but first let’s try creating the index ONLINE

CREATE NONCLUSTERED INDEX NC3_TESTTABLE
ON TestTable(FIRST_NAME)
INCLUDE(LAST_NAME) WITH (ONLINE=ON)
GO

back to our old friend, error message 2725

So we’ll just do it as an offline operation

CREATE NONCLUSTERED INDEX NC3_TESTTABLE
ON TestTable(FIRST_NAME)
INCLUDE(LAST_NAME) WITH (ONLINE=OFF)
GO

and it’s successful! Note that we specifically set ONLINE to OFF above, but that’s the default anyway so we don’t have to do it.

In summary, the main point I wanted to make here is that a poor choice of data type at design time (in this case a LOB rather than a small VARCHAR) can have enormous consequences down the track when the amount of data becomes large (or even when it doesn’t)

anyway, if you’re still reading, I hope you enjoyed this.

Have a nice day.

Cheers

Martin.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating