Expanding column length on very large table

  • Hi all - This may be very simple but I wanted to get other opinions on my challenge. I have a table that contains several million rows. Within the table I have a char(1) NULL column that sits in the middle of the column list. I have a need to expand that column to varchar(2) NULL. A new business rule includes the possibility of a two character code. In the past, simply updating the length of the existing column took a very long time. Would it be faster to add a new column to the end of the table, insert the previous column data, drop the old column, and finally rename the new column?

  • The reason for the delay is the datatype conversion from char to varchar.

    If your table is not too large, meaning less than 50 million rows or so, and you can afford a small maintenance window, I would probably re-create the table using a CTAS-style approach (SELECT ... INTO a new table), as sketched below.
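
    In SQL Server terms it would look roughly like this (a minimal sketch only; the table, column, and key names are invented, and you would still need to re-create indexes, constraints, and permissions on the new table):

    -- Rebuild the table with the widened column (hypothetical names).
    SELECT SegmentId,
           CAST(Class AS VARCHAR(2)) AS Class
           -- , all the other columns
    INTO   dbo.MyTable_new
    FROM   dbo.MyTable;
    GO

    -- Swap the tables during the maintenance window.
    EXEC sp_rename 'dbo.MyTable', 'MyTable_old';
    EXEC sp_rename 'dbo.MyTable_new', 'MyTable';
    GO

    -- Re-create indexes, constraints, and permissions here, then drop MyTable_old.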

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • If you are using the management tools then this will be very slow, as Management Studio works on the principle of making the change backwards compatible (to SQL 7 I think - might be 6.5) and performs the following sequence:

    create table x

    bulk insert from y into x

    drop y

    rename x as y

    The following is your fastest way:

    alter table x alter column y varchar(2)

    I tested this against 32 million rows and it completed in 1 second.

    MVDBA

  • Were the rows populated, or did the char(1) column have nulls?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Thank you all for the replies. There is data in the column; however, some values are NULL.

  • michael vessey (8/5/2009): alter table x alter column y varchar(2)

    I tested this against 32 million rows and it completed in 1 second.

    Quick question... was column "Y" a populated char(1) datatype column sitting in the middle of the row?

    _____________________________________
    Pablo (Paul) Berzukov
  • Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.

    FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.
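
    For anyone who wants to see what a long-running ALTER like this is waiting on while it runs, something along these lines should show it (a minimal sketch against sys.dm_exec_requests; filter on whichever session the ALTER is running under):

    -- Show the current wait type for active user requests (sketch only).
    SELECT session_id,
           command,
           wait_type,
           wait_time,
           wait_resource
    FROM   sys.dm_exec_requests
    WHERE  session_id > 50;  -- skip system sessions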

  • Rggg...typing. Command not comment.

    david.tyler (8/5/2009):
    Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.

    FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.

    Sorry, my question was for Michael... hard to believe your scenario running in 1 second for 32 million rows.

    _____________________________________
    Pablo (Paul) Berzukov
  • It only takes a second when you are doing something like expanding the maximum size of a varchar field, because none of the actual data rows have to change. If it knew that all rows were null for the char(1), it could create a varchar(x) column and deactivate the old column.

    But when changing from a char(1) to a varchar, I'm sure pointers would have to be established on the page to support the varchar. It wouldn't necessarily all have to be done at once; maybe it's smart enough to do that on a row-by-row basis (or page by page) as data is updated.
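
    A quick way to see the difference is to time both kinds of change against a throwaway table (a rough sketch only; the table and column names are made up, and the row count and timings will vary):

    -- Build a throwaway table with a populated char(1) column.
    CREATE TABLE dbo.AlterTest
    (
        Id   INT IDENTITY(1,1) PRIMARY KEY,
        Code CHAR(1) NULL,
        Pad  CHAR(100) NOT NULL DEFAULT 'x'
    );

    INSERT INTO dbo.AlterTest (Code)
    SELECT TOP (1000000) 'a'
    FROM sys.all_objects a CROSS JOIN sys.all_objects b;
    GO

    SET STATISTICS TIME ON;

    -- char(1) -> varchar(2): appears to touch the data pages
    -- (consistent with the 13-minute run reported above).
    ALTER TABLE dbo.AlterTest ALTER COLUMN Code VARCHAR(2) NULL;

    -- varchar(2) -> varchar(10): a metadata-only change, so it should
    -- come back almost instantly (consistent with Michael's 1-second result).
    ALTER TABLE dbo.AlterTest ALTER COLUMN Code VARCHAR(10) NULL;

    SET STATISTICS TIME OFF;
    DROP TABLE dbo.AlterTest;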


  • Follow Up: Thanks again everyone who replied. Here are my findings after testing both solutions:

    Sol1: Alter the column with an Alter Table statement

    Took 13 minutes to alter 13 million rows.

    Sol2: Add a new col, copy data from old col to new, drop old col, rename new col

    Took just under 3 minutes for the 13 million rows. Here is the script I used.

    ALTER TABLE MyTable ADD newclass VARCHAR(2) NULL;
    GO

    UPDATE MyTable
    SET newclass = Class;
    GO

    ALTER TABLE MyTable DROP COLUMN Class;
    GO

    EXEC sp_rename 'dbo.MyTable.newclass', 'Class', 'COLUMN';
    GO
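
    One thing to watch with the single UPDATE on a table this size is transaction log growth. If that becomes a problem, the copy step can be broken into batches - a rough sketch only, assuming newclass starts out entirely NULL:

    -- Hypothetical batched version of the copy step, to keep the log manageable.
    DECLARE @rows INT;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        UPDATE TOP (50000) MyTable
        SET    newclass = Class
        WHERE  newclass IS NULL
               AND Class IS NOT NULL;

        SET @rows = @@ROWCOUNT;
    END;
    GO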

  • Yes - my column was populated.

    I converted it from 'a' to 'bb' - 39 million rows.

    MVDBA

  • Apologies - I went from varchar(1) to varchar(2).

    I'd had too much coffee.

    MVDBA

  • I thought this article by Alok Dwivedi gave a good explanation of how long schema changes are likely to take to apply to big tables:

    http://www.sqlservercentral.com/articles/Design+and+Theory/67553/

    Tim


  • Hey David,

    On a large-ish table like this, even for a relatively modest change such as you describe, I would be very tempted to bulk copy the data out (ordered by the cluster key, if any), create a new table (with the clustered index), bulk load the data with the ORDER hint, and then create the non-clustered indexes.

    You have to plan it well to ensure you get minimal logging (and things like foreign key relationships can make life harder) but it is an opportunity to ensure the schema will cope with future growth (VARCHAR(10) uses no more storage for the same size data than VARCHAR(2) does!), to defragment the table and indexes, and reclaim any space from dropped columns.

    It is often just as quick to do this as to wait for a fully-logged ALTER TABLE statement to complete (unless the operation can be completed by just updating the metadata), and you get a much cleaner table as the end result. In your case, SQL Server must change every row on every page with full logging anyway.
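
    Very roughly, the shape of it is something like this (a sketch only - the file path, server name, SegmentId key, and MyTable_new names are placeholders, and minimal logging still depends on the recovery model and a TABLOCKed, empty target):

    -- 1. Export the existing data in clustered-key order, in native format:
    --    bcp "SELECT * FROM dbo.MyTable ORDER BY SegmentId" queryout C:\temp\MyTable.dat -n -T -S MyServer

    -- 2. Create dbo.MyTable_new with the widened VARCHAR column and its
    --    clustered index, then reload with the ORDER hint.
    BULK INSERT dbo.MyTable_new
    FROM 'C:\temp\MyTable.dat'
    WITH (DATAFILETYPE = 'native',
          ORDER (SegmentId ASC),
          TABLOCK);
    GO

    -- 3. Re-create the non-clustered indexes once the load completes,
    --    then swap the table names with sp_rename.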

    Paul
