Expanding column length on very large table

  • Hi all - This may be very simple but I wanted to get other opinions on my challenge. I have a table that contains several million rows. Within the table I have a char(1) NULL column that sits in the middle of the column list. I have a need to expand that column to varchar(2) NULL. A new business rule includes the possibility of a two character code. In the past, simply updating the length of the existing column took a very long time. Would it be faster to add a new column to the end of the table, insert the previous column data, drop the old column, and finally rename the new column?

  • The reason for the delay is the datatype conversion from char to varchar.

    If your table is not too large, meaning less than 50 million rows or so, and you can afford a small maintenance window, I would probably re-create the table using a CTAS-style approach (SELECT ... INTO a new table), as sketched below.
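
    In SQL Server terms it would look roughly like this (a minimal sketch only; the table, column, and key names are invented, and you would still need to re-create indexes, constraints, and permissions on the new table):

    -- Rebuild the table with the widened column (hypothetical names).
    SELECT SegmentId,
           CAST(Class AS VARCHAR(2)) AS Class
           -- , all the other columns
    INTO   dbo.MyTable_new
    FROM   dbo.MyTable;
    GO

    -- Swap the tables during the maintenance window.
    EXEC sp_rename 'dbo.MyTable', 'MyTable_old';
    EXEC sp_rename 'dbo.MyTable_new', 'MyTable';
    GO

    -- Re-create indexes, constraints, and permissions here, then drop MyTable_old.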

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • If you are using the management tools then this will be very slow, as Management Studio works on the principle of making the change backwards compatible (to SQL 7 I think - might be 6.5) and performs the following sequence:

    create table x

    bulk insert from y into x

    drop y

    rename x as y

    The following is your fastest way:

    alter table x alter column y varchar(2)

    I tested this against 32 million rows and it completed in 1 second.

    MVDBA

  • Were the rows populated, or did the char(1) column have nulls?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Thank you all for the replies. There is data in the column; however, some values are NULL.

  • michael vessey (8/5/2009): alter table x alter column y varchar(2)

    I tested this against 32 million rows and it completed in 1 second.

    Quick question... was column "Y" a populated char(1) datatype column sitting in the middle of the row?

    _____________________________________
    Pablo (Paul) Berzukov
  • Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.

    FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.
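
    For anyone who wants to see what a long-running ALTER like this is waiting on while it runs, something along these lines should show it (a minimal sketch against sys.dm_exec_requests; filter on whichever session the ALTER is running under):

    -- Show the current wait type for active user requests (sketch only).
    SELECT session_id,
           command,
           wait_type,
           wait_time,
           wait_resource
    FROM   sys.dm_exec_requests
    WHERE  session_id > 50;  -- skip system sessions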

  • Rggg...typing. Command not comment.

    david.tyler (8/5/2009):
    Paul - Yes to your question. The field in question is populated but is a NULLABLE field and it does sit in the middle of the row.

    FYI - I ran the ALTER TABLE ALTER COLUMN statement above in our DEV environment. The comment took 13 minutes for 13 million rows. I watched activity monitor and frequently had PAGEIOLATCH_EX waits.

    Sorry, my question was for Michael... hard to believe your scenario running in 1 second for 32 million rows.

    _____________________________________
    Pablo (Paul) Berzukov
  • It only takes a second when you are doing something like expanding the maximum size of a varchar field, because none of the actual data rows have to change. If it knew that all rows were null for the char(1), it could create a varchar(x) column and deactivate the old column.

    But when changing from a char(1) to a varchar, I'm sure pointers would have to be established on the page to support the varchar. It wouldn't necessarily all have to be done at once; maybe it's smart enough to do that on a row-by-row basis (or page by page) as data is updated.
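
    A quick way to see the difference is to time both kinds of change against a throwaway table (a rough sketch only; the table and column names are made up, and the row count and timings will vary):

    -- Build a throwaway table with a populated char(1) column.
    CREATE TABLE dbo.AlterTest
    (
        Id   INT IDENTITY(1,1) PRIMARY KEY,
        Code CHAR(1) NULL,
        Pad  CHAR(100) NOT NULL DEFAULT 'x'
    );

    INSERT INTO dbo.AlterTest (Code)
    SELECT TOP (1000000) 'a'
    FROM sys.all_objects a CROSS JOIN sys.all_objects b;
    GO

    SET STATISTICS TIME ON;

    -- char(1) -> varchar(2): appears to touch the data pages
    -- (consistent with the 13-minute run reported above).
    ALTER TABLE dbo.AlterTest ALTER COLUMN Code VARCHAR(2) NULL;

    -- varchar(2) -> varchar(10): a metadata-only change, so it should
    -- come back almost instantly (consistent with Michael's 1-second result).
    ALTER TABLE dbo.AlterTest ALTER COLUMN Code VARCHAR(10) NULL;

    SET STATISTICS TIME OFF;
    DROP TABLE dbo.AlterTest;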


  • Follow Up: Thanks again everyone who replied. Here are my findings after testing both solutions:

    Sol1: Alter the column with an Alter Table statement

    Took 13 minutes to alter 13 million rows.

    Sol2: Add a new col, copy data from old col to new, drop old col, rename new col

    Took just under 3 minutes for the 13 million rows. Here is the script I used.

    ALTER TABLE MyTable ADD newclass VARCHAR(2) NULL;
    GO

    UPDATE MyTable
    SET newclass = Class;
    GO

    ALTER TABLE MyTable DROP COLUMN Class;
    GO

    EXEC sp_rename 'dbo.MyTable.newclass', 'Class', 'COLUMN';
    GO
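
    One thing to watch with the single UPDATE on a table this size is transaction log growth. If that becomes a problem, the copy step can be broken into batches - a rough sketch only, assuming newclass starts out entirely NULL:

    -- Hypothetical batched version of the copy step, to keep the log manageable.
    DECLARE @rows INT;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        UPDATE TOP (50000) MyTable
        SET    newclass = Class
        WHERE  newclass IS NULL
               AND Class IS NOT NULL;

        SET @rows = @@ROWCOUNT;
    END;
    GO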

  • Yes - my column was populated.

    I converted it from 'a' to 'bb' - 39 million rows.

    MVDBA

  • Apologies - I went from varchar(1) to varchar(2).

    I'd had too much coffee.

    MVDBA

  • I thought this article by Alok Dwivedi gave a good explanation of how long schema changes are likely to take to apply to big tables:

    http://www.sqlservercentral.com/articles/Design+and+Theory/67553/

    Tim


  • Hey David,

    On a large-ish table like this, even for a relatively modest change such as you describe, I would be very tempted to bulk copy the data out (ordered by the cluster key, if any), create a new table (with the clustered index), bulk load the data with the ORDER hint, and then create the non-clustered indexes.

    You have to plan it well to ensure you get minimal logging (and things like foreign key relationships can make life harder) but it is an opportunity to ensure the schema will cope with future growth (VARCHAR(10) uses no more storage for the same size data than VARCHAR(2) does!), to defragment the table and indexes, and reclaim any space from dropped columns.

    It is often just as quick to do this as to wait for a fully-logged ALTER TABLE statement to complete (unless the operation can be completed by just updating the metadata), and you get a much cleaner table as the end result. In your case, SQL Server must change every row on every page with full logging anyway.
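
    Very roughly, the shape of it is something like this (a sketch only - the file path, server name, SegmentId key, and MyTable_new names are placeholders, and minimal logging still depends on the recovery model and a TABLOCKed, empty target):

    -- 1. Export the existing data in clustered-key order, in native format:
    --    bcp "SELECT * FROM dbo.MyTable ORDER BY SegmentId" queryout C:\temp\MyTable.dat -n -T -S MyServer

    -- 2. Create dbo.MyTable_new with the widened VARCHAR column and its
    --    clustered index, then reload with the ORDER hint.
    BULK INSERT dbo.MyTable_new
    FROM 'C:\temp\MyTable.dat'
    WITH (DATAFILETYPE = 'native',
          ORDER (SegmentId ASC),
          TABLOCK);
    GO

    -- 3. Re-create the non-clustered indexes once the load completes,
    --    then swap the table names with sp_rename.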

    Paul
