Remember this very carefully: don't ever set a default value when you add a new column to a table with such a large number of rows. It is no different from adding the column as NULL and then issuing one transaction to update all 35 million records with the default value. That is the worst way of doing this: it will load the whole server and block the table for a very long time as well.
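To make that equivalence concrete, here is a minimal sketch of the two forms. The table and column names follow the article's examples, but the default value is hypothetical, and on recent SQL Server editions the single-statement form may be a metadata-only change:

```sql
-- Single-statement form: adds the column with a default, which (on older
-- versions of SQL Server) physically writes the value into every existing row.
ALTER TABLE dbo.LargeTable
    ADD NewZip CHAR(10) NOT NULL DEFAULT '00000';

-- Internally this behaves roughly like:
ALTER TABLE dbo.LargeTable ADD NewZip CHAR(10) NULL;

UPDATE dbo.LargeTable      -- one huge transaction touching all 35 million rows
SET NewZip = '00000';
```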
Yes, I agree with you guys, the design is crap here, but my solution is for people who cannot change the whole design and still need to add a new column to a table like this, and to explain the best way to work through such a table when doing this type of update.
I am sure there are a lot of people who have tried this type of thing and waited a long time for it to finish, telling themselves, "This is a very large table, no wonder it takes so long." This will help them understand the underlying operation a bit better and approach it more efficiently next time.
And for those who are designing tables: now you know the trouble ahead if you don't design them properly the first time.
OK, firstly, sorry if my writing is a bit confusing. I'll try to clear it up with this explanation.
My first example, which updates the NewZip column, is actually there to demonstrate the method widely suggested when you add a new column to a very large table. That is just an example; I first used that method to do the update on my own table, [LargeTable], and that is the run that took 20+ hours. I haven't posted that code in the article because the provided example (with the NewZip column) is good enough to understand what it does.
But for the solution I am suggesting, I have given the exact code I ran on my table [LargeTable].
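For readers who just want the shape of that approach, here is a minimal sketch of a batched update. The batch size and the default value are illustrative, not the exact figures from the article; the point is that each transaction is short, so locks are released quickly between batches:

```sql
DECLARE @BatchSize INT = 10000;

-- Update the NULL rows in small batches instead of one 35-million-row
-- transaction; the loop exits once no unset rows remain.
WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) dbo.LargeTable
    SET NewZip = '00000'
    WHERE NewZip IS NULL;

    IF @@ROWCOUNT = 0 BREAK;
END;
```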
Hope this clears up the confusion.
I was going to keep out of this until the statement about the design being crap.
The clustered index is there - on that specific column - because it was designed that way for the critical operations on that DB. The performance-critical activities are searches on data, where the search criteria (usually) translate quite nicely into range searches on the column implemented by the clustered index. There is no other column on that table which would give the slightest benefit in a range search.
As to length... yes, that does look a bit silly. It is in fact accommodating the largest value that could be constructed as the search key without truncation. Does it really need to be that big? No. Nearly 98% of values are 25 characters or less, and the maximum is 56.
As to inserting / page splits. Yep, the effect would be rather worse than if (say) the identity column was used, but (a) the column is never updated, (b) inserts are never done online and (c) the usual insert rate is very low (well, there is one occasional batch process that's a bit heavy, and a bit of a dog, but that's a separate thread entirely).
What are my motives in responding? Well, yes, I had some involvement in this design, but more important is the principle. Don't automatically assume character columns are bad in clustered indexes. A proper design needs to understand ALL the intended uses of the table, and make indexing decisions accordingly.
An interesting article from Chris Hedgate is at: http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp