New Column Updates

  • Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/bFernando/2954.asp

  • I think this could do the job...

    set nocount on

    SET ROWCOUNT 1000

    declare @rowcount int, @batchcount int, @rowsupdated int

    declare @LastKey nvarchar(341)

    select @rowcount = 1,@batchcount = 0, @rowsupdated = 0, @LastKey = ''

    while @rowcount > 0

    begin

      begin tran

         update MyTable

         set NewZip = ZipCode,

              @LastKey = NameKey

         where NewZip is NULL

            and NameKey >= @LastKey

         set @rowcount = @@rowcount

         select @rowsupdated = @rowsupdated + @rowcount

         if @rowcount > 0 select @batchcount = @batchcount + 1

      commit

    end

    *If NameKey were unique, we could use ">" instead of ">=" .

  • hehe, very nice solution - simple is beautiful

  • > update MyTable

    > set NewZip = ZipCode,

    > @LastKey = NameKey

    That may well work in this example, but it is not something I'd like to use for production use. You are relying on @LastKey being set to the last (alphabetically) NameKey value updated. Your updating a set, it could take any of the values from that set.

  • Ians, I believe you can use it for production.

    The last value assigned to @LastKey is the largest NameKey in the set because the update query definitely uses the index [LargeTable__IE1].

  • Can you point me at the documentation that guarantees that behaviour?

  • You are actually both right.

    But just to make sure of this behavior, you can use an index hint in the query.  This pretty much guarantees the results.

  • We cant both be right.

    From BOL:-

    Setting Variables and Columns

    Variable names can be used in UPDATE statements to show the old and new values affected. This should only be used when the UPDATE statement affects a single record; if the UPDATE statement affects multiple records, the variables only contain the values for one of the updated rows.

    It just looks wrong to me, trying to set a single variable to many values from a table is a bad idea.

  • Regardless of the "perfect" SQL for this task, 35 minutes is sufficiently better than 20 hours that the solution in the article is certainly good enough. If it generalizes better than an ideal but single-purpose query, then (imo) it has better reusability.

    What is valuable in this article is that a problem is broken down and explained so we can see WHY the solution works so much better than the original example.

    I would be interested in the performance/timing of setting the default value to a UDF at creation of the column? (which could be removed after the initial load)

  • Haven't tried it, but couldn't you just use a ranking function over ordering by the clustered index key to get a pseudo identity column? I can't think of any drawback to this approach, except that you can't do it in SS2K

  • Yes you can both be right.  The value will be set to the last row updated, however in a few rare cases the, update may be done in something that looks like a random order, using the index hint "should" aleviate that possibility.

  • One other caveat I thought of with either solution, is that if the column will end up being mandatory aka Not Null, there will need to be a check at the end to ensure that there are no null rows that were added after the fact.  Also unless you can guarantee that the code will provide a value a default is almost always needed.

    For the question on using a default when adding the table, I would first crank up the size of the log file by a few GB, and I expect it would possibly be shorter than the 35 minutes, but it only helps if you have the same value to insert or a derivable formula.

    Enjoy, Brian

  • This is extra useful since SET ROWCOUNT affecting updates is going away in the next release of SQL. http://msdn2.microsoft.com/en-us/library/ms188774.aspx

  • Very interesting article, definitely worth seeing.

    Btw, though, shouldn't the columns be NCHAR(2) instead of NVARCHAR(2)?  It's a waste of space and processing overhead to make a 2-byte column variable.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Ha!  Very nice.  I'm facing a similar situation here.  This article demonstrates a nice way of dealing with this situation.  Not to mention the issue of blocking / locking on the table while just such an update is occurring.  Bravo.

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply