New Column Updates

  • Bimal Fernando

    SSCommitted

    Points: 1945

    Comments posted here are about the content posted at http://www.sqlservercentral.com/columnists/bFernando/2954.asp

  • JH Park

    SSC Enthusiast

    Points: 160

    I think this could do the job...

    set nocount on

    SET ROWCOUNT 1000

    declare @rowcount int, @batchcount int, @rowsupdated int

    declare @LastKey nvarchar(341)

    select @rowcount = 1,@batchcount = 0, @rowsupdated = 0, @LastKey = ''

    while @rowcount > 0

    begin

      begin tran

         update MyTable

         set NewZip = ZipCode,

              @LastKey = NameKey

         where NewZip is NULL

            and NameKey >= @LastKey

         set @rowcount = @@rowcount

         select @rowsupdated = @rowsupdated + @rowcount

         if @rowcount > 0 select @batchcount = @batchcount + 1

      commit

    end

    *If NameKey were unique, we could use ">" instead of ">=" .

  • Jure

    Say Hey Kid

    Points: 701

    hehe, very nice solution - simple is beautiful

  • ians

    SSC Eights!

    Points: 964

    > update MyTable

    > set NewZip = ZipCode,

    > @LastKey = NameKey

    That may well work in this example, but it is not something I'd like to use for production use. You are relying on @LastKey being set to the last (alphabetically) NameKey value updated. Your updating a set, it could take any of the values from that set.

  • JH Park

    SSC Enthusiast

    Points: 160

    Ians, I believe you can use it for production.

    The last value assigned to @LastKey is the largest NameKey in the set because the update query definitely uses the index [LargeTable__IE1].

  • ians

    SSC Eights!

    Points: 964

    Can you point me at the documentation that guarantees that behaviour?

  • Ninja's_RGR'us

    SSC Guru

    Points: 294069

    You are actually both right.

    But just to make sure of this behavior, you can use an index hint in the query.  This pretty much guarantees the results.

  • ians

    SSC Eights!

    Points: 964

    We cant both be right.

    From BOL:-

    Setting Variables and Columns

    Variable names can be used in UPDATE statements to show the old and new values affected. This should only be used when the UPDATE statement affects a single record; if the UPDATE statement affects multiple records, the variables only contain the values for one of the updated rows.

    It just looks wrong to me, trying to set a single variable to many values from a table is a bad idea.

  • Mike Dougherty-384281

    SSCrazy

    Points: 2764

    Regardless of the "perfect" SQL for this task, 35 minutes is sufficiently better than 20 hours that the solution in the article is certainly good enough. If it generalizes better than an ideal but single-purpose query, then (imo) it has better reusability.

    What is valuable in this article is that a problem is broken down and explained so we can see WHY the solution works so much better than the original example.

    I would be interested in the performance/timing of setting the default value to a UDF at creation of the column? (which could be removed after the initial load)

  • Adrian Hains

    Ten Centuries

    Points: 1322

    Haven't tried it, but couldn't you just use a ranking function over ordering by the clustered index key to get a pseudo identity column? I can't think of any drawback to this approach, except that you can't do it in SS2K

  • Ninja's_RGR'us

    SSC Guru

    Points: 294069

    Yes you can both be right.  The value will be set to the last row updated, however in a few rare cases the, update may be done in something that looks like a random order, using the index hint "should" aleviate that possibility.

  • Brian Munier

    SSCrazy

    Points: 2815

    One other caveat I thought of with either solution, is that if the column will end up being mandatory aka Not Null, there will need to be a check at the end to ensure that there are no null rows that were added after the fact.  Also unless you can guarantee that the code will provide a value a default is almost always needed.

    For the question on using a default when adding the table, I would first crank up the size of the log file by a few GB, and I expect it would possibly be shorter than the 35 minutes, but it only helps if you have the same value to insert or a derivable formula.

    Enjoy, Brian

  • Rob Scholl

    Old Hand

    Points: 300

    This is extra useful since SET ROWCOUNT affecting updates is going away in the next release of SQL. http://msdn2.microsoft.com/en-us/library/ms188774.aspx

  • ScottPletcher

    SSC Guru

    Points: 98206

    Very interesting article, definitely worth seeing.

    Btw, though, shouldn't the columns be NCHAR(2) instead of NVARCHAR(2)?  It's a waste of space and processing overhead to make a 2-byte column variable.

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.

  • Jeremy Brown

    SSCarpal Tunnel

    Points: 4223

    Ha!  Very nice.  I'm facing a similar situation here.  This article demonstrates a nice way of dealing with this situation.  Not to mention the issue of blocking / locking on the table while just such an update is occurring.  Bravo.

Viewing 15 posts - 1 through 15 (of 29 total)

You must be logged in to reply to this topic. Login to reply