September 26, 2011 at 1:30 pm
I have a table that I need to run an update on one column for every record in the db. Essentially, I need to add a prefix value to the existing column data for every record. The update statement would be something like this:
update table1 set colA = 'XY' + colA
My concern is that I don't want to cause the transaction log to expand during the transaction. The recovery model is currently set to Full. What is the best way to go about doing the update without causing a big expansion of the log?
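For reference, the recovery model and current log usage can be checked with something like this (the database name below is just a placeholder):

-- recovery model for the database in question (substitute the real name)
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'MyDatabase';

-- current log file size and percentage used, for all databases
DBCC SQLPERF(LOGSPACE);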
September 26, 2011 at 1:31 pm
rbuzzard1 (9/26/2011)
My concern is that I don't want to cause the transaction log to expand during the transaction. The recovery model is currently set to Full. What is the best way to go about doing the update without causing a big expansion of the log?
You don't. It's not built to allow you to do that. Even on 'simple' you'd blow the log out, as it'd be a single transaction.
What you need to do is piecemeal the update (a few 100,000 rows at a time), then force a log backup between each pass. It's really the only way you can try to keep this from blowing out the log space.
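In outline, the pattern looks something like this (a rough, untested sketch using the table1/colA names from your first post; the batch size, database name, and backup path are placeholders you'd need to adjust):

DECLARE @BatchSize int;
SET @BatchSize = 100000;  -- rows per pass; size it so a single pass fits comfortably in the log

WHILE 1 = 1
BEGIN
    -- only touch rows that don't already have the prefix
    UPDATE TOP (@BatchSize) table1
    SET colA = 'XY' + colA
    WHERE colA NOT LIKE 'XY%';

    IF @@ROWCOUNT = 0
        BREAK;  -- nothing left to update

    -- log backup between passes so the log space can be reused
    BACKUP LOG MyDatabase TO DISK = 'D:\Backups\MyDatabase_log.trn';
END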
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
September 26, 2011 at 1:45 pm
OK. I can buy that. What is the best way to break it up like you suggest? I've been trying to look into that but see mixed messages on how to do it. I'd appreciate any input.
September 26, 2011 at 1:54 pm
That depends. Can you script your schema and paste it here? It's gonna depend on your available keys and indexes.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
September 26, 2011 at 2:20 pm
It's a pretty simple structure, and there's no index on the column being updated.
CREATE TABLE [dbo].[Product](
[ProductUID] [int] IDENTITY(1,1) NOT NULL,
[EntityUID] [int] NOT NULL,
[ProductID] [varchar](30) NULL,
[LastUpdate] [datetime] NOT NULL CONSTRAINT [DF_Visit_LastUpdate_1] DEFAULT (getdate()),
CONSTRAINT [PK_Product] PRIMARY KEY CLUSTERED
(
[ProductUID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]
I'm wanting to update the ProductID value as stated in the initial post to simply add a two-char prefix to the existing value for all rows in the table.
September 26, 2011 at 2:24 pm
Can you post the current update statement? Will there be an easy way to tell an updated row from one not yet updated?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 26, 2011 at 2:31 pm
The update statement would look like this:
update product set productid = 'XY' + productid
As is, that would update all records, but apparently I need to know how to break that up into smaller groups, and add a trans log backup after each group of records is updated, in order to keep the log size down.
As far as being able to tell which have been updated, anything with left(productid, 2) = 'XY' has been updated. So, presumably, that would change the update statement to this:
update product set productid = 'XY' + productid
where left(productid, 2) <> 'XY'
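(Side note: because that WHERE clause wraps the column in LEFT(), an index on productid couldn't be seeked for it; filtering on the bare column leaves that option open. A sketch of the same filter rewritten, assuming nothing beyond the statement above:)

-- same filter, but without a function wrapped around the column
update product set productid = 'XY' + productid
where productid not like 'XY%'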
September 26, 2011 at 2:43 pm
Untested. Test it yourself first!!!!!
DECLARE @Done bit;
SET @Done = 0;

WHILE (@Done = 0)
BEGIN
    UPDATE TOP (X) Product
        SET ProductID = 'XY' + ProductID
        WHERE ProductID NOT LIKE 'XY%';

    IF @@RowCount = 0
        SET @Done = 1;

    -- log backup at this point
END
How large X should be (the value in the TOP) I leave to you. Small enough that you don't grow the log, large enough that you're not updating for a week.
Stick an index on ProductID first (yes, it'll slow the update, but it'll speed up finding the rows to update).
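To flesh those two pieces out (again untested; the index name, database name, and backup path below are only examples, not anything taken from this thread):

-- index to support finding the rows that still need the prefix
CREATE NONCLUSTERED INDEX IX_Product_ProductID
ON dbo.Product (ProductID);

-- at the 'log backup at this point' marker inside the loop, something like:
BACKUP LOG MyDatabase
TO DISK = 'D:\Backups\MyDatabase_log.trn';  -- or a backup device; see sp_addumpdevice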
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 26, 2011 at 3:06 pm
Thanks very much. I think that will do the trick. I was getting there but was uncertain about it and that really helps me out.
September 26, 2011 at 3:06 pm
I would approach this a little differently:
DECLARE @c INT,
        @LastID INT

SET @c = 0
SELECT @LastID = MAX(ProductUID) FROM Product

WHILE @c <= @LastID
BEGIN
    UPDATE Product
    SET ProductID = 'XY' + ProductID
    WHERE ProductUID BETWEEN @c AND @c + 2500 - 1 -- avoid overlaps and make it obvious

    SET @c = @c + 2500

    IF @c % 200000 = 0
    BEGIN
        -- Reference in BoL
        -- ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_6tsql/html/89a4658a-62f1-4289-8982-f072229720a1.htm
        -- See section (B)
        BACKUP LOG <DatabaseName>
        TO <SaveDevice> -- Look up sp_addumpdevice if necessary
    END
END
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
September 27, 2011 at 8:25 am
Evil Kraig F (9/26/2011)
I would approach this a little differently:
Is there an advantage? Gail's seems easier to read, and I think hers could be restarted where it left off if the process had to be stopped at some point, without risking a double update.
September 27, 2011 at 12:45 pm
homebrew01 (9/27/2011)
Evil Kraig F (9/26/2011)
I would approach this a little differently:
Is there an advantage? Gail's seems easier to read, and I think hers could be restarted where it left off if the process had to be stopped at some point, without risking a double update.
Mine runs down the clustered index and hopefully seeks directly to the records in question, without any searching, and should be working with fewer rows before the TOP hits (per pass). Gail's will need to scan on each pass looking for valid contenders. Hers can be rerun in case of an error; mine should go faster because it isn't checking secondary columns in the WHERE clause and is merely running through one section of the clustered index at a time.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA
September 27, 2011 at 12:58 pm
Evil Kraig F (9/27/2011)
homebrew01 (9/27/2011)
Evil Kraig F (9/26/2011)
I would approach this a little differently:
Is there an advantage? Gail's seems easier to read, and I think hers could be restarted where it left off if the process had to be stopped at some point, without risking a double update.
Mine runs down the clustered index and hopefully seeks directly to the records in question, without any searching, and should be working with fewer rows before the TOP hits (per pass). Gail's will need to scan on each pass looking for valid contenders.
Not if the index I recommended is created.
Also, with an update this simple, only the specified number of rows will come back from the index seek. It won't be a fetch all, discard all but the X required.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
September 27, 2011 at 1:07 pm
GilaMonster (9/27/2011)
Evil Kraig F (9/27/2011)
homebrew01 (9/27/2011)
Evil Kraig F (9/26/2011)
I would approach this a little differently:
Is there an advantage? Gail's seems easier to read, and I think hers could be restarted where it left off if the process had to be stopped at some point, without risking a double update.
Mine runs down the clustered index and hopefully seeks directly to the records in question, without any searching, and should be working with fewer rows before the TOP hits (per pass). Gail's will need to scan on each pass looking for valid contenders.
Not if the index I recommended is created.
Also, with an update this simple, only the specified number of rows will come back from the index seek. It won't be a fetch all, discard all but the X required.
Sorry, I'd missed that you recommended a new index be put into play. That's something else I usually avoid, but you're right (of course! 😉 ), it will avoid the other issues.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette
For index/tuning help, follow these directions. | Tally Tables
Twitter: @AnyWayDBA