Long Running Update statement

  • Hi

    I am running an update against a table with over 100,000,000 rows which takes forever. Is there a better way to write a query like this?

    UPDATE [dbo].[Test_Table]
    SET [Column_34] = (
        CASE
            WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1
            ELSE 0
        END)

  • jdbrown239 (2/5/2015)


    Hi

    I am running an update against a table with over 100,000,000 rows which takes forever. Is there a better way to write a query like this?

    UPDATE [dbo].[Test_Table]
    SET [Column_34] = (
        CASE
            WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1
            ELSE 0
        END)

    Yes. Every system has a tipping point. 1M rows might take 6 seconds. As expected, 2M rows might take just twice that (12 seconds) and 3M just 3 times that (18 seconds). But 4 million, for no apparent reason (there are reasons... just too many to talk about here, usually having to do with memory) might take 2 hours.

    So, the first step is to break up such massive updates by only updating, say, 2 million rows at a time in an update loop; see the sketch below.
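
    For example, a minimal sketch of such a loop, using the table and column names from the posted query (the 2M batch size and the "skip rows that already have the right value" predicate are just assumptions to tune for your system):

    DECLARE @BatchSize INT = 2000000;
    DECLARE @Rows INT = 1;

    WHILE @Rows > 0
    BEGIN
        UPDATE TOP (@BatchSize) t
        SET [Column_34] = CASE WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1 ELSE 0 END
        FROM [dbo].[Test_Table] t
        WHERE [Column_34] IS NULL
           OR [Column_34] <> CASE WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1 ELSE 0 END;

        SET @Rows = @@ROWCOUNT;

        CHECKPOINT; -- in SIMPLE recovery this lets log space be reused between batches; in FULL, take log backups instead
    END;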

    Another thing that can really slow down an update is if the columns you're updating have indexes on them.
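
    If that turns out to be the case here, one option (sketch only; the index name below is hypothetical) is to disable the affected nonclustered index for the duration of the update and rebuild it afterwards:

    -- Hypothetical index on the column being updated; the rebuild itself isn't free,
    -- so this only pays off when the index would otherwise be touched for every row.
    ALTER INDEX [IX_Column_34] ON [dbo].[Test_Table] DISABLE;

    -- ... run the batched update here ...

    ALTER INDEX [IX_Column_34] ON [dbo].[Test_Table] REBUILD;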

    Last but certainly not least, a very common mistake with "joined" updates is writing an "illegal" update where the object of the UPDATE isn't in the FROM clause. Post your code and let's have a look.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Further on Jeff's answer, could you post the DDL for the table and everything that comes with it (indexes, constraints, triggers etc.)?

    😎

  • Table DDL

    CREATE TABLE [dbo].[Targus_201412_V7](
        [PrimaryPhone] [varchar](10) NULL,
        [CensusBlock] [varchar](15) NULL,
        [City] [varchar](50) NULL,
        [State] [varchar](2) NULL,
        [ZIP_Code] [varchar](5) NULL,
        [ZIP_Plus4] [varchar](4) NULL,
        [Income] [varchar](5) NULL,
        [AvgHomeValue] [varchar](5) NULL,
        [MaritalStatusCode] [varchar](5) NULL,
        [AgeHOH] [varchar](5) NULL,
        [AgeMember2] [varchar](5) NULL,
        [NumberofAdults] [varchar](5) NULL,
        [NumberofChildren] [varchar](5) NULL,
        [PresenceofChildren] [varchar](5) NULL,
        [Age0to3] [varchar](5) NULL,
        [Age4to6] [varchar](5) NULL,
        [Age7to9] [varchar](5) NULL,
        [Age10to12] [varchar](5) NULL,
        [Age13to15] [varchar](5) NULL,
        [Age16to18] [varchar](5) NULL,
        [HispanicSurname] [varchar](5) NULL,
        [AsianSurname] [varchar](5) NULL,
        [OwnRent] [varchar](5) NULL,
        [SDU] [varchar](5) NULL,
        [Ported_Q314] [varchar](5) NULL,
        [Port_Dt_Q314] [varchar](10) NULL,
        [OCN_Q314] [varchar](4) NULL,
        [Name_Q314] [varchar](50) NULL,
        [TYPE_Q314] [nvarchar](30) NULL,
        [Ported_Q414] [varchar](5) NULL,
        [Port_Dt_Q414] [varchar](10) NULL,
        [OCN_Q414] [varchar](4) NULL,
        [Name_Q414] [varchar](50) NULL,
        [TYPE_Q414] [nvarchar](30) NULL,
        [RECODE_PHONE_TYPE_Q4] [char](60) NULL,
        [METRO] [float] NULL,
        [METRO_STATUS] [char](10) NULL,
        [URBAN] [int] NULL,
        [SUBURBAN] [int] NULL,
        [RURAL] [int] NULL,
        [SETTING_CHK] [int] NULL,
        [KIDS0_6] [numeric](18, 0) NULL,
        [KIDS7_12] [numeric](18, 0) NULL,
        [KIDS13_18] [numeric](18, 0) NULL,
        [CKIDS] [numeric](18, 0) NULL,
        [DIFF] [numeric](18, 0) NULL,
        [DIFF2] [numeric](18, 0) NULL,
        [ADULTS] [numeric](18, 0) NULL,
        [KIDS] [numeric](18, 0) NULL,
        [AGEHOH2] [numeric](18, 0) NULL,
        [INCOME0_20] [int] NULL,
        [INCOME20_35] [int] NULL,
        [INCOME35_50] [int] NULL,
        [INCOME50_100] [int] NULL,
        [INCOME100] [int] NULL,
        [INCOME_CHK] [int] NULL,
        [AGE18_25] [int] NULL,
        [AGE26_34] [int] NULL,
        [AGE35_44] [int] NULL,
        [AGE45_54] [int] NULL,
        [AGE55_64] [int] NULL,
        [AGEHI] [int] NULL,
        [AGE_CHK] [int] NULL
    ) ON [PRIMARY]
    GO

    SET ANSI_PADDING OFF
    GO

  • Any indexes, constraints, triggers?

    😎

  • Index DDL

    CREATE NONCLUSTERED INDEX [IX_Agehoh2] ON [dbo].[Targus_201412_V7]
    (
        [AGEHOH2] ASC
    )
    INCLUDE ([AgeHOH]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

  • Good job, I'll look into this over the weekend

    😎

  • Thanks for your help.

  • That statement is going to have to scan the table regardless of what you do.

    The critical thing for performance is to pre-allocate (and thus pre-format) enough log space for the entire statement to run without having to add log space.

    Thus, we need to see how much free log space is available in that db. Run this command:

    DBCC SQLPERF ( LOGSPACE )

    and review the results for the relevant db.

    Since you're only updating one column, I'd hope the log space required would be ~1GB or less. So, before running the UPDATE, make sure you have ~1GB of unused log space.
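
    If there isn't enough, a sketch of pre-growing the log ahead of time (the database name and logical log file name below are placeholders -- check sys.database_files for the real ones -- and the new SIZE has to be larger than the current file size):

    -- Placeholders: substitute your database name and the logical name of its log file
    -- (SELECT name FROM sys.database_files WHERE type_desc = 'LOG')
    ALTER DATABASE [YourDatabase]
        MODIFY FILE (NAME = N'YourDatabase_log', SIZE = 4096MB);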

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I have 15 GB allocated space and 98% free

  • jdbrown239 (2/6/2015)


    I have 15 GB allocated space and 98% free

    In the log file?

    Yeah, that should be plenty, so the statement should run as fast as it can.

    Are there triggers, replication or something else involved that might be slowing it down?
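
    A quick sketch for checking the first two, using the table name posted earlier:

    -- Triggers defined on the table
    SELECT name, is_disabled
    FROM sys.triggers
    WHERE parent_id = OBJECT_ID(N'dbo.Targus_201412_V7');

    -- Replication flags on the table
    SELECT is_replicated, is_merge_published
    FROM sys.tables
    WHERE object_id = OBJECT_ID(N'dbo.Targus_201412_V7');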

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • What bothers me here is the fact that this is a fairly large heap, which makes any kind of batching/partitioning effort very costly. Any chance that you can put a clustered index on the table? Further on this, what is the unique key?
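
    Just to illustrate what I mean (the key column here is purely a placeholder until we know what's actually unique):

    -- Placeholder key: building this on a 100M+ row heap needs plenty of
    -- sort/tempdb space and locks the table, so it has to be scheduled.
    CREATE CLUSTERED INDEX [CIX_Targus_201412_V7]
        ON [dbo].[Targus_201412_V7] ([PrimaryPhone] ASC);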

    😎

  • I will have to profile the data for uniqueness to see if a clustered index can be created.
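
    Something along these lines, I assume (using [PrimaryPhone] only as an example candidate key):

    -- Lists candidate-key values that occur more than once
    SELECT [PrimaryPhone], COUNT(*) AS DuplicateCount
    FROM [dbo].[Targus_201412_V7]
    GROUP BY [PrimaryPhone]
    HAVING COUNT(*) > 1;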

  • jdbrown239 (2/8/2015)


    I will have to profile the data for uniqueness to see if a clustered index can be created.

    Don't get confused between a PK and a clustered index; the clustered index doesn't need to be unique, it's just a way of sorting the table into a given order.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (2/9/2015)


    jdbrown239 (2/8/2015)


    I will have to profile the data for uniqueness to see if a clustered index can be created.

    Don't get confused between a PK and a clustered index; the clustered index doesn't need to be unique, it's just a way of sorting the table into a given order.

    Careful now... It's not "just a way of sorting the table into a given order".

    Behind the scenes, and because the clustered index key is carried in every non-clustered index, it's a really good idea if the clustered index follows the basic guidelines of being narrow, static, not null, ever-increasing, and unique. Without the unique part, a hidden column is added and, whenever duplicate clustered key values appear, a "uniquifier" value is stored in the index to tell those rows apart.
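
    Purely as an illustration of those guidelines (not a recommendation for this particular table), a narrow, ever-increasing, unique clustering key can be manufactured with an IDENTITY column:

    -- Illustration only: on a table this size, adding the column and
    -- building the index are both heavy, size-of-data operations.
    ALTER TABLE [dbo].[Targus_201412_V7]
        ADD [RowID] BIGINT IDENTITY(1,1) NOT NULL;

    CREATE UNIQUE CLUSTERED INDEX [CIX_Targus_RowID]
        ON [dbo].[Targus_201412_V7] ([RowID]);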

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
