Long Running Update statement

  • Hi

    I am running an update against a table with over 100,000,000 rows which takes forever. Is there a better way to write a query like this?

    UPDATE [dbo].[Test_Table]
    SET [Column_34] = (
        CASE
            WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1
            ELSE 0
        END)

  • jdbrown239 (2/5/2015)


    Hi

    I am running an update against a table with over 100,000,000 rows which takes forever. Is there a better way to write a query like this?

    UPDATE [dbo].[Test_Table]
    SET [Column_34] = (
        CASE
            WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1
            ELSE 0
        END)

    Yes. Every system has a tipping point. 1M rows might take 6 seconds. As expected, 2M rows might take just twice that (12 seconds) and 3M just 3 times that (18 seconds). But 4 million, for no apparent reason (there are reasons... just too many to talk about here, usually having to do with memory) might take 2 hours.

    So, the first step is to break up such massive updates by only updating, say, 2 million rows at a time in an update loop; see the sketch below.
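
    For example, a minimal sketch of such a loop, using the table and column names from the posted query (the 2M batch size and the "skip rows that already have the right value" predicate are just assumptions to tune for your system):

    DECLARE @BatchSize INT = 2000000;
    DECLARE @Rows INT = 1;

    WHILE @Rows > 0
    BEGIN
        UPDATE TOP (@BatchSize) t
        SET [Column_34] = CASE WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1 ELSE 0 END
        FROM [dbo].[Test_Table] t
        WHERE [Column_34] IS NULL
           OR [Column_34] <> CASE WHEN [Column_2] > 25 AND [Column_2] <= 34 THEN 1 ELSE 0 END;

        SET @Rows = @@ROWCOUNT;

        CHECKPOINT; -- in SIMPLE recovery this lets log space be reused between batches; in FULL, take log backups instead
    END;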

    Another thing that can really slow down an update is if the columns you're updating have indexes on them.
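
    If that turns out to be the case here, one option (sketch only; the index name below is hypothetical) is to disable the affected nonclustered index for the duration of the update and rebuild it afterwards:

    -- Hypothetical index on the column being updated; the rebuild itself isn't free,
    -- so this only pays off when the index would otherwise be touched for every row.
    ALTER INDEX [IX_Column_34] ON [dbo].[Test_Table] DISABLE;

    -- ... run the batched update here ...

    ALTER INDEX [IX_Column_34] ON [dbo].[Test_Table] REBUILD;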

    Last but certainly not least, a very common mistake with "joined" updates is writing an "illegal" update where the object of the UPDATE isn't in the FROM clause. Post your code and let's have a look.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Further on Jeff's answer, could you post the DDL for the table and everything that comes with it (indexes, constraints, triggers etc.)?

    😎

  • Table DDL

    CREATE TABLE [dbo].[Targus_201412_V7](
        [PrimaryPhone] [varchar](10) NULL,
        [CensusBlock] [varchar](15) NULL,
        [City] [varchar](50) NULL,
        [State] [varchar](2) NULL,
        [ZIP_Code] [varchar](5) NULL,
        [ZIP_Plus4] [varchar](4) NULL,
        [Income] [varchar](5) NULL,
        [AvgHomeValue] [varchar](5) NULL,
        [MaritalStatusCode] [varchar](5) NULL,
        [AgeHOH] [varchar](5) NULL,
        [AgeMember2] [varchar](5) NULL,
        [NumberofAdults] [varchar](5) NULL,
        [NumberofChildren] [varchar](5) NULL,
        [PresenceofChildren] [varchar](5) NULL,
        [Age0to3] [varchar](5) NULL,
        [Age4to6] [varchar](5) NULL,
        [Age7to9] [varchar](5) NULL,
        [Age10to12] [varchar](5) NULL,
        [Age13to15] [varchar](5) NULL,
        [Age16to18] [varchar](5) NULL,
        [HispanicSurname] [varchar](5) NULL,
        [AsianSurname] [varchar](5) NULL,
        [OwnRent] [varchar](5) NULL,
        [SDU] [varchar](5) NULL,
        [Ported_Q314] [varchar](5) NULL,
        [Port_Dt_Q314] [varchar](10) NULL,
        [OCN_Q314] [varchar](4) NULL,
        [Name_Q314] [varchar](50) NULL,
        [TYPE_Q314] [nvarchar](30) NULL,
        [Ported_Q414] [varchar](5) NULL,
        [Port_Dt_Q414] [varchar](10) NULL,
        [OCN_Q414] [varchar](4) NULL,
        [Name_Q414] [varchar](50) NULL,
        [TYPE_Q414] [nvarchar](30) NULL,
        [RECODE_PHONE_TYPE_Q4] [char](60) NULL,
        [METRO] [float] NULL,
        [METRO_STATUS] [char](10) NULL,
        [URBAN] [int] NULL,
        [SUBURBAN] [int] NULL,
        [RURAL] [int] NULL,
        [SETTING_CHK] [int] NULL,
        [KIDS0_6] [numeric](18, 0) NULL,
        [KIDS7_12] [numeric](18, 0) NULL,
        [KIDS13_18] [numeric](18, 0) NULL,
        [CKIDS] [numeric](18, 0) NULL,
        [DIFF] [numeric](18, 0) NULL,
        [DIFF2] [numeric](18, 0) NULL,
        [ADULTS] [numeric](18, 0) NULL,
        [KIDS] [numeric](18, 0) NULL,
        [AGEHOH2] [numeric](18, 0) NULL,
        [INCOME0_20] [int] NULL,
        [INCOME20_35] [int] NULL,
        [INCOME35_50] [int] NULL,
        [INCOME50_100] [int] NULL,
        [INCOME100] [int] NULL,
        [INCOME_CHK] [int] NULL,
        [AGE18_25] [int] NULL,
        [AGE26_34] [int] NULL,
        [AGE35_44] [int] NULL,
        [AGE45_54] [int] NULL,
        [AGE55_64] [int] NULL,
        [AGEHI] [int] NULL,
        [AGE_CHK] [int] NULL
    ) ON [PRIMARY]
    GO

    SET ANSI_PADDING OFF
    GO

  • Any indexes, constraints, triggers?

    😎

  • Index DDL

    CREATE NONCLUSTERED INDEX [IX_Agehoh2] ON [dbo].[Targus_201412_V7]
    (
        [AGEHOH2] ASC
    )
    INCLUDE ([AgeHOH]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    GO

  • Good job, I'll look into this over the weekend

    😎

  • Thanks for your help.

  • That statement is going to have to scan the table regardless of what you do.

    The critical thing for performance is to pre-allocate (and thus pre-format) enough log space for the entire statement to run without having to add log space.

    Thus, we need to see how much free log space is available in that db. Run this command:

    DBCC SQLPERF ( LOGSPACE )

    and review the results for the relevant db.

    Since you're only updating one column, I'd hope the log space required would be ~1GB or less. So, before running the UPDATE, make sure you have ~1GB of unused log space.
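
    If there isn't enough, a sketch of pre-growing the log ahead of time (the database name and logical log file name below are placeholders -- check sys.database_files for the real ones -- and the new SIZE has to be larger than the current file size):

    -- Placeholders: substitute your database name and the logical name of its log file
    -- (SELECT name FROM sys.database_files WHERE type_desc = 'LOG')
    ALTER DATABASE [YourDatabase]
        MODIFY FILE (NAME = N'YourDatabase_log', SIZE = 4096MB);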

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I have 15 GB allocated space and 98% free

  • jdbrown239 (2/6/2015)


    I have 15 GB allocated space and 98% free

    In the log file?

    Yeah, that should be plenty, so the statement should run as fast as it can.

    Are there triggers, replication or something else involved that might be slowing it down?
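
    A quick sketch for checking the first two, using the table name posted earlier:

    -- Triggers defined on the table
    SELECT name, is_disabled
    FROM sys.triggers
    WHERE parent_id = OBJECT_ID(N'dbo.Targus_201412_V7');

    -- Replication flags on the table
    SELECT is_replicated, is_merge_published
    FROM sys.tables
    WHERE object_id = OBJECT_ID(N'dbo.Targus_201412_V7');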

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • What bothers me here is the fact that this is a fairly large heap, which makes any kind of batching/partitioning effort very costly. Any chance that you can put a clustered index on the table? Further on this, what is the unique key?
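
    Just to illustrate what I mean (the key column here is purely a placeholder until we know what's actually unique):

    -- Placeholder key: building this on a 100M+ row heap needs plenty of
    -- sort/tempdb space and locks the table, so it has to be scheduled.
    CREATE CLUSTERED INDEX [CIX_Targus_201412_V7]
        ON [dbo].[Targus_201412_V7] ([PrimaryPhone] ASC);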

    😎

  • I will have to profile the data for uniqueness to see if a clustered index can be created.
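
    Something along these lines, I assume (using [PrimaryPhone] only as an example candidate key):

    -- Lists candidate-key values that occur more than once
    SELECT [PrimaryPhone], COUNT(*) AS DuplicateCount
    FROM [dbo].[Targus_201412_V7]
    GROUP BY [PrimaryPhone]
    HAVING COUNT(*) > 1;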

  • jdbrown239 (2/8/2015)


    I will have to profile the data for uniqueness to see if a clustered index can be created.

    Don't get confused between a PK and a clustered index; the clustered index doesn't need to be unique, it's just a way of sorting the table into a given order.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (2/9/2015)


    jdbrown239 (2/8/2015)


    I will have to profile the data for uniqueness to see if a clustered index can be created.

    Don't get confused between a PK and a clustered index; the clustered index doesn't need to be unique, it's just a way of sorting the table into a given order.

    Careful now... It's not "just a way of sorting the table into a given order".

    Behind the scenes, and because the clustered index key is carried in every non-clustered index, it's a really good idea if the clustered index follows the basic guidelines of being narrow, static, not null, ever-increasing, and unique. Without the unique part, a hidden column is added and, whenever duplicate clustered key values appear, a "uniquifier" value is stored in the index to tell those rows apart.
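
    Purely as an illustration of those guidelines (not a recommendation for this particular table), a narrow, ever-increasing, unique clustering key can be manufactured with an IDENTITY column:

    -- Illustration only: on a table this size, adding the column and
    -- building the index are both heavy, size-of-data operations.
    ALTER TABLE [dbo].[Targus_201412_V7]
        ADD [RowID] BIGINT IDENTITY(1,1) NOT NULL;

    CREATE UNIQUE CLUSTERED INDEX [CIX_Targus_RowID]
        ON [dbo].[Targus_201412_V7] ([RowID]);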

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
