UPDATE query question

  • This query was given to me today and performs very poorly but I'm not sure why.

    UPDATE table1

    SET field1 = (RIGHT(('00000000' + convert(nvarchar(8), ((SELECT count(*) + 1 FROM table1 o WHERE field2 > o. field2 and field3 = 'xxx')))),8))

    WHERE field1 = 0

    If I remove the UPDATE and do a straight select it takes a couple of seconds on a table with 300,000 records. There is a clustered index on field2 and a non-clustered index on field3 with field2 as an included column if that helps.

    Thanks for any suggestions.

  • Since you are updating all the records where field1 = 0 to the same static value can you just calculate that value and put in a variable?

    declare @newVal char(8)

    select @newVal = RIGHT(('00000000' + convert(nvarchar(8), ((SELECT count(*) + 1 FROM table1 o WHERE field2 > o. field2 and field3 = 'xxx')))),8)

    --Then do a simple update like

    UPDATE table1

    SET field1 = @newVal

    WHERE field1 = 0

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Looks like a "classic" triangular join to me.

    It might be a better way to use ROW_NUMBER() instead of the COUNT(*) in a subquery approach.

    Would it be possible to post table defa nd some sample data along with your expected result so we have something to play with?

    @sean: I don't think it'll be a static value



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Did your select look like this:

    SELECT field1 = (RIGHT(('00000000' + convert(nvarchar(8), ((SELECT count(*) + 1 FROM table1 o WHERE t.field2 > o. field2 and t.field3 = 'xxx')))),8))

    FROM table1 t

    WHERE t.field1 = 0

    (the aliases may not be quite right)

    The statement has a triangular join in it:

    WHERE t.field2 > o. field2

    For each row in table1 t, n rows in table1 o are scanned where n is anything from 0 to row count -1 (ignoring the filter for now). It's half of a cross join.

    It looks like it's calculating a "row number", ordered by field2, of rows which match the filter.

    Can you confirm that this is the case? It's not always easy to tell with pseudocode. If it is the case, then you'll be pleased to know that there are far better ways to do this.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • LutzM (5/4/2011)


    Looks like a "classic" triangular join to me.

    It might be a better way to use ROW_NUMBER() instead of the COUNT(*) in a subquery approach.

    Would it be possible to post table defa nd some sample data along with your expected result so we have something to play with?

    @sean: I don't think it'll be a static value

    You are absolutely right...having missed the triangular join it looked the count(*) would return the same value for each row. 😉

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • ChrisM@home (5/4/2011)


    Did your select look like this:

    SELECT field1 = (RIGHT(('00000000' + convert(nvarchar(8), ((SELECT count(*) + 1 FROM table1 o WHERE t.field2 > o. field2 and t.field3 = 'xxx')))),8))

    FROM table1 t

    WHERE t.field1 = 0

    (the aliases may not be quite right)

    The statement has a triangular join in it:

    WHERE t.field2 > o. field2

    For each row in table1 t, n rows in table1 o are scanned where n is anything from 0 to row count -1 (ignoring the filter for now). It's half of a cross join.

    It looks like it's calculating a "row number", ordered by field2, of rows which match the filter.

    Can you confirm that this is the case? It's not always easy to tell with pseudocode. If it is the case, then you'll be pleased to know that there are far better ways to do this.

    Yes that is what I understood the problem to be from the analyst who gave it to me. I knew it had to be a common problem. Is there an easy solution?

  • raistlinx (5/4/2011)


    ...

    Yes that is what I understood the problem to be from the analyst who gave it to me. I knew it had to be a common problem. Is there an easy solution?

    Yes. See my previous post (hint: ROW_NUMBER). 😉



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (5/4/2011)


    raistlinx (5/4/2011)


    ...

    Yes that is what I understood the problem to be from the analyst who gave it to me. I knew it had to be a common problem. Is there an easy solution?

    Yes. See my previous post (hint: ROW_NUMBER). 😉

    First time using (seeing!) ROW_NUMBER so bear with me. I think I want something like this:

    WITH table1

    AS

    ( SELECT *, row_number() over(order by field2) as rnum from table1 WHERE field3 = 'xxx' )

    UPDATE table1

    SET field1 = rnum

    GO

    I'm not at work so I can't run it to verify until the morning. But is that the basic idea?

  • I'd like to ask for table def and sample data to verifiy if the query is exactly what you're looking for.

    But there's definitely one thing I'd change: replace the name of the cte to be something different than your original table name.

    I'd be confused with the double name. I'm not sure if SQL Server would be, too 😉

    Another thought would be not to use SELECT * but rather include only the columns needed.

    Other than that, it should work.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Unfortunately I don't have the table def with me at home, but the idea is to simply update field1 to be a new rownumber based on the count of rows.

    If no one has any other ideas I'll start with this tomorrow (just typed it up in notepad for now) and tinker from there.

    WITH CTE

    AS

    ( SELECT field1, row_number() over(order by field2) as rnum from table1 WHERE field3 = 'xxx' )

    UPDATE CTE

    SET field1 = right('00000000' + convert(nvarchar(8), rnum + 1), 8)

    WHERE field1 = 0

    GO

    I'm still not sure if I need to use the PARTITION clause or not? I saw another example of something similar like this:

    update table1 set field1 = (select row_number() over (partition by field3 order by

    field2) from table1 where field3 = 'xxx' AND field1 = 0)

    Does this look right as well?

  • I don't think your latest reply will return the same result since you'd only number the rows with filed1=0 instead of all related rows.

    Regarding PARTITION BY: it owuldn't make a difference in your case since there's nothing to partition by due to the field3 = 'xxx' filter.

    It would be different without the filter since you'd end up with separat numbered groups (all starting with 1) for each different value of field3.

    So, I'd vote for the first approach in your latest reply (but untested...):

    WITH CTE

    AS

    ( SELECT field1, row_number() over(order by field2) as rnum from table1 WHERE field3 = 'xxx' )

    UPDATE CTE

    SET field1 = right('00000000' + convert(nvarchar(8), rnum + 1), 8)

    WHERE field1 = 0

    GO



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Yes I thought about the filter on field3 after I posted that, I think you are right. I think I may have left out the field1=0 from my first reply but that is something that is desired. So I'll go with the following as suggested! Thanks for the help.

    WITH CTE

    AS

    ( SELECT field1, row_number() over(order by field2) as rnum from table1 WHERE field3 = 'xxx' )

    UPDATE CTE

    SET field1 = right('00000000' + convert(nvarchar(8), rnum + 1), 8)

    WHERE field1 = 0

    GO

  • Here is the table def for the table inquestion including indexes. The query is still chugging away on 300,000 rows after 45 mins. Any suggestions on how to speed this up?

    CREATE TABLE table1(

    field1 [nvarchar](8) NULL,

    field2 [nvarchar](8) NULL,

    field3 [nvarchar](3) NULL,

    field4 [nvarchar](3) NULL,

    field5 [nvarchar](12) NULL,

    field6 [nvarchar](5) NULL,

    field7 [nvarchar](8) NULL,

    field8 [int] IDENTITY(1,1) NOT NULL,

    field9 [datetime] NOT NULL,

    field10 [nchar](25) NULL

    )

    GO

    ALTER TABLE table1 ADD CONSTRAINT def1 DEFAULT (getdate()) FOR field9

    GO

    ALTER TABLE table1 ADD CONSTRAINT def2 DEFAULT (suser_sname()) FOR field10

    GO

    CREATE NONCLUSTERED INDEX ncindx1 ON table1 (field3) INCLUDE (field2, field1)

    WITH CTE

    AS

    ( SELECT field1, row_number() over(order by field2) as rnum from table1 WHERE field3 = 'xxx' )

    UPDATE CTE

    SET field1 = right('00000000' + convert(nvarchar(8), rnum + 1), 8)

    WHERE field1 = 0

    GO

  • Where is your clustered index?

    Also, you could test if it helps to move field2 from include to the actual index.

    Would you please attach the actual execution plan as .slplan file? There seems to be something wrong... 45min is totally unacceptable. It might be there's some locking/blocking going on. But let's start with the execution plan first.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Before I ran it this morning I put a clustered index on field2. However the client doesn't think it needs to do this now so this is on hold for now. I may come back to it on my own this weekend, if I do (and see the same results) I will post an execution plan then.

    Thanks for the help.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply