Updating 566787 Records takes forever

  • Hi,

    I am writing a script to update bad data in a column of a table. The Table has 566787 rows and this script is taking forever to complete. It updated 1000 records in 5 mins which I think is too much.

    There are two triggers and 4 indexes on this table but for now I have deleted the indexes and disabled the triggers to speed up the process

    -- The script updates the FullPath column of the table according to the values

    -- in Title column and hirNodeParent Column

    DECLARE

    @hirNode int,

    @hirNodeParent int,

    @Title varchar(500),

    @Return varchar(500),

    @LasthirNode int,

    @Node int,

    @FullPath varchar(1000)

    Select hirNode into #hirNodes from hirNodes

    While exists (Select * From #hirNodes)

    Begin

    Set @hirNode = (Select Top 1 hirnode from #hirNodes)

    Set @Node = @hirNode

    Set @FullPath = (select FullPath from hirNodes where hirNode = @hirNode)

    SET @Return = ''

    WHILE @hirNode > 0

    BEGIN

    SELECT @hirNodeParent = hirNodeParent, @Title = Title

    FROM hirNodes

    WHERE hirNode = @hirNode

    SET @LasthirNode = @hirNode

    SET @Return = @Title + '\' + @Return

    SET @hirNode = @hirNodeParent

    END

    SET @Return = '\\' + @Return

    if(@FullPath <> @Return)

    begin

    Update hirNodes set FullPath = @Return where hirNode = @Node

    end

    Delete from #hirNodes where hirNode = @Node

    print(@Node)

    End

    Drop table #hirNodes

    --Table Script

    CREATE TABLE [dbo].[hirNodes](

    [hirNode] [int] IDENTITY(1,1) NOT NULL,

    [hirNodeParent] [int] NULL,

    [hirLevel] [int] NOT NULL,

    [Title] [varchar](500) NULL,

    [Brief] [varchar](16) NULL,

    [Description] [varchar](256) NULL,

    [Active] [bit] NOT NULL,

    [DisplayOrder] [varchar](50) NULL,

    [FullPath] [varchar](500) NULL,

    CONSTRAINT [PK_hirNodes_1] PRIMARY KEY CLUSTERED

    (

    [hirNode] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    ) ON [PRIMARY]

  • The problem is that this is RBAR as your only ever updating 1 row at a time instead of trying to update 566787 records in one set based operation, meaning that the triggers fire 566787 times instead of once etc etc.

    If you would provide some sample data prior to the update and what the expected result after the update should be along with the trigger and index definitions there might be a quicker set based operation to performing the task.

  • This looks like you're trying to build a hierachy list using a while loop.

    Can you provide some sample data for #HirNodes, as I think you're using a sledgehammer to crack nut, and a Recursive CTE may help but with out sample data its guesswork.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • What you are trying to do will obviously takes time since you have used loop statement which means it will consume more time.

    can you give some sample records at-least 5 records for this table.

    Thanks!

  • I just added the Sample data..

    Do you want the code for Triggers?

    The triggers only fire on Update of hirNodeParent column and Title column and I am only updating the full path column and anyway I have disabled the triggers for now.

    I am attaching the code for indexes but for now I have deleted them too

    /****** Object: Index [idx_hirNodes_fullPath] Script Date: 10/31/2012 19:39:16 ******/

    CREATE NONCLUSTERED INDEX [idx_hirNodes_fullPath] ON [dbo].[hirNodes]

    (

    [FullPath] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    GO

    /****** Object: Index [idx_hirNodes_hirLevel] Script Date: 10/31/2012 19:39:52 ******/

    CREATE NONCLUSTERED INDEX [idx_hirNodes_hirLevel] ON [dbo].[hirNodes]

    (

    [hirLevel] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    GO

    /****** Object: Index [idx_hirNodes_hirNodeParent] Script Date: 10/31/2012 19:41:47 ******/

    CREATE NONCLUSTERED INDEX [idx_hirNodes_hirNodeParent] ON [dbo].[hirNodes]

    (

    [hirNodeParent] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    GO

    /****** Object: Index [idx_hirNodes_Title] Script Date: 10/31/2012 19:47:55 ******/

    CREATE NONCLUSTERED INDEX [idx_hirNodes_Title] ON [dbo].[hirNodes]

    (

    [Title] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]

    GO

  • Definately looks like a job for a recursive CTE in order to build the Path then a simple update after, this is a simple test rig for a select, but the update isnt too bad.

    drop table #hirNode

    go

    Create Table #hirNode(

    hirNode Int

    ,hirNodeParent int

    ,Title varchar(100)

    )

    Insert into #hirNode

    values (1,NULL,'Start')

    ,(2,1,'Second')

    ,(3,2,'Third')

    ,(4,NULL,'Start2')

    ,(5,4,'Second2');

    WITH BuildPath(hirNode,hirNodeParent, NodeLevel, NodePath)

    AS (SELECT hirNode,

    hirNodeParent,

    1,

    CONVERT(varchar(255), '\\'+Title)

    FROM #hirNode

    WHERE hirNodeParent IS NULL --Start at the Top

    UNION ALL

    SELECT

    h.hirNode,

    h.hirNodeParent,

    NodeLevel + 1,

    CONVERT (varchar(255), RTRIM(NodePath) + '\' + Title)

    FROM #hirNode AS h

    JOIN BuildPath AS p ON h.hirNodeParent = p.hirNode

    )

    Select * from BuildPath

    The output for NodePath is

    \\Start

    \\Start\Second

    \\Start\Second\Third

    \\Start2

    \\Start2\Second2

    It might need ordering but its just an example, is that what you are looking for?

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason this looks right. But how to update FullPath column using Recursive CTE? I have to update all the rows of the table for FullPath column. Actually the right way would be to check if the FullPath column has correct value if not then update it.

  • this should work

    drop table #hirNode

    go

    Create Table #hirNode(

    hirNode Int

    ,hirNodeParent int

    ,Title varchar(100)

    )

    Insert into #hirNode

    values (1,NULL,'Start')

    ,(2,1,'Second')

    ,(3,2,'Third')

    ,(4,NULL,'Start2')

    ,(5,4,'Second2');

    WITH BuildPath(hirNode,hirNodeParent, NodeLevel, NodePath)

    AS (SELECT hirNode,

    hirNodeParent,

    1,

    CONVERT(varchar(255), '\\'+Title)

    FROM #hirNode

    WHERE hirNodeParent IS NULL --Start at the Top

    UNION ALL

    SELECT

    h.hirNode,

    h.hirNodeParent,

    NodeLevel + 1,

    CONVERT (varchar(255), RTRIM(NodePath) + '\' + Title)

    FROM #hirNode AS h

    JOIN BuildPath AS p ON h.hirNodeParent = p.hirNode

    )

    Update

    hr

    Set

    FullPath=bp.NodePath

    From

    hirNodes hr

    JOIN BuildPath bp on hr.hirNode=bp.hirNode

    but you should check it

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • This works like a charm.. updated 70k records in 24 sec.

    Thanks Jason 🙂

  • Glad to Help,

    You could add a where clause to check the NodePath with the FullPath column eg

    WHERE hr.FullPath!=NodePath

    Which should reduce the dataset, and save you having unnecessary updates.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I just found out that the live server is SQL 2000 and so I can't use Recursive CTE :crying:

    I have created this new script that would work on SQL Server 2000. This is way slower than recursive CTE. Please suggest any way of optimizing it.

    CREATE TABLE #tbl([hirNode]INT,

    [hirNodeParent]INT,

    [FullPath]varchar(1000))

    INSERT INTO #tbl

    SELECT hirNode, hirNodeParent, CONVERT(varchar(1000), '\\'+Title + '\')

    FROM hirNodes

    WHERE hirNodeParent IS NULL

    WHILE EXISTS(SELECT * FROM hirNodes AS t2 join #tbl t1

    on t2.hirNodeParent = t1.hirNode

    where t2.hirNode NOT IN (SELECT hirNode FROM #tbl))

    BEGIN

    INSERT INTO #tbl

    SELECT t2.hirNode,

    t2.hirNodeParent,

    CONVERT (varchar(255), RTRIM(t1.FullPath) + Title + '\')

    FROM hirNodes AS t2,

    #tbl t1

    WHERE t2.hirNodeParent = t1.hirNode

    AND t2.hirNode NOT IN (SELECT hirNode

    FROM #tbl)

    END

    SELECT hirNodes.hirNode, hirNodes.FullPath, #tbl.FullPath FROM hirNodes

    join #tbl on hirNodes.hirNode = #tbl.hirNode

    where hirNodes.FullPath <> #tbl.FullPath

    Drop Table #tbl

    Thanks,

    Kavita

  • Kavita,

    I know this is not a perfect solution, but it might work for you on SQL Server 2000. I am assuming your hierarchies are at most 7 levels deep. (If they are deeper, you can alter the script to accomodate for additional depth; though this would increase the processing time too)

    The idea is to build the list of nodes involved in each hierarchy and concatenate their titles upfront and then use this info to do an update.

    I ran this script on my laptop's sql server 2008 for around 540000 rows and it took around 40-45 seconds.

    Let me know how this works out for you.

    ----Build Hierarchy

    SELECT IDENTITY(INT, 1, 1) AS RowId, H1.hirNode As L1, H2.hirnode AS L2, H3.hirNode AS L3, H4.hirNode AS L4, H5.hirNode AS L5, H6.hirNode As L6, H7.hirNode AS L7,

    H1.Title AS LT1,

    H1.Title + '\\' + H2.Title AS LT2,

    H1.Title + '\\' + H2.Title + '\\' + H3.Title AS LT3,

    H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title As LT4,

    H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title AS LT5,

    H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title + '\\' + H6.Title AS LT6,

    H1.Title + '\\' + H2.Title + '\\' + H3.Title + '\\' + H4.Title + '\\' + H5.Title + '\\' + H6.Title + '\\' + H7.Title AS LT7

    INTO #a

    from dbo.HirNodes H1

    LEFT JOIN dbo.HirNodes H2 ON (H1.hirNode = H2.hirNodeParent)

    LEFT JOIN dbo.HirNodes H3 ON (H2.hirNode = H3.hirNodeParent)

    LEFT JOIN dbo.HirNodes H4 ON (H3.hirNode = H4.hirNodeParent)

    LEFT JOIN dbo.HirNodes H5 ON (H4.hirNode = H5.hirNodeParent)

    LEFT JOIN dbo.HirNodes H6 ON (H5.hirNode = H6.hirNodeParent)

    LEFT JOIN dbo.HirNodes H7 ON (H6.hirNode = H7.hirNodeParent)

    --SELECT * FROM #a

    ----From the above, use only those rows where the complete hierarchy is represented

    SELECT A1.*

    INTO #b

    FROM #a A1

    LEFT JOIN #a A2 ON (A1.L1 = A2.L2)

    WHERE A2.L1 IS NULL

    --SELECT * FROM #b

    SELECT DISTINCT A.Node, A.Title AS FullPath

    INTO #c

    FROM (

    SELECT B1.L1 AS Node, B1.LT1 AS Title FROM #b B1

    UNION

    SELECT B2.L2 AS Node, B2.LT2 AS Title FROM #b B1 INNER JOIN #b B2 ON (B1.L1 = B2.L1)

    UNION

    SELECT B3.L3 AS Node, B3.LT3 AS Title FROM #b B2 INNER JOIN #b B3 ON (B2.L2 = B3.L2)

    UNION

    SELECT B4.L4 AS Node, B4.LT4 AS Title FROM #b B3 INNER JOIN #b B4 ON (B3.L3 = B4.L3)

    UNION

    SELECT B5.L5 As Node, B5.LT5 AS Title FROM #b B4 INNER JOIN #b B5 ON (B4.L4 = B5.L4)

    UNION

    SELECT B6.L6 AS Node, B6.LT6 AS Title FROM #b B5 INNER JOIN #b B6 ON (B5.L5 = B6.L5)

    UNION

    SELECT B7.L7 AS Node, B7.LT7 AS Title FROM #b B6 INNER JOIN #b B7 ON (B6.L6 = B7.L6)

    ) AS A

    WHERE A.Node IS NOT NULL

    /*

    ---- Use this to get a complete list of hierarchy for each node

    SELECT A.RowId, A.R1, A.Node, A.Title AS FullPath, A.Lvl

    INTO #c

    FROM (

    SELECT B1.RowId, B1.L1 AS R1, B1.L1 AS Node, B1.LT1 AS Title, 1 As Lvl FROM #b B1

    UNION

    SELECT B1.RowId, B1.L1 AS R1, B2.L2 AS Node, B2.LT2 AS Title, 2 As Lvl FROM #b B1 INNER JOIN #b B2 ON (B1.L1 = B2.L1)

    UNION

    SELECT B2.RowId, B2.L1 AS R1, B3.L3 AS Node, B3.LT3 AS Title, 3 As Lvl FROM #b B2 INNER JOIN #b B3 ON (B2.L2 = B3.L2)

    UNION

    SELECT B3.RowId, B3.L1 AS R1, B4.L4 AS Node, B4.LT4 AS Title, 4 As Lvl FROM #b B3 INNER JOIN #b B4 ON (B3.L3 = B4.L3)

    UNION

    SELECT B4.RowId, B4.L1 AS R1, B5.L5 As Node, B5.LT5 AS Title, 5 As Lvl FROM #b B4 INNER JOIN #b B5 ON (B4.L4 = B5.L4)

    UNION

    SELECT B5.RowId, B5.L1 AS R1, B6.L6 AS Node, B6.LT6 AS Title, 6 As Lvl FROM #b B5 INNER JOIN #b B6 ON (B5.L5 = B6.L5)

    UNION

    SELECT B6.RowId, B6.L1 AS R1, B7.L7 AS Node, B7.LT7 AS Title, 7 As Lvl FROM #b B6 INNER JOIN #b B7 ON (B6.L6 = B7.L6)

    ) AS A

    WHERE A.Node IS NOT NULL

    ORDER BY RowId, R1, Lvl

    */

    --SELECT COUNT(*) FROM #c

    --SELECT COUNT(*) FROM dbo.HirNodes

    ----Use Hierarchy

    UPDATE H SET H.FullPath = C.FullPath

    --SELECT H.hirNode, H.hirNodeParent, H.hirLevel, H.Title, H.FullPath, C.FullPath

    FROM dbo.HirNodes H

    INNER JOIN #c C ON (H.hirNode = C.Node)

    DROP TABLE #a, #b

    DROP TABLE #c

  • Thanks Sam.

    It IS faster though its not returning the correct fullPath.. I am working on it.

    I'll let you know how it goes

  • Hi Kavita,

    Sorry the CTE didnt help on the client site due to having SQL 2000 installed, just file it away for future reference, and beat the client with a big stick until they upgrade to SQL2008 or later.:-D

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I apologize Sam. Its working perfectly fine.

    I ran it for 460K records and it took ~5mins which is really not bad.

    Thank You!

    Jason - I wish but it was a great learning experience. I am excited to use it in the future.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply