Slow Update Processing - Badly Written Code ?

  • I'm at a new place, trying to find my way around. There's an import/update process that runs very slowly and I don't see an obvious cause.

    The process bring in records of people that are connected, similar to Facebook friends ie: Joe became friends with Mary on Jan 1. And Mary became friends with Joe on Feb 1. So there are 2 records initially, but I want to merge them into 1 record making them friends with each other and putting the Jan 1 date as a created date, and Feb 1 as Modified date.

    I have the merged records already populated in table FriendExport_NoDup, so just the update of FriendExport_NoDup from the duplicate data in FriendExport_Temp is the problem.

    Here is the code. I think it does the right thing, but takes a VERY long time to process. There is one query that updates the CreateDate, and a second that updates a status code and ModifiedDate. I'm sure it could be rolled into 1, but that;s how it is for now.

    I notice the ID numbers are NVARCHAR, would that hurt performance for joining ? I could change them to int.

    EDIT: I changed the 2 ID columns in the tables to INT, and it seems to run faster, but still much slower than I would expect.

    -- Table1 Original File containing duplicates

    CREATE TABLE dbo.FriendExport_Temp(

    Mem_ID nvarchar(50) NULL,

    ConnectionID nvarchar(50) NULL,

    Status nvarchar(50) NULL,

    DateUpdated datetime NULL

    ) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_ConnectionID ON dbo.FriendExport_Temp

    (ConnectionID ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_DateUpdated ON dbo.FriendExport_Temp

    (DateUpdated ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_MEM_ID ON dbo.FriendExport_Temp

    (Mem_ID ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    -- Table 2 Duplicates removed

    CREATE TABLE dbo.FriendExport_NoDup(

    Mem_ID nvarchar(50) NULL,

    ConnectionID nvarchar(50) NULL,

    Status nvarchar(50) NULL,

    DateUpdated datetime NULL,

    CreatedDateTime datetime NULL,

    ModifiedDateTime datetime NULL ) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_ConnectionID ON dbo.FriendExport_NoDup

    (ConnectionID ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_CreatedDate ON dbo.FriendExport_NoDup

    (CreatedDateTime ASC,

    Mem_ID ASC,

    ConnectionID ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    CREATE NONCLUSTERED INDEX IX_MEM_ID ON dbo.FriendExport_NoDup

    (Mem_ID ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

    truncate table FriendExport_Temp

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100671869,100667739,1,'2011-01-04 09:47:18.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100671909,100667739,3,'2011-01-04 04:41:21.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100672019,100667739,1,'2011-01-04 10:07:57.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100672409,100667739,1,'2011-01-03 15:14:25.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100673659,100667739,1,'2011-01-03 15:14:27.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100675249,100667739,1,'2011-01-03 15:14:29.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101654749,100667739,1,'2011-01-04 09:52:41.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101670839,100667739,1,'2011-01-04 10:14:47.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101678529,100667739,1,'2011-01-04 10:26:17.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101714309,100667739,1,'2011-01-04 11:15:56.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101783579,100667739,1,'2011-01-04 12:58:32.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101813709,100667739,1,'2011-01-04 13:48:52.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101830369,100667739,1,'2011-01-04 14:01:22.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100671869,1,'2011-01-04 09:47:18.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100671909,3,'2011-01-04 04:41:21.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100672019,1,'2011-01-04 10:07:57.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100672409,1,'2011-01-03 15:14:25.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100673659,1,'2011-01-03 15:14:27.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,100675249,1,'2011-01-03 15:14:29.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101654749,1,'2011-01-04 09:52:41.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101670839,1,'2011-01-04 10:14:47.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101678529,1,'2011-01-04 10:26:17.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101714309,1,'2011-01-04 11:15:56.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101783579,1,'2011-01-04 12:58:32.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101813709,1,'2011-01-04 13:48:52.000'

    insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739,101830369,1,'2011-01-04 14:01:22.000'

    select * from FriendExport_Temp

    truncate table FriendExport_NoDup

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100671869,100667739,1,'2011-01-04 09:47:18.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100671909,100667739,1, '2011-01-04 04:41:21.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100672019,100667739,1,'2011-01-04 10:07:57.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100672409,100667739,1,'2011-01-03 15:14:25.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100673659,100667739,1,'2011-01-03 15:14:27.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100675249,100667739,1,'2011-01-03 15:14:29.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101654749,100667739,1,'2011-01-04 09:52:41.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101670839,100667739,1,'2011-01-04 10:14:47.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101678529,100667739,1,'2011-01-04 10:26:17.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101714309,100667739,1,'2011-01-04 11:15:56.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101783579,100667739,1,'2011-01-04 12:58:32.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101813709,100667739,1,'2011-01-04 13:48:52.000',NULL,NULL

    insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101830369,100667739,1,'2011-01-04 14:01:22.000',NULL,NULL

    select * from FriendExport_NoDup

    /* Update 1 record in FriendExport_NoDup with information from 2 records in FriendExport_Temp

    (1) update "ModifiedDateTime" with the most recent "DateUpdated"

    (2) Update "STATUS" with the Status from the most recent "DateUpdated"

    (3) Update "CreatedDateTime" with the earliest "DateUpdated"

    Processing batches so I can track how fast it's updating

    */

    UpdateMore:

    WAITFOR DELAY '00:00:01'

    -- update modified data & status

    update top (500) FriendExport_NoDup

    set ModifiedDateTime = (select top 1 dateUpdated

    from FriendExport_Temp FE

    where (mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)

    or (mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)

    order by dateUpdated desc)

    , status = (select top 1 status from FriendExport_Temp FE

    where (mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)

    or (mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)

    order by dateUpdated desc)

    from FriendExport_NoDup FND

    join FriendExport_Temp TMP on (TMP.mem_id = FND.MEM_ID and TMP.ConnectionID = FND.ConnectionID)

    or (TMP.mem_id = FND.ConnectionID and TMP.ConnectionID = FND.MEM_ID)

    where ModifiedDateTime is null

    if @@rowcount > 0 goto UpdateMore

    ------

    -- update created date

    update top (500) FriendExport_NoDup

    set createdDateTime = (select top 1 dateUpdated --min(dateUpdated)

    from FriendExport_Temp FE

    where (FND.mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)

    or (FND.mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)

    order by FE.dateUpdated )

    from FriendExport_NoDup FND

    join FriendExport_Temp TMP on (TMP.mem_id = FND.MEM_ID and TMP.ConnectionID = FND.ConnectionID)

    or (TMP.mem_id = FND.ConnectionID and TMP.ConnectionID = FND.MEM_ID)

    where createdDateTime is null

  • homebrew i think this is equivilent to your first update query, but should theoretically perform better.

    compare the SELECT to teh UPDATE , and tell me if it looks like it woudl update the same number of rows you are expecting:

    --visualizing the matching rows

    SELECT FND.MEM_ID,FND.ConnectionID,FND.ModifiedDateTime,FND.[status],FE.RW,FE.MEM_ID,FE.ConnectionID,FE.ModifiedDateTime, FE.[status]

    FROM FriendExport_NoDup FND

    INNER JOIN (SELECT

    row_number() over (PARTITION BY mem_id ORDER BY mem_id,dateUpdated DESC) AS RW,

    MEM_ID,

    ConnectionID,

    dateUpdated,

    [status]

    FROM FriendExport_Temp) FE

    on (FE.mem_id = FND.MEM_ID and FE.ConnectionID = FND.ConnectionID)

    or (FE.mem_id = FND.ConnectionID and FE.ConnectionID = FND.MEM_ID)

    where FE.ModifiedDateTime is null

    AND FE.RW = 1

    --the update candidate

    update FND

    set FND.ModifiedDateTime = FE.dateUpdated,

    status = FE.status

    --SELECT FND.MEM_ID,FND.ConnectionID,FND.ModifiedDateTime,FND.[status],FE.RW,FE.MEM_ID,FE.ConnectionID,FE.ModifiedDateTime, FE.[status]

    FROM FriendExport_NoDup FND

    INNER JOIN (SELECT

    row_number() over (PARTITION BY mem_id ORDER BY mem_id,dateUpdated DESC) AS RW,

    MEM_ID,

    ConnectionID,

    dateUpdated,

    [status]

    FROM FriendExport_Temp) FE

    on (FE.mem_id = FND.MEM_ID and FE.ConnectionID = FND.ConnectionID)

    or (FE.mem_id = FND.ConnectionID and FE.ConnectionID = FND.MEM_ID)

    where FE.ModifiedDateTime is null

    AND FE.RW = 1

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks, I will give that a try.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply