Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Slow Update Processing - Badly Written Code ? Expand / Collapse
Author
Message
Posted Tuesday, February 19, 2013 7:01 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 2:48 PM
Points: 2,832, Visits: 8,507
I'm at a new place, trying to find my way around. There's an import/update process that runs very slowly and I don't see an obvious cause.

The process bring in records of people that are connected, similar to Facebook friends ie: Joe became friends with Mary on Jan 1. And Mary became friends with Joe on Feb 1. So there are 2 records initially, but I want to merge them into 1 record making them friends with each other and putting the Jan 1 date as a created date, and Feb 1 as Modified date.

I have the merged records already populated in table FriendExport_NoDup, so just the update of FriendExport_NoDup from the duplicate data in FriendExport_Temp is the problem.

Here is the code. I think it does the right thing, but takes a VERY long time to process. There is one query that updates the CreateDate, and a second that updates a status code and ModifiedDate. I'm sure it could be rolled into 1, but that;s how it is for now.

I notice the ID numbers are NVARCHAR, would that hurt performance for joining ? I could change them to int.
EDIT: I changed the 2 ID columns in the tables to INT, and it seems to run faster, but still much slower than I would expect.


-- Table1 Original File containing duplicates

CREATE TABLE dbo.FriendExport_Temp(
Mem_ID nvarchar(50) NULL,
ConnectionID nvarchar(50) NULL,
Status nvarchar(50) NULL,
DateUpdated datetime NULL
) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_ConnectionID ON dbo.FriendExport_Temp
( ConnectionID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_DateUpdated ON dbo.FriendExport_Temp
( DateUpdated ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_MEM_ID ON dbo.FriendExport_Temp
( Mem_ID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

-- Table 2 Duplicates removed
CREATE TABLE dbo.FriendExport_NoDup(
Mem_ID nvarchar(50) NULL,
ConnectionID nvarchar(50) NULL,
Status nvarchar(50) NULL,
DateUpdated datetime NULL,
CreatedDateTime datetime NULL,
ModifiedDateTime datetime NULL ) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_ConnectionID ON dbo.FriendExport_NoDup
( ConnectionID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_CreatedDate ON dbo.FriendExport_NoDup
( CreatedDateTime ASC,
Mem_ID ASC,
ConnectionID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY

CREATE NONCLUSTERED INDEX IX_MEM_ID ON dbo.FriendExport_NoDup
( Mem_ID ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON PRIMARY



truncate table FriendExport_Temp

insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100671869, 100667739, 1, '2011-01-04 09:47:18.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100671909, 100667739, 3, '2011-01-04 04:41:21.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100672019, 100667739, 1, '2011-01-04 10:07:57.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100672409, 100667739, 1, '2011-01-03 15:14:25.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100673659, 100667739, 1, '2011-01-03 15:14:27.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100675249, 100667739, 1, '2011-01-03 15:14:29.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101654749, 100667739, 1, '2011-01-04 09:52:41.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101670839, 100667739, 1, '2011-01-04 10:14:47.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101678529, 100667739, 1, '2011-01-04 10:26:17.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101714309, 100667739, 1, '2011-01-04 11:15:56.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101783579, 100667739, 1, '2011-01-04 12:58:32.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101813709, 100667739, 1, '2011-01-04 13:48:52.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 101830369, 100667739, 1, '2011-01-04 14:01:22.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100671869, 1, '2011-01-04 09:47:18.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100671909, 3, '2011-01-04 04:41:21.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100672019, 1, '2011-01-04 10:07:57.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100672409, 1, '2011-01-03 15:14:25.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100673659, 1, '2011-01-03 15:14:27.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 100675249, 1, '2011-01-03 15:14:29.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101654749, 1, '2011-01-04 09:52:41.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101670839, 1, '2011-01-04 10:14:47.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101678529, 1, '2011-01-04 10:26:17.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101714309, 1, '2011-01-04 11:15:56.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101783579, 1, '2011-01-04 12:58:32.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101813709, 1, '2011-01-04 13:48:52.000'
insert into FriendExport_Temp (Mem_ID,ConnectionID,Status,DateUpdated) select 100667739, 101830369, 1, '2011-01-04 14:01:22.000'

select * from FriendExport_Temp


truncate table FriendExport_NoDup

insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100671869, 100667739, 1 ,'2011-01-04 09:47:18.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100671909, 100667739, 1, '2011-01-04 04:41:21.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100672019, 100667739, 1, '2011-01-04 10:07:57.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100672409, 100667739, 1, '2011-01-03 15:14:25.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100673659, 100667739, 1, '2011-01-03 15:14:27.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 100675249, 100667739, 1, '2011-01-03 15:14:29.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101654749, 100667739, 1, '2011-01-04 09:52:41.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101670839, 100667739, 1, '2011-01-04 10:14:47.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101678529, 100667739, 1, '2011-01-04 10:26:17.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101714309, 100667739, 1, '2011-01-04 11:15:56.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101783579, 100667739, 1, '2011-01-04 12:58:32.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101813709, 100667739, 1, '2011-01-04 13:48:52.000', NULL, NULL
insert into FriendExport_NoDup (Mem_ID,ConnectionID,Status,DateUpdated, CreatedDateTime, ModifiedDateTime) select 101830369, 100667739, 1, '2011-01-04 14:01:22.000', NULL, NULL

select * from FriendExport_NoDup


/* Update 1 record in FriendExport_NoDup with information from 2 records in FriendExport_Temp
(1) update "ModifiedDateTime" with the most recent "DateUpdated"
(2) Update "STATUS" with the Status from the most recent "DateUpdated"
(3) Update "CreatedDateTime" with the earliest "DateUpdated"

Processing batches so I can track how fast it's updating
*/
UpdateMore:
WAITFOR DELAY '00:00:01'
-- update modified data & status
update top (500) FriendExport_NoDup
set ModifiedDateTime = (select top 1 dateUpdated
from FriendExport_Temp FE
where (mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)
or (mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)
order by dateUpdated desc)
, status = (select top 1 status from FriendExport_Temp FE
where (mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)
or (mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)
order by dateUpdated desc)
from FriendExport_NoDup FND
join FriendExport_Temp TMP on (TMP.mem_id = FND.MEM_ID and TMP.ConnectionID = FND.ConnectionID)
or (TMP.mem_id = FND.ConnectionID and TMP.ConnectionID = FND.MEM_ID)
where ModifiedDateTime is null
if @@rowcount > 0 goto UpdateMore

------
-- update created date
update top (500) FriendExport_NoDup
set createdDateTime = (select top 1 dateUpdated --min(dateUpdated)
from FriendExport_Temp FE
where (FND.mem_id = FE.MEM_ID and FND.ConnectionID = FE.ConnectionID)
or (FND.mem_id = FE.ConnectionID and FE.MEM_ID = FND.ConnectionID)
order by FE.dateUpdated )
from FriendExport_NoDup FND
join FriendExport_Temp TMP on (TMP.mem_id = FND.MEM_ID and TMP.ConnectionID = FND.ConnectionID)
or (TMP.mem_id = FND.ConnectionID and TMP.ConnectionID = FND.MEM_ID)
where createdDateTime is null




Post #1421632
Posted Tuesday, February 19, 2013 8:59 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 1:11 PM
Points: 12,905, Visits: 32,161
homebrew i think this is equivilent to your first update query, but should theoretically perform better.
compare the SELECT to teh UPDATE , and tell me if it looks like it woudl update the same number of rows you are expecting:
--visualizing the matching rows
SELECT FND.MEM_ID,FND.ConnectionID,FND.ModifiedDateTime,FND.[status],FE.RW,FE.MEM_ID,FE.ConnectionID,FE.ModifiedDateTime, FE.[status]
FROM FriendExport_NoDup FND
INNER JOIN (SELECT
row_number() over (PARTITION BY mem_id ORDER BY mem_id,dateUpdated DESC) AS RW,
MEM_ID,
ConnectionID,
dateUpdated,
[status]
FROM FriendExport_Temp) FE
on (FE.mem_id = FND.MEM_ID and FE.ConnectionID = FND.ConnectionID)
or (FE.mem_id = FND.ConnectionID and FE.ConnectionID = FND.MEM_ID)
where FE.ModifiedDateTime is null
AND FE.RW = 1

--the update candidate
update FND
set FND.ModifiedDateTime = FE.dateUpdated,
status = FE.status
--SELECT FND.MEM_ID,FND.ConnectionID,FND.ModifiedDateTime,FND.[status],FE.RW,FE.MEM_ID,FE.ConnectionID,FE.ModifiedDateTime, FE.[status]
FROM FriendExport_NoDup FND
INNER JOIN (SELECT
row_number() over (PARTITION BY mem_id ORDER BY mem_id,dateUpdated DESC) AS RW,
MEM_ID,
ConnectionID,
dateUpdated,
[status]
FROM FriendExport_Temp) FE
on (FE.mem_id = FND.MEM_ID and FE.ConnectionID = FND.ConnectionID)
or (FE.mem_id = FND.ConnectionID and FE.ConnectionID = FND.MEM_ID)
where FE.ModifiedDateTime is null
AND FE.RW = 1



Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1421696
Posted Thursday, February 21, 2013 10:28 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 2:48 PM
Points: 2,832, Visits: 8,507
Thanks, I will give that a try.


Post #1422710
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse