Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Slow self join Expand / Collapse
Author
Message
Posted Tuesday, May 6, 2014 10:21 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 1:19 AM
Points: 20, Visits: 101
I have a stored procedure which updates a table with the code below.
It is rather slow (table with 360,000 rows).
Any suggestions?

Thanks,
Julian


-- ADD average FTE Hrs for the Week based table with daily data
UPDATE DRPDATA
SET FTEHrsAVG =
(
SELECT AVG(B.FTEHrs)
FROM DRPDATA B
WHERE A.MWID = B.MWID
AND
DATEPART(ISO_WEEK,A.BEGINDATUM)= DATEPART(ISO_WEEK,B.BEGINDATUM)
AND
A.[Org eenheid code] = B.[Org eenheid code]
AND
B.FlexVast = 'VAST'
GROUP BY B.MWID,b.[Org eenheid code], DATEPART(ISO_WEEK,B.BEGINDATUM)
)
FROM DRPDATA A;


Post #1568073
Posted Tuesday, May 6, 2014 10:35 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 5:09 AM
Points: 12,901, Visits: 32,137
that looks like an accidental cross join to me.

it explicitly says UPDATE DRPDATA, but since the FROM has the table name as being aliased,and the unaliased table is not referenced anywhere else, there is an implied cross join like this:

UPDATE DRPDATA
SET ....
FROM DRPDATA A ,DRPDATA


the UPDATE should be UPDATE ALIASNAME explicitly.


	-- ADD average FTE Hrs for the Week based table with daily data
UPDATE A
SET A.FTEHrsAVG =
(
SELECT AVG(B.FTEHrs)
FROM DRPDATA B
WHERE A.MWID = B.MWID
AND
DATEPART(ISO_WEEK,A.BEGINDATUM)= DATEPART(ISO_WEEK,B.BEGINDATUM)
AND
A.[Org eenheid code] = B.[Org eenheid code]
AND
B.FlexVast = 'VAST'
GROUP BY B.MWID,b.[Org eenheid code], DATEPART(ISO_WEEK,B.BEGINDATUM)
)
FROM DRPDATA A;



Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #1568081
Posted Tuesday, May 6, 2014 11:10 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Yesterday @ 8:36 AM
Points: 337, Visits: 746
Some other things to consider:

1. Do you have indices on the join columns? (MWID, [Org Eenheid Code], BEGINDATUM)
2. If you have an index on BEGINDATUM, the expression

DATEPART(ISO_WEEK,A.BEGINDATUM)= DATEPART(ISO_WEEK,B.BEGINDATUM) 

will stop SQL server from using that index (Since it's a function call it's not a SARG). To get around that, try to reformulate it as

BEGINDATUM > <beginning of the week> AND BEGINDATUM < <beginning of following week>

Post #1568101
Posted Tuesday, May 6, 2014 11:22 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 5:06 AM
Points: 2,002, Visits: 5,464
In this case I would consider using a temporary table
Post #1568111
Posted Tuesday, May 6, 2014 11:30 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 1:19 AM
Points: 20, Visits: 101
Hello Lowell,

Thanks for your response.
For some reason the original code is 15 seconds faster (2.25 min iso 2.5 min)....

Cheers,
Julian
Post #1568115
Posted Tuesday, May 6, 2014 11:32 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 1:19 AM
Points: 20, Visits: 101
The code is part of a stored procedure which bulk inserts data from a csv file, using a Staging table.
Thanks for the hint. Job for later.
Cheers,
Julian

gbritton1 (5/6/2014)
Some other things to consider:

1. Do you have indices on the join columns? (MWID, [Org Eenheid Code], BEGINDATUM)
2. If you have an index on BEGINDATUM, the expression

DATEPART(ISO_WEEK,A.BEGINDATUM)= DATEPART(ISO_WEEK,B.BEGINDATUM) 

will stop SQL server from using that index (Since it's a function call it's not a SARG). To get around that, try to reformulate it as

BEGINDATUM > <beginning of the week> AND BEGINDATUM < <beginning of following week>

Post #1568116
Posted Tuesday, May 6, 2014 11:47 AM This worked for the OP Answer marked as solution
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 2:47 AM
Points: 1,031, Visits: 6,730
Without sample data to test against...

Try this.
;WITH Updater AS (
SELECT
FTEHrsAVG,
NewFTEHrsAVG = AVG(CASE WHEN FlexVast = 'VAST' THEN FTEHrs ELSE NULL END)
OVER(PARTITION BY MWID, [Org eenheid code], DATEPART(ISO_WEEK,A.BEGINDATUM))
FROM DRPDATA o
WHERE EXISTS (
SELECT 1 FROM DRPDATA i
WHERE i.MWID = o.MWID
AND DATEPART(ISO_WEEK,i.BEGINDATUM)= DATEPART(ISO_WEEK,o.BEGINDATUM)
AND i.[Org eenheid code] = o.[Org eenheid code]
AND i.FlexVast = 'VAST')
)
UPDATE Updater SET FTEHrsAVG = NewFTEHrsAVG

You might not need the EXISTS check, it depends on what you want to do with unaffected partitions.
If the number of affected rows is relatively high and unaffected partitions should have FTEHrsAVG set to NULL, then try without the EXISTS check - it should be very quick.



Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #1568120
Posted Tuesday, May 6, 2014 2:04 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, July 22, 2014 10:43 PM
Points: 39, Visits: 388
ChrisM@home (5/6/2014)
Without sample data to test against...

Try this.
;WITH Updater AS (
SELECT
FTEHrsAVG,
NewFTEHrsAVG = AVG(CASE WHEN FlexVast = 'VAST' THEN FTEHrs ELSE NULL END)
OVER(PARTITION BY MWID, [Org eenheid code], DATEPART(ISO_WEEK,A.BEGINDATUM))
FROM DRPDATA o
WHERE EXISTS (
SELECT 1 FROM DRPDATA i
WHERE i.MWID = o.MWID
AND DATEPART(ISO_WEEK,i.BEGINDATUM)= DATEPART(ISO_WEEK,o.BEGINDATUM)
AND i.[Org eenheid code] = o.[Org eenheid code]
AND i.FlexVast = 'VAST')
)
UPDATE Updater SET FTEHrsAVG = NewFTEHrsAVG

You might not need the EXISTS check, it depends on what you want to do with unaffected partitions.
If the number of affected rows is relatively high and unaffected partitions should have FTEHrsAVG set to NULL, then try without the EXISTS check - it should be very quick.


That is amazing. Would you perhaps have a reference to other examples of CTE usage like these?
Post #1568177
Posted Wednesday, May 7, 2014 2:21 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 4:44 AM
Points: 6,750, Visits: 13,893
Bouke Bruinsma (5/6/2014)
ChrisM@home (5/6/2014)
Without sample data to test against...

Try this.
;WITH Updater AS (
SELECT
FTEHrsAVG,
NewFTEHrsAVG = AVG(CASE WHEN FlexVast = 'VAST' THEN FTEHrs ELSE NULL END)
OVER(PARTITION BY MWID, [Org eenheid code], DATEPART(ISO_WEEK,A.BEGINDATUM))
FROM DRPDATA o
WHERE EXISTS (
SELECT 1 FROM DRPDATA i
WHERE i.MWID = o.MWID
AND DATEPART(ISO_WEEK,i.BEGINDATUM)= DATEPART(ISO_WEEK,o.BEGINDATUM)
AND i.[Org eenheid code] = o.[Org eenheid code]
AND i.FlexVast = 'VAST')
)
UPDATE Updater SET FTEHrsAVG = NewFTEHrsAVG

You might not need the EXISTS check, it depends on what you want to do with unaffected partitions.
If the number of affected rows is relatively high and unaffected partitions should have FTEHrsAVG set to NULL, then try without the EXISTS check - it should be very quick.


That is amazing. Would you perhaps have a reference to other examples of CTE usage like these?


Thanks. That's why I lurk here - to learn stuff like this. It's what I'd recommend to you too. When something interesting or new comes up, copy it and pull it apart locally to see how it works - and if you can improve on it.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1568324
Posted Wednesday, May 7, 2014 1:42 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, July 22, 2014 10:43 PM
Points: 39, Visits: 388
I totally agree. I pick up something useful everytime I browse around the forums. And I could give quite a bit of examples of practical usage in my work of stuff I learned here.
So yeah, i'll be lurking the forums.
Post #1568679
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse