Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Speeding up updates that use STDEV (standard deviation) Expand / Collapse
Author
Message
Posted Monday, March 2, 2009 1:39 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, March 10, 2014 7:19 AM
Points: 83, Visits: 251
I am adding a column to a table in my database that tracks the rolling standard deviation over a specified period of time.

So far, I have been able to implement this, but it is really slow! I'm hoping someone might have a hint as to how I might be able to speed this procedure up a bit...

Or maybe I'm heading in the complete wrong direction and SQL is not the right tool to be using for this. I know it's not meant to be a statistics workhorse, but I didn't think STDEV was really that intense of a statistical computation...

Here's information on the tables and the code I'm using so far - this is for a rolling stdev calculation over the 30 days prior.

At the start of the procedure, myTable contains rows like:
IDNumber INT
Date DATETIME
OHdiff FLOAT
OLdiff FLOAT
StDev_OH_30Day FLOAT
StDev_OL_30Day FLOAT

The StDev columns are NULL at the beginning, these are what I will be filling in with the procedure below:
NOTE: @lastUpdate, @updateThrough, @minIDNumber and @maxIDNumber are set by another part of my query, and are just making sure that I'm not duplicating work that's already been done...

	UPDATE a
SET StDev_OH_30Day=(SELECT STDEV(b.OHdiff)
FROM myTable b
WHERE b.IDNumber= a.IDNumber
AND b.MarketDate < a.MarketDate
AND b.MarketDate > a.MarketDate-31 )
, StDev_OL_30Day=(SELECT STDEV(b.OLdiff)
FROM myTable b
WHERE b.IDNumber = a.IDNumber
AND b.MarketDate < a.MarketDate
AND b.MarketDate > a.MarketDate-31 )
FROM myTable a
WHERE
a.MarketDate BETWEEN @lastUpdate+.1 AND @updateThrough
AND a.IDNumber BETWEEN @minIDNumber AND @maxIDNumber

It takes about 10 minutes to run this query on a little less than 90K rows. Increasing the time period for all the stdev calcs from 30 days to 90 days increases the processing time to 15 minutes.

This is meant to be part of a sequence of other updates, so the snappier I can make it, the better =)

Any ideas are always appreciated - Thanks!
Post #666871
Posted Monday, March 2, 2009 2:25 PM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, May 29, 2014 9:29 AM
Points: 214, Visits: 647
Might be a cool time to try something built in the SQL CLR to see if you can get any perf gains.




Lee Everest

Post #666910
Posted Monday, March 2, 2009 3:46 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, March 10, 2014 7:19 AM
Points: 83, Visits: 251
Sorry, but I'm not sure I really understand what you mean. I'm still really new to SQL.

Do you mean defining a user-defined function with C or some other language like in the examples on this page?http://msdn.microsoft.com/en-us/library/w2kae45k(VS.80).aspx
Post #666955
Posted Monday, March 2, 2009 4:34 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 9:20 AM
Points: 12,916, Visits: 32,080
can you show us the execution plan when you run the update? that will tell us exactly what the update is doing, and allow us to offer suggestions like indexes on the date column, etc that might speed up the update

Lowell

--There is no spoon, and there's no default ORDER BY in sql server either.
Actually, Common Sense is so rare, it should be considered a Superpower. --my son
Post #666967
Posted Monday, March 2, 2009 4:39 PM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Thursday, May 29, 2014 9:29 AM
Points: 214, Visits: 647
Exactly. And it doesn't need to be just limited to one clr function...just an idea.




Lee Everest

Post #666968
Posted Tuesday, March 3, 2009 9:17 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, March 10, 2014 7:19 AM
Points: 83, Visits: 251
I would love suggestions on how to improve my indexing!

Right now I have 3 indexes on my table:
1) A clustered PK index on a RowID column
2) A non-clustered index on Date, then IDNumber which includes the StDev cols (and a few other cols not involved in this update)
3) A non-clustered index on IDNumber then Date which includes the StDev cols (and a few other cols not involved in this update)

Was that what you were looking for?

Thanks!

Post #667383
Posted Tuesday, March 3, 2009 3:55 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 10:06 AM
Points: 181, Visits: 717
I'm not sure why you have two separate subqueries to calculate the two standard deviations. It looks to me like it could be one.

I would also change the whole thing to a CTE. It should be much faster that way.
Post #667725
Posted Wednesday, March 4, 2009 8:42 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, March 10, 2014 7:19 AM
Points: 83, Visits: 251
Sorry, but I'm not sure I follow... I'm learning a lot just from trying to understand what everyone here is suggesting.

Would you mind explaining a bit further how I might be able to put the two separate subqueries into one as part of an update statement?

Thanks!
Post #668327
Posted Wednesday, March 4, 2009 9:39 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Monday, March 10, 2014 7:19 AM
Points: 83, Visits: 251
In trying to combine the two subqueries into one, I got this far, but have encountered the error messages:
The multi-part identifier "a.IDNumber" could not be bound.
The multi-part identifier "a.MarketDate" could not be bound.

	UPDATE a
SET StDev_OH_30Day=T.StDev_OHdiff
, StDev_OL_30Day=T.StDev_OLdiff
FROM myTable,
(SELECT STDEV(b.OHdiff) AS StDev_OHdiff
, STDEV(b.OLdiff) AS StDev_OLdiff
FROM myTable b
WHERE b.IDNumber = a.IDNumber
AND b.MarketDate < a.MarketDate
AND b.MarketDate > a.MarketDate-31 ) AS T
WHERE
a.MarketDate BETWEEN @lastUpdate+.1 AND @updateThrough
AND a.IDNumber BETWEEN @minIDNumber AND @maxIDNumber



If anyone's got any ideas or suggestions, I would really appreciate it.

Thanks!
Post #668422
Posted Wednesday, March 4, 2009 11:43 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Yesterday @ 10:06 AM
Points: 181, Visits: 717
Here's a sample of it as a CTE - note the ; before it. You will need that if this is part of a procedure that does other things.
;
WITH get_stddev AS
( SELECT a.idnumber ,
a.marketdate ,
STDEV(b.OHdiff) stddev_oh ,
STDEV(b.OLdiff) stddev_ol

FROM myTable b
JOIN myTable a
ON
(
b.IDNumber = a.IDNumber
AND b.MarketDate < a.MarketDate
AND b.MarketDate > a.MarketDate-31
)
WHERE a.MarketDate BETWEEN @lastUpdate + .1 AND @updateThrough
AND a.IDNumber BETWEEN @minIDNumber AND @maxIDNumber
GROUP BY a.idnumber,
a.marketdate
) ----- end cte
UPDATE c
SET StDev_OH_30Day = stddev_oh ,
StDev_OL_30Day = stddev_ol
FROM mytable c
JOIN get_stddev gs
ON
(
gs.IDNumber = c.IDNumber
AND c.marketdate = gs.marketdate )

I urge you to test this thoroughly. I wasn't completely sure of your data set-up. It's possible this needs some changes for your tables.
Post #668549
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse