May 5, 2009 at 2:06 pm
I have an application with known X and Y points, I'd like to run a function that would calculate m & b in y = mx +b then determine and return a desired Y (for a given X).
Is there perhaps a source for statistical functions in SQL that I could use for this and other stats calculations?
I basically worked this up by hand but it's not elegant... I'd also like r^2... and other things like 3rd order polynomials.
Perhaps a Freeware or commercial UDF set?? :w00t:
Thanks!
-Brian
May 5, 2009 at 2:09 pm
You're almost certainly going to be better off looking for .NET versions, and plugging them in as CLR functions.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
May 5, 2009 at 2:47 pm
I wrote a UDF to do Least Squares Linear Regression back in 2001 but the performance was so bad compared to simple SQL that I threw it away. The run time difference for SQL was 1/20th of a UDF.
Try the below SQL replacing these strings:
"dbo.T1" with the real table name.]
"X1" with the X column name
"Y1" with the Y column name
SELECTSlopeCoefficient
,(Y1_Mean - ( SlopeCoefficient * X1_Mean ) ) AS SlopeIntercept
FROM(
SELECT X1_Mean
,Y1_Mean
,( ( PopCnt * SUM(X1_Y1_Product) ) - ( SUM(X1) * SUM(X1) ) )
/ ( ( PopCnt * SUM(X1_DeviationSquared) ) - POWER(SUM(X1),2 ) )
as SlopeCoefficient
FROM(
SELECTT1.X1
,T1.Y1
,T1_Base.PopCnt
,T1_Base.X1_Mean
,T1_Base.Y1_Mean
,T1.X1 - X1_Mean AS X1_Deviation
,T1.Y1 - Y1_Mean AS Y1_Deviation
,(T1.X1 - X1_Mean ) * ( T1.Y1 - Y1_Mean ) AS X1_Y1_Product
,POWER( (T1.X1 - X1_Mean ) , 2 ) AS X1_DeviationSquared
,POWER( ( T1.Y1 - Y1_Mean ) , 2 ) AS Y1_DeviationSquared
FROM(
select AVG( CAST( T1.X1 as float ) ) as X1_Mean
,AVG( CAST( T1.Y1 as float ) ) as Y1_Mean
,COUNT( * ) as PopCnt
fromdbo.T1
) as T1_Base
CROSS JOIN dbo.T1
) as T1_D
GROUP BY X1_Mean
,Y1_Mean
)as t1_S
SQL = Scarcely Qualifies as a Language
May 5, 2009 at 2:55 pm
What would you like the UDF interface to look like?
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
May 5, 2009 at 3:29 pm
Thanks. The CLR approach sounds like the best option.
May 5, 2009 at 7:48 pm
Yes, I think that CLR is probably your best solution, but not because it is necessarily faster for any direct solution (I frankly doubt that it is), but because it is practically impossible to efficiently generalize any solution for this in T-SQL.
For instance, probably the most desirable form for a general least-squares solution would be as an aggregate function, but we cannot write custom aggregate functions in T-SQL, so the only way to present such a solution is through CLR.
[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]
Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply