January 3, 2016 at 6:07 am
I have this table type
CREATE TYPE [dbo].[ty_custom] AS TABLE
(
cl_buffer INT NOT NULL,
cl_series INT NOT NULL
)
Which is an input in this function
CREATE FUNCTION [dbo].[fn_custom] (@tb_rho ty_custom READONLY)
RETURNS DECIMAL(18,8)
WITH EXECUTE AS CALLER
AS
BEGIN
-- the function code executes okay
END
that I would like to use like this
DECLARE @tb_custom ty_custom
INSERT INTO @tb_custom (cl_series,cl_buffer) VALUES(0,2)
INSERT INTO @tb_custom (cl_series,cl_buffer) VALUES(3,7)
INSERT INTO @tb_custom (cl_series,cl_buffer) VALUES(8,1)
SELECT dbo.fn_custom(@tb_custom) OVER(ORDER BY cl_buffer DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM @tb_custom
in the same way that the inbuilt functions SUM() and AVG() are used.
How should I approach this?
January 3, 2016 at 7:23 am
This is not possible; the OVER clause cannot be used with user-defined functions.
😎
January 3, 2016 at 8:01 am
Scalar UDFs are HORRIBLY HORRIBLY BAD!!! Take the logic out and either inline it in your SELECT or make it an INLINE Table Valued Function and go with that. I have had situations at clients where I had to use THREE temp tables (NOT TABLE VARIABLES!!) and still got a FOUR ORDERS OF MAGNITUDE performance improvement overall.
I wrote a chapter for the SQL Server MVP Deep Dives 2 book entitled "Death by UDF". It is very appropriately named (and the best chapter in the book, if I do say so myself)! 😀 I also have a SQL Saturday session of the same name. There are umpteen ways they screw you.
Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service
January 3, 2016 at 8:26 am
Eirikur Eiriksson (1/3/2016)
This is not possible; the OVER clause cannot be used with user-defined functions. 😎
Okay thanks, no wonder there is nothing online about this
January 3, 2016 at 8:28 am
TheSQLGuru (1/3/2016)
Scalar UDFs are HORRIBLY HORRIBLY BAD!!! Take the logic out and either inline it in your SELECT or make it an INLINE Table Valued Function and go with that. I have had situations at clients where I had to use THREE temp tables (NOT TABLE VARIABLES!!) and still got a FOUR ORDERS OF MAGNITUDE performance improvement overall. I wrote a chapter for the SQL Server MVP Deep Dives 2 book entitled "Death by UDF". It is very appropriately named (and the best chapter in the book, if I do say so myself)! 😀 I also have a SQL Saturday session of the same name. There are umpteen ways they screw you.
Okay thanks. Duly noted. 😄
January 3, 2016 at 1:22 pm
As a more generic response, if you want to use your own function in T-SQL the same way that aggregate functions like SUM are used, then you have no other option but to write a CLR User-defined Aggregate function.
I don't think you can use a table-valued data type as input for these, but perhaps you don't have to - I have a feeling that you had introduced that data type in an attempt to solve your actual problem.
I have never looked at how much windowing functionality is supported with CLR user-defined aggregates. There may be limitations, but you can always give it a shot. You obviously won't get the specific performance tricks that SQL Server can sometimes use for native aggregates, but if it works at all I would expect at least decent performance, especially for the window you mention (ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW).
January 5, 2016 at 4:58 am
It is a pity SQL Server has no inbuilt aggregate function for correlation. 😉 I have to build one from scratch.
My function takes two columns, so I will need to implement both a UDT and a UDA.
This is my CLR implementation
using System;
using System.IO;
using System.IO.Compression;
using System.Data.SqlTypes;
using System.Collections.Generic;
using Microsoft.SqlServer.Server;
namespace transit.web
{
[Serializable]
[
SqlUserDefinedType
(
Format.UserDefined,
IsByteOrdered = true,
IsFixedLength = false,
MaxByteSize = 8000,
Name = "ty_rho"
)
]
public struct Set : INullable, IBinarySerialize
{
public override string ToString()
{
// render the pair as "first,second" so the value can round-trip
return FirstColumn.ToString() + "," + SecondColumn.ToString();
}
public bool IsNull
{
get
{
return false;
}
}
public static Set Null
{
get
{
return new Set();
}
}
public static Set Parse(SqlString s)
{
// parse "first,second" back into a Set
string[] _parts = s.Value.Split(',');
Set _s = new Set();
_s.FirstColumn = Decimal.Parse(_parts[0]);
_s.SecondColumn = Decimal.Parse(_parts[1]);
return _s;
}
public Decimal FirstColumn;
public Decimal SecondColumn;
void IBinarySerialize.Read(BinaryReader r)
{
FirstColumn = r.ReadDecimal();
SecondColumn = r.ReadDecimal();
}
void IBinarySerialize.Write(BinaryWriter w)
{
w.Write(FirstColumn);
w.Write(SecondColumn);
}
}
public partial class UserDefinedFunctions
{
[SqlFunction]
public static Set SetColumns(Decimal First, Decimal Second)
{
Set _s = new Set();
_s.FirstColumn = First;
_s.SecondColumn = Second;
return _s;
}
}
[Serializable]
[
SqlUserDefinedAggregate
(
Format.UserDefined,
IsInvariantToNulls = true,
IsInvariantToDuplicates = false,
IsInvariantToOrder = false,
MaxByteSize = 8000
)
]
public struct Correlation : IBinarySerialize
{
public List<Set> Columns;
private List<Decimal> FirstColumn, SecondColumn;
// Format.UserDefined requires the aggregate itself to implement IBinarySerialize
void IBinarySerialize.Read(BinaryReader r)
{
Int32 _count = r.ReadInt32();
Columns = new List<Set>();
FirstColumn = new List<Decimal>();
SecondColumn = new List<Decimal>();
for (Int32 i = 0; i < _count; i++)
{
Set _s = new Set();
_s.FirstColumn = r.ReadDecimal();
_s.SecondColumn = r.ReadDecimal();
Columns.Add(_s);
FirstColumn.Add(_s.FirstColumn);
SecondColumn.Add(_s.SecondColumn);
}
}
void IBinarySerialize.Write(BinaryWriter w)
{
w.Write(Columns.Count);
for (Int32 i = 0; i < Columns.Count; i++)
{
w.Write(Columns[i].FirstColumn);
w.Write(Columns[i].SecondColumn);
}
}
public struct Items
{
public Decimal[] item;
public Int64[] rank;
};
public void Init()
{
Columns = new List<Set>();
FirstColumn = new List<Decimal>();
SecondColumn = new List<Decimal>();
}
public void Accumulate(Set ColumnsSet)
{
if (ColumnsSet.IsNull)
{
return;
}
this.Columns.Add(ColumnsSet);
// keep the per-column lists in step with the accumulated pairs
this.FirstColumn.Add(ColumnsSet.FirstColumn);
this.SecondColumn.Add(ColumnsSet.SecondColumn);
}
public void Merge(Correlation Set)
{
this.Columns.AddRange(Set.Columns);
this.FirstColumn.AddRange(Set.FirstColumn);
this.SecondColumn.AddRange(Set.SecondColumn);
}
public SqlDecimal Terminate()
{
Int64 _size;
Decimal _coefficient;
_coefficient = Decimal.Zero;
// rank correlation is undefined for fewer than two pairs
_size = Math.Min(FirstColumn.Count, SecondColumn.Count);
if (_size <= 1) { return SqlDecimal.Null; }
Items _first_items, _second_items;
Decimal[] _first_ranks, _second_ranks;
_first_ranks = new Decimal[_size];
_second_ranks = new Decimal[_size];
_first_items.item = new Decimal[_size];
_first_items.rank = new Int64[_size];
_second_items.item = new Decimal[_size];
_second_items.rank = new Int64[_size];
FirstColumn.CopyTo(_first_items.item, 0);
FirstColumn.CopyTo(_first_ranks, 0);
Array.Sort<Decimal>(_first_ranks);
SecondColumn.CopyTo(_second_items.item, 0);
SecondColumn.CopyTo(_second_ranks, 0);
Array.Sort<Decimal>(_second_ranks);
for (Int64 a = 0; a < _size; a++)
{
for (Int64 b = 0; b < _size; b++)
{
if (_first_items.item[a] == _first_ranks[b]) { _first_items.rank[a] = (int)(b + 1); }
}
}
for (Int64 c = 0; c < _size; c++)
{
for (Int64 d = 0; d < _size; d++)
{
if (_second_items.item[c] == _second_ranks[d]) { _second_items.rank[c] = (int)(d + 1); }
}
}
_coefficient = 0;
for (int e = 0; e < _size; e++)
{
// squared rank difference d^2, per the standard Spearman formula
Int64 _shift = _first_items.rank[e] - _second_items.rank[e];
_coefficient += (_shift * _shift);
}
// rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
Decimal _n = Convert.ToDecimal(_size);
_coefficient = 1m - (6m * _coefficient) / (_n * (_n * _n - 1m));
return new SqlDecimal(_coefficient);
}
}
}
And this is how I attempt to use the UDA
USE master
GO
EXEC sp_configure 'clr enabled',1 --the option name contains a space
GO
RECONFIGURE
GO
USE catalogSQL
GO
IF OBJECT_ID(N'[dbo].[ag_rho]') IS NOT NULL
DROP AGGREGATE [dbo].[ag_rho]
GO
IF OBJECT_ID(N'[dbo].[fn_set_columns]') IS NOT NULL
DROP FUNCTION [dbo].[fn_set_columns]
GO
IF TYPE_ID(N'ty_rho') IS NOT NULL
DROP TYPE ty_rho
GO
IF EXISTS (SELECT 1 FROM sys.assemblies WHERE name = N'rho')
DROP ASSEMBLY rho
GO
CREATE ASSEMBLY rho
AUTHORIZATION dbo
FROM 'C:\...\rho.dll' --file path
WITH PERMISSION_SET = SAFE
GO
--the CLR type must be registered before the aggregate can reference it,
--and the SetColumns method needs a T-SQL wrapper (named fn_set_columns here)
CREATE TYPE ty_rho
EXTERNAL NAME rho.[transit.web.Set]
GO
CREATE FUNCTION [dbo].[fn_set_columns] (@first DECIMAL(18,8), @second DECIMAL(18,8))
RETURNS ty_rho
EXTERNAL NAME rho.[transit.web.UserDefinedFunctions].SetColumns
GO
CREATE AGGREGATE [dbo].[ag_rho](@columns ty_rho) RETURNS DECIMAL(18,8)
EXTERNAL NAME rho.[transit.web.Correlation]
GO
DECLARE @tb_rho TABLE (cl_first DECIMAL(18,8),cl_second DECIMAL(18,8)) --bare DECIMAL has scale 0 and would truncate the values
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (2.4,3.2);
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (12.9,2.1);
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (3.0,16.7);
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (2.5,1.9);
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (8.4,10.7);
INSERT INTO @tb_rho (cl_first,cl_second) VALUES (2.1,3.8);
SELECT [dbo].[ag_rho]([dbo].[fn_set_columns](cl_first,cl_second)) FROM @tb_rho
It is still not working. Does the UDT need to be in CLR?
January 5, 2016 at 12:41 pm
It turns out that even if this UDA is assembled and registered, it can only be used with PARTITION BY in the OVER clause; CLR aggregates do not accept ORDER BY or ROWS framing there.
Will seek a strictly SQL solution, no longer need CLR...
January 5, 2016 at 1:38 pm
I am sorry to hear that the CLR aggregate does not support the newer windowing extensions. But you are right that this invalidates the idea and that it makes no sense to look for whatever was wrong in the UDA code.
I must admit that I have no idea what results you need to return. Can you perhaps post the results you expect from the sample data you posted? And perhaps add some explanation of the logic?
January 6, 2016 at 2:06 am
Hugo Kornelis (1/5/2016)
I am sorry to hear that the CLR aggregate does not support the newer windowing extensions. But you are right that this invalidates the idea and that it makes no sense to look for whatever was wrong in the UDA code. I must admit that I have no idea what results you need to return. Can you perhaps post the results you expect from the sample data you posted? And perhaps add some explanation of the logic?
I was looking for the correlation coefficient between two columns of data. You can look it up on wiki.
I have found a SQL solution (sufficient compromise) that requires no UDF. Will post later.
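For anyone following along, the rank-correlation arithmetic the thread has been circling around can be sanity-checked outside SQL Server. This is a hedged Python sketch (not the T-SQL solution) of the standard Spearman formula rho = 1 - 6*sum(d^2) / (n*(n^2-1)), run against the six sample pairs posted earlier; the `ranks` helper is an illustrative stand-in for RANK() and assumes no ties:

```python
# Spearman's rho over the sample pairs posted earlier in this thread.
pairs = [(2.4, 3.2), (12.9, 2.1), (3.0, 16.7), (2.5, 1.9), (8.4, 10.7), (2.1, 3.8)]

def ranks(values):
    # rank 1 = smallest value; assumes no ties, as in the sample data
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

first = ranks([p[0] for p in pairs])
second = ranks([p[1] for p in pairs])

n = len(pairs)
d2 = sum((a - b) ** 2 for a, b in zip(first, second))  # sum of squared rank differences
rho = 1 - (6 * d2) / (n * (n * n - 1))
print(round(rho, 4))  # a weak positive rank correlation for this sample
```

With this sample the squared rank differences sum to 34, giving rho = 1 - 204/210, i.e. roughly 0.0286.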
January 6, 2016 at 5:24 am
DECLARE @tb_correlation TABLE (cl_time DATETIME,cl_first DECIMAL(18,8),cl_second DECIMAL(18,8))
--insert your data
;WITH te_correlation
AS
(
SELECT cl_time,
POWER(RANK() OVER (ORDER BY cl_first)-RANK() OVER (ORDER BY cl_second),2) AS corr
FROM @tb_correlation
ORDER BY cl_time
OFFSET 0 ROWS
)
SELECT
cl_time,
SUM(corr) OVER (ORDER BY cl_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) --apply your own windowing here
FROM te_correlation
Not the original normalized coefficient, but it serves my purpose. 😉
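The compromise above (a running sum of squared rank differences rather than the normalized coefficient) can be mirrored outside SQL Server as a sanity check. A small Python sketch, using the sample pairs from earlier in the thread in posted order as a stand-in for cl_time order (the names here are illustrative, not from the thread):

```python
from itertools import accumulate

# sample pairs from earlier in the thread; posted order stands in for cl_time
pairs = [(2.4, 3.2), (12.9, 2.1), (3.0, 16.7), (2.5, 1.9), (8.4, 10.7), (2.1, 3.8)]

def ranks(values):
    # RANK() equivalent for a column with no ties
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

first = ranks([p[0] for p in pairs])
second = ranks([p[1] for p in pairs])

# per-row POWER(RANK() - RANK(), 2)
corr = [(a - b) ** 2 for a, b in zip(first, second)]

# SUM(corr) OVER (ORDER BY cl_time ROWS UNBOUNDED PRECEDING) as a running total
running = list(accumulate(corr))
print(running)  # [1, 17, 21, 25, 25, 34]
```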