SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


A genuine use for a SQL CLR Aggregate


A genuine use for a SQL CLR Aggregate

Author
Message
David.Poole
David.Poole
SSCertifiable
SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)

Group: General Forum Members
Points: 7449 Visits: 3279
Comments posted to this topic are about the item A genuine use for a SQL CLR Aggregate

LinkedIn Profile

Newbie on www.simple-talk.com
tcm_mail
tcm_mail
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 12
Please, can you repair the broken links of the pictures:
http://www.sqlservercentral.com/articles/SQL+CLR/71942/m.gif
http://www.sqlservercentral.com/articles/SQL+CLR/71942/regression.gif
http://www.sqlservercentral.com/articles/SQL+CLR/71942/Regressiongraph.gif
http://www.sqlservercentral.com/articles/SQL+CLR/71942/CLRAggregateExecutionPlan.JPG
http://www.sqlservercentral.com/articles/SQL+CLR/71942/CLRAggWithGroupExecutionPlan.JPG
David.Poole
David.Poole
SSCertifiable
SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)SSCertifiable (7.4K reputation)

Group: General Forum Members
Points: 7449 Visits: 3279
On the subject of the expensive sort in the execution plan I had a note back from Bob Beauchamp saying that there is an internal stream aggregator operation that causes the sort and unfortunately there is no way around it other than to put the clustered index on the column(s) used in the GROUP BY statement.Crying

LinkedIn Profile

Newbie on www.simple-talk.com
Robert Frasca
Robert Frasca
SSC Eights!
SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)SSC Eights! (804 reputation)

Group: General Forum Members
Points: 804 Visits: 776
Thanks for putting this together. It was a fascinating exercise that I really enjoyed. Like many others I have struggled to find a genuine use case for this feature. I agree that SQL Server 2008 made the feature more usable but I'm still not sure the tradeoffs make it worthwhile. Great article.

Bob

"Beliefs" get in the way of learning.
rdw-769613
rdw-769613
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 83
Another good example to use CLR is GMT/Local/Unix time handling. Trying this in T-SQL is near imposible and within CLR it's simple and faster.
alchemy9
alchemy9
Grasshopper
Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)Grasshopper (22 reputation)

Group: General Forum Members
Points: 22 Visits: 403
see what happens when you add a materialized view... this will be much quicker..



create VIEW vx_agg WITH SCHEMABINDING

as

SELECT --( COUNT_big(*) * SUM(col1*col2) - ( SUM(col1) * SUM(col2)))/( COUNT_big(*)* SUM(col1*col1)-( SUM(col1)* SUM(col1))) AGG
COUNT_big(*) cnt
,SUM(col1*col2) col1col2
,SUM(col1) col1
,SUM(col1*col1) col1sq
,SUM(col2) col2

FROM dbo.test_table

GO

CREATE UNIQUE clustered INDEX IX_AGG ON vx_agg (cnt)

SELECT ( COUNT(*) * SUM(col1*col2) - ( SUM(col1) * SUM(col2)))/( COUNT(*)* SUM(col1*col1)-( SUM(col1)* SUM(col1)))
FROM dbo.test_table
bbalok
bbalok
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 75
We have developed a Project Planning and Management system and I have used CLR routines in two main places. The user wanted cost escalation at user defined periods of the project in the planned budget calculation. These calculated amounts had to be used by UI screens and reports and had to be consistent. Each work unit is inspected to see if it falls within one to many escalation periods that occur during the life of the project and how much of the work falls within a period. A planned budget amount is calculated and aggregated based on the cost, amount of work and escalation percent. There are also status codes that are used to filter the work units involved in the calculation, multiple tables of data and many loops through the data so it was easier for me to use a CLR and C# to gather the data I needed and loop through the work units to provide an escalated work unit planned budget.

Did I mention different contract types that use these escalated amounts in different ways to create a budget? I could also encapsulate this logic in this routine so I could return a planned budget amount for every contract type in the system. We allow budgeting at the lowest work breakdown level but have to provide aggregated budget amounts at every parent level so another loop through the data to perform this aggregation to the parent.

Since the CLR works as a Table Valued Function we can write queries against it for UI and reporting. Every UI or report will have the same planned budget value because the math and rounding and contract types are encapsulated in this routine. It seems that if I do division of two numbers in T-Sql , C#, Crystal Reports I get a different number of decimals and value of decimals. I am not sure I could have done this without the use of a CLR routine to perform the looping and provide consistent rounded amounts.

The other place I use it is to provide a cross tab type output but I will save that for another day as I have spouted enough.
Bradley Deem
Bradley Deem
Say Hey Kid
Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)Say Hey Kid (663 reputation)

Group: General Forum Members
Points: 663 Visits: 1248
Thanks for sharing David. I agree most CLR aggregates have not had much success beyond academics. I think part of this limitation is due to the lack of ability to perform an ORDER BY clause with an aggregate.

For example, lets say I create a CLR aggregate that concatenates strings together ie

SELECT dbo.List(myStringColumn,',') -- Where ',' means comma delimited
FROM myTable



Assuming I've implemented it correctly, this could produce output such as
firstRowValue,secondRowValue,thirdRowValue



or maybe parallelism kicks in (on a large set) and produces output where the order changes ie
secondRowValue,firstRowValue,thirdRowValue



Now, if we had we had the ability to perform the ORDER BY clause with the OVER clause ie
SELECT dbo.List(myStringColumn,',') OVER(ORDER BY myStringColumn)
FROM myTable



Then we could guarantee the correct (expected) order of output.


If you believe this would be a useful feature, and I do, please vote for it on connect.
https://connect.microsoft.com/SQLServer/feedback/details/254387/over-clause-enhancement-request-order-by-for-aggregates


In regards to the attribute IsInvariantToOrder, it's not supported, see the following link.
http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.server.sqluserdefinedaggregateattribute.isinvarianttoorder.aspx

Apparently, it will be implemented in the "future" which after 5 years hasn't been far enough into the future.
Solomon Rutzky
Solomon Rutzky
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1679 Visits: 2965
Hello David and thanks for a great article on yet another good use of SQL CLR.

The only issue I have is with your statement that "median" is not a good case for a CLR aggregate as it might "require a substantial portion of the set to be held in memory". I disagree because it is possible to reduce the size of data in memory by compressing it, as I have shown in this article:

http://www.sqlservercentral.com/articles/SS2K5+-+CLR+Integration/3208/

And that article was done before SQL Server 2008 added the ability to go beyond the 8k memory limit (with a MaxSize set to -1) so adding that option to the mix solves all problems (generally) outside of the possible efficiency of the CPU it takes to compress/decompress the dataset. At that point I would argue that if your situation requires a Median calculation (and it appears to for quite a few people) then a little bit of extra processing time is worth the ability to do this in the first place.

SQL# - http://www.SQLsharp.com/
Adam Machanic
Adam Machanic
SSCommitted
SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)SSCommitted (1.5K reputation)

Group: General Forum Members
Points: 1545 Visits: 714
Solomon Rutzky (12/29/2010)And that article was done before SQL Server 2008 added the ability to go beyond the 8k memory limit (with a MaxSize set to -1) so adding that option to the mix solves all problems (generally) outside of the possible efficiency of the CPU it takes to compress/decompress the dataset. At that point I would argue that if your situation requires a Median calculation (and it appears to for quite a few people) then a little bit of extra processing time is worth the ability to do this in the first place.


Have you compared the performance of doing a median calculation w/compression against performing the same calculation using Joe Celko's ROW_NUMBER technique?

http://sqlblog.com/blogs/adam_machanic/archive/2006/12/18/medians-row-numbers-and-performance.aspx

Compression is far from cheap, and I suspect that the ROW_NUMBER solution will scale much better. But that's just a suspicion and I don't have time to test myself at the moment.

--
Adam Machanic
SQL Server MVP
SQLblog.com: THE SQL Server Blog Spot on the Web
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search