• Solomon Rutzky (12/29/2010)


    The only issue I have is with your statement that "median" is not a good case for a CLR aggregate as it might "require a substantial portion of the set to be held in memory". I disagree because it is possible to reduce the size of data in memory by compressing it, as I have shown in this article:

    http://www.sqlservercentral.com/articles/SS2K5+-+CLR+Integration/3208/

    Thanks, Solomon. What I was getting at with the median is that to find the middle value you have to keep track of all the values. My worry (as yet unsubstantiated) is that you could run an aggregate on such a large recordset that it consumes all the RAM on your system and destabilizes your server.

    It may be that the point at which this happens is beyond the data volumes most users will ever reach.
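
To make the memory concern concrete, here is a minimal sketch in Python (not CLR code, and not the approach from the linked article). The class names and the `accumulate`/`terminate` method names are hypothetical, chosen only to loosely mirror the shape of a user-defined aggregate. It contrasts a mean, whose state is two numbers however many rows arrive, with a naive median, whose state grows by one value per row.

```python
import statistics


class MeanAccumulator:
    """Streaming mean: state is a running total and a row count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def accumulate(self, value):
        self.total += value
        self.count += 1

    def state_size(self):
        # Always two numbers, no matter how many rows were seen.
        return 2

    def terminate(self):
        return self.total / self.count if self.count else None


class MedianAccumulator:
    """Naive median: every value is retained until the end."""

    def __init__(self):
        self.values = []

    def accumulate(self, value):
        self.values.append(value)

    def state_size(self):
        # Grows linearly with the number of rows aggregated.
        return len(self.values)

    def terminate(self):
        return statistics.median(self.values) if self.values else None


def run(rows):
    """Feed the same rows to both accumulators and return them."""
    mean_acc, median_acc = MeanAccumulator(), MedianAccumulator()
    for value in rows:
        mean_acc.accumulate(value)
        median_acc.accumulate(value)
    return mean_acc, median_acc
```

Compression of the kind Solomon describes would shrink the stored values, but in this naive form the state still grows with the row count, which is the behaviour I was worried about.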

    The other thing that jumps out at me is that most stats functions seem to rely on what I will call the primitive aggregates: the built-in aggregates. Effectively, the CLR aggregates provide a shorthand for data analysts so they can focus on what the data is telling them rather than on the mechanics.
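
As an illustration of that reliance, the sketch below (Python again, with hypothetical function names) derives a sample variance and standard deviation from nothing more than a count, a sum, and a sum of squares, which are the equivalents of COUNT, SUM, and SUM over the squared values. It assumes the textbook sum-of-squares formula, which can lose precision on large or tightly clustered values, so treat it as a demonstration rather than a production method.

```python
import math


def primitive_aggregates(values):
    """Return (count, sum, sum of squares) in a single pass."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:
        n += 1
        total += v
        total_sq += v * v
    return n, total, total_sq


def sample_variance(n, total, total_sq):
    """Sample variance from the three primitive aggregates."""
    if n < 2:
        return None  # undefined for fewer than two rows
    return (total_sq - (total * total) / n) / (n - 1)


def sample_stdev(n, total, total_sq):
    """Sample standard deviation from the same three aggregates."""
    variance = sample_variance(n, total, total_sq)
    return None if variance is None else math.sqrt(variance)
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the primitives are a count of 8, a sum of 40, and a sum of squares of 232, giving a sample variance of 32/7.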

    I should like to see Microsoft consider adding more built-in statistical functions, given the way that the BI world is heading. I should also like to see some focus on integration with external stats packages such as SPSS and SAS.