Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

CLR Table-Valued Function Example with Full Streaming (STVF / TVF)

By Solomon Rutzky,

When Microsoft first introduced CLR integration in SQL Server 2005, I was quite interested to see what wonderful things could be done that either: a) couldn't be done in regular T-SQL or b) would at least be more efficient than being done in T-SQL. As I started to read about Table-Valued Functions, I was excited to learn that the output from Table-Valued Functions (TVFs) could stream out to the calling process. There were examples in the MSDN documentation and I followed them and they worked. However, I started to notice that while the output itself might have streamed out of the data collected in the TVF, it seemed that you were still required to collect all of that data first in memory before returning it. This was a bit frustrating since it did not seem to truly "stream" in the way that I had expected. Sure CLR Stored Procedures can stream their results back but I wanted to create TVFs so that I could more easily interact with the output by doing WHERE conditions, GROUP BYs, ORDER BYs, etc. which are not possible with Stored Procedures unless you trap the output into a Table Variable or Temp Table (which I did not want to do).


After working with TVFs for some time, I learned more about the IEnumerable interface and what else could be done with it. I was elated to learn that the new TVF structure really could fully stream data out to the calling T-SQL query as Microsoft originally claimed. Of course, I was a little annoyed at the lack of easy to find examples of how to do this. If I search on "streaming clr tvf" then I can find a few examples, but they mostly show the full IEnumerator syntax requiring the MoveNext(), Current(), and Reset() methods. This method might be a bit confusing and most people will likely not start out using the word "streaming" when doing a search for an example. Simply searching on "clr tvf example" shows mostly articles showing the method which collects all data before releasing it. So, I decided it would be helpful for others who are starting out making their own CLR TVFs to have an actual example that will hopefully be easy to find in a search and will show the quick and easy method.

The Original / Standard Method

The method that I first found, and that I suspect a lot of others are also using, centers around creating a Generic List of the intended result row, adding a result row to that list per iteration of a loop, and then returning the entire Generic List which is the full result set.

private struct ReturnValues
{
public int Value;
} private static void FillValues(object obj, out SqlInt32 TheValue)
{
ReturnValues ReturnVals = (ReturnValues)obj;
TheValue = ReturnVals.Value;
}

[Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, IsPrecise = true, SystemDataAccess = SystemDataAccessKind.None, FillRowMethodName = "FillValues", TableDefinition = "IntValue INT")]
public static IEnumerable TVF_Standard(SqlInt32 MaxValue)
{
if (MaxValue.IsNull) {
return new List<ReturnValues>(); // return no rows } List<ReturnValues> AllVals = new List<ReturnValues>(); // container for the Result Set ReturnValues Vals = new ReturnValues(); // each row for (int index = 1; index <= MaxValue.Value; index++)
{
Vals.Value = index;
AllVals.Add(Vals); // add row to the Result Set container
} return AllVals; // return all data at once
}

The Fully Streaming Method

Before I show the quick and easy method I need to mention that there are actually two ways of accomplishing this full streaming: implementing IEnumerator or using "yield". The more involved method is providing a formalized framework to interact with the IEnumerator interface which involves creation of MoveNext(), Current(), and Reset() methods. This method is documented in a couple of places and hence I will not re-document that approach here since I want to focus on the less documented approach.

Now we can get to the fun stuff. All that is really needed to send a result row back per each iteration is using the "yield" command. Most of the time you will simply "yield return" the single-row data structure. Sometimes you might want to exit without sending any data back in which case you can use the "yield break" command.

private struct ReturnValues
{
public int Value;
} private static void FillValues(object obj, out SqlInt32 TheValue)
{
ReturnValues ReturnVals = (ReturnValues)obj;
TheValue = ReturnVals.Value;
}

[Microsoft.SqlServer.Server.SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, IsPrecise = true, SystemDataAccess = SystemDataAccessKind.None, FillRowMethodName = "FillValues", TableDefinition = "IntValue INT")]
public static IEnumerable TVF_Streaming(SqlInt32 MaxValue)
{
if (MaxValue.IsNull) {
yield break; // return no rows } // we do not need the Generic List of <ReturnValues> ReturnValues Vals = new ReturnValues(); // each row for (int index = 1; index <= MaxValue.Value; index++)
{
Vals.Value = index;
yield return Vals; // return row per each itteration
} // we do not need to return everything at once
}

I put comments in the code to direct your attention to the differences. The main difference is that the "Standard" function has a Generic List that it builds up and returns at the end while the "Streaming" function does not have the Generic List and instead returns each row as it is created via the "yield return" command.

In order to test this easily, I have attached a SQL script that will create the Assembly which contains both of the above CLR Table-Valued Functions as well as their T-SQL wrapper functions. I have also attached a ZIP file containing the Visual Studio 2008 Project if you want to compile this yourself or see the full C# code. Please see the "Resources" section at the bottom of the article for these files.

Testing the first one (TVF_Standard) with a large value of 20,000,000 shows (at least on my machine ;-) that it runs out of memory with the following error:

SELECT MAX(IntValue) FROM Test.dbo.TVF_Standard(20000000)

Msg 6532, Level 16, State 49, Line 1
.NET Framework execution was aborted by escalation policy because of out of memory.

System.Threading.ThreadAbortException: Thread was being aborted.
System.Threading.ThreadAbortException:

at System.Collections.Generic.List`1.set_Capacity(Int32 value)
at System.Collections.Generic.List`1.EnsureCapacity(Int32 min)
at TestFunctions.TVF_Standard(SqlInt32 MaxValue)

Testing the second one (TVF_Streaming) with a value of 40,000,000 -- a value twice as great at the value that didn't work with the TVF_Standard function -- works with no problems.

SELECT MAX(IntValue) FROM Test.dbo.TVF_Streaming(40000000)
-- returns: 40000000

And I did include the "yield break" syntax in the C# function example so the queries below show each of the two functions working with a NULL input:

SELECT MAX(IntValue) FROM Test.dbo.TVF_Standard(NULL)

SELECT MAX(IntValue) FROM Test.dbo.TVF_Streaming(NULL)

Both of those queries return NULL.

Sadly, Not Always

As a final note to recommending "yield return", it should be mentioned that it does not work in all cases. Sometimes, when using the SqlChars and SqlBytes datatypes for input parameters you might run into the following error:

Msg 6260, Level 16, State 1, Line 1
An error occurred while getting new row from user defined Table Valued Function :

System.InvalidOperationException: Accessing members of an object from a wrong thread.
System.InvalidOperationException:

at System.Data.SqlServer.Internal.ClrLevelContext.XvarProxyRead(CClrXvarProxy* pXvarProxy, UInt64 iPosition, Byte* pbBuffer, UInt32 cbCount)
at System.Data.SqlServer.Internal.ClrLevelContext.System.Data.SqlServer.Internal.IXvarProxyAccessor.XvarProxyRead(CClrXvarProxy* , UInt64 , Byte* , UInt32 )
at System.Data.SqlServer.Internal.StreamOnBlobHandle.Read(Byte* pbBuffer, UInt64 offset, UInt32 count)
at System.Data.SqlServer.Internal.XvarWlobStream.Read(Char[] buffer, Int32 offset, Int32 count)
at System.Data.SqlTypes.SqlChars.get_Value()

This is a bug that was introduced in SQL Server 2008 and according to a Microsoft blog, is fixed in the next version:


https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=382235

Copyright © 2009 Solomon Rutzky

SQL# (SQLsharp)

Resources:

StreamingTVF_CreateAssemblyAndFunctions.sql | ExampleSTVF.zip
Total article views: 7131 | Views in the last 30 days: 20
 
Related Articles
FORUM

ReturnValue vs. Output Parameter

ReturnValue vs. Output Parameter

FORUM

Help Needed - Function Returning Table

Function Returning Table

FORUM

Return result of dynamic query from function

Return result of dynamic query from function

FORUM

Getting "OLE DB provider "STREAM" for linked server "(null)" returned message "Query timeout expired"."

Getting "OLE DB provider "STREAM" for linked server "(null)" returned message "Query timeout expired...

ARTICLE

Excel Function Returns Cell Address

This article details an Excel 2010 function to return the cell address of min and max functions.

 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones