CLR Table-Valued Function Example with Full Streaming (STVF)

  • Solomon: OK, that's good. Thanks for the direction on that and I'll let you know when ever I get the chance to pursue this further...

    -- Thanks Again, 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I have tried using the yield return method in one of my test projects but I am getting the error below:

    "An error occurred while getting new row from user defined Table Valued Function :

    System.InvalidOperationException: Data access is not allowed in this context. Either the context is a function or method not marked with DataAccessKind.Read or SystemDataAccessKind.Read, is a callback to obtain data from FillRow method of a Table Valued Function, or is a UDT validation method."

    I have the DataAccessKind.Read property set of the UDF and the SqlConnection is closed before the yield return is called. The SqlConnection is using "Context Connection=true".

    I have read that you cannot use a sqlconnection in the fillrow function in SQL server 2008 but my sqlconnection is used in the UDF and it is closed before the calls to yield return.

    Any ideas?

    Regards

    Per

  • perh (1/4/2010)


    I have tried using the yield return method in one of my test projects but I am getting the error below:

    "An error occurred while getting new row from user defined Table Valued Function :

    System.InvalidOperationException: Data access is not allowed in this context. Either the context is a function or method not marked with DataAccessKind.Read or SystemDataAccessKind.Read, is a callback to obtain data from FillRow method of a Table Valued Function, or is a UDT validation method."

    I have the DataAccessKind.Read property set of the UDF and the SqlConnection is closed before the yield return is called. The SqlConnection is using "Context Connection=true".

    Hello Per. What table are you reading from? Is it a user table or a system table? If a system table is used then you also need to set SystemDataAccessKind. Have you tried this function with the regular, non-yield return syntax? I am not sure that this error relates specifically to yield return. If neither of these help, do you mind attaching the code?

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Hello Solomon,

    The table where I read data from is in a local database. I tried to add the SystemDataAccessKind.Read as well but with no difference.

    I found an article about this on http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=442200

    and tested the suggestion of using a connectionstring with "enlist=false". So I changed my SqlConnection from

    "Context Connection=true"

    to

    "data source=SENOVOPC005\\MSSQLSERVER2008;initial catalog=ReportTest;integrated security=SSPI;enlist=false"

    and then it worked.

    But I want to use "Context connection=true" as it is a much nicer way for the CLR to connect to the local database.

    Regards

    Per

  • I forgot to answer your other question. Yes it works if I create the whole collection and then returns it without using the yield return functionality.

    Bu I would like to use the yield return as the function might get memory problems if to much data is asked for.

    /Per

  • Hello again Per. I found this, which seems related but might not be the same issue since you say that you closed the connection before returning:

    http://social.msdn.microsoft.com/Forums/en-IE/sqlnetfx/thread/e9e7c707-87fb-49cc-bd0b-d426ed53a534

    For now, I would just stick with the connection string that works since it is still better than filling up memory with the entire collection before releasing it back to the calling query.

    Take care,

    Solomon...

    PS, if you want, you can just send me the code and I can try to look at it. Just click on my name to the left to see my details and then select "Send Email".

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • I like this article a lot. In fact, it is one of the very best I have come across on SSC. Congratulations are definitely in order, both for style and content.

    It is slightly sad that only a small fraction of the potential audience will appreciate exactly what you have done here. I think that's probably a reflection of how few SQL people have a good grasp of .NET code and a willingness to extend the capabilities of T-SQL in sensible, appropriate, and efficient ways. It's so easy to misuse the CLR integration and I guess that makes it all the more refreshing to see a well thought out article on a key subject.

    I certainly do like the concise code made possible by the use of yield, and it certainly helps people to avoid the all-too-common mistake of generating a huge memory object in the Init method and then iterating over that in the FillRow calls.

    That said, I don't tend to use yield in practice, for the following reasons:

    1. The bug you refer to in the article with SqlChars and SqlBytes is a worry. I haven't come across that myself; however that may just be dumb luck. To be fair, I don't know whether that bug only manifests itself with yield. It is a consideration though - especially since SqlChars and SqlBytes are the only streaming input types.

    2. The compiler emits a fair chunk of MSIL when it encounters yield - in fact it generates an entire class. More on that in a bit. I have a natural (and, I hope, healthy) skepticism with this sort of thing: the compiler must generate code which is general in nature and provably correct in all situations. This necessarily means that yield will be somewhat slower than a hand-written equivalent.

    3. The extra code emitted by the compiler cannot be debugged in the usual way - you can't 'step into' it in a debugging session.

    4. Yield cannot be used in a try...catch block. This is an annoying restriction and complicates error handling somewhat.

    Of course you can't argue with the code economy - it can produce extremely neat code - and I do find that appealing.

    The alternative is to hand-code a helper class. This involves more typing for sure, but in the 20 million row test featured at the end of your article the hand-rolled version used 18.7 seconds of CPU on my home machine, compared to 19.8 seconds for the yield version - a 6% difference.

    Here's a rig based on the article's example code showing both a yield and helper-class implementation, just in case any is interested:

    (The helper class code is pretty close to the code emitted by the compiler for yield)

    using System.Data.SqlTypes;

    using Microsoft.SqlServer.Server;

    using System.Collections;

    public partial class TestFunctions

    {

    [Microsoft.SqlServer.Server.SqlFunction

    (

    DataAccess = DataAccessKind.None,

    IsDeterministic = true,

    IsPrecise = true,

    SystemDataAccess = SystemDataAccessKind.None,

    FillRowMethodName = "FillRow",

    TableDefinition = "Value INT"

    )

    ]

    public static IEnumerable StreamingHandCoded(SqlInt32 MaxValue)

    {

    return new Helper(MaxValue.Value);

    }

    [Microsoft.SqlServer.Server.SqlFunction

    (

    DataAccess = DataAccessKind.None,

    IsDeterministic = true,

    IsPrecise = true,

    SystemDataAccess = SystemDataAccessKind.None,

    FillRowMethodName = "FillRow",

    TableDefinition = "Value INT"

    )

    ]

    public static IEnumerable StreamingYield(SqlInt32 MaxValue)

    {

    for (int i = 0; i < MaxValue; i++)

    {

    yield return i;

    };

    }

    public static void FillRow(object i, out SqlInt32 Value)

    {

    Value = new SqlInt32((int)i);

    }

    #region HelperClass

    public sealed class Helper : IEnumerable, IEnumerator

    {

    readonly int maxValue;

    int index = -1;

    public Helper(int maxValue)

    {

    this.maxValue = maxValue;

    }

    #region IEnumerable Members

    IEnumerator IEnumerable.GetEnumerator()

    {

    return this;

    }

    #endregion

    #region IEnumerator Members

    object IEnumerator.Current

    {

    get { return index; }

    }

    bool IEnumerator.MoveNext()

    {

    return ++index < maxValue;

    }

    void IEnumerator.Reset()

    {

    index = -1;

    }

    #endregion

    }

    #endregion

    };

  • Paul:

    Any chance you could explain your Helper class a little? (you know, a Helper Helper? :-D)

    For obviously selfish reasons I am very interested in implementing it in VB.net, and although I don't see anything non-doable in VB, I also can't figure out how it does its magic either, which makes me think that I am missing something important.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (1/12/2010)


    Any chance you could explain your Helper class a little? (you know, a Helper Helper? :-D)

    For obviously selfish reasons I am very interested in implementing it in VB.net, and although I don't see anything non-doable in VB, I also can't figure out how it does its magic either, which makes me think that I am missing something important.

    Sure thing Barry! It is, however, 6:36am from the wrong end, so I'm going to grab some sleep first. I'll post back then, assuming someone else hasn't already covered it. Oh and 'hi' by the way 🙂

    Paul

  • Ok Barry here we go. I've tried to cover everything so it's readable for the wider audience - so apologies for the amount of detail:

    1. SQL Server expects an object that implements IEnumerable from the initial call to the static method named in the EXTERNAL NAME part of the CREATE FUNCTION statement. In this case, that's the StreamingHandcoded method.

    2. The StreamingHandcoded method passes the input parameter to a new instance of the nested Helper class. The constructor for the Helper class stores the passed parameter in a read-only member variable. If we had a streaming type as an input parameter (SqlChars or SqlBytes), we would store a reference to that in the same way.

    3. SQL Server calls the GetEnumerator() method on the IEnumerable interface of the Helper object we pass back in order to get the object that will allow it to iterate over the results. The IEnumerable interface specifies that the object passed back will implement the IEnumerator interface. In our case, the Helper class implements both IEnumerable and IEnumerator, so IEnumerable.GetEnumerator() just returns a reference to the same class via 'this'. This avoids the need to create separate classes to handle the two calls.

    4. Now that SQL Server has an object implementing IEnumerator, it can use the interface's Current property and MoveNext method to fetch results (the Reset method is also required by the interface, but SQL Server doesn't call it for TVFs). Current returns the data for the current row, MoveNext moves to the next row of the result set, and returns true until there are no more rows.

    5. The code to do the work required appears in the helper class. This is what makes this way of doing things 'lazy' - no work on the task in hand is done in the method called in step 1. The work is often coded in the MoveNext method, with Current just returning a simple reference to some member variable in the helper class. In this very simple example, the work consists of "++index < maxValue;" - incrementing the member variable and returning a boolean.

    6. SQL Server calls the Current property after every call to MoveNext that returns true. This returns an object (untyped) as defined by the interface. In our case, the returned object is an int (the index variable).

    7. SQL Server then passes this object to the FillRowMethodName specified in the SqlFunction attribute. In this case, that is the static FillRow(object i, out SqlInt32 Value) method on the outer class. That method simply casts the object back to an int, and uses that to construct a SqlInt32 which is returned via the out parameter.

    Paul

  • Paul, thanks for the great compliment as well as the code and explanation of it. I think it will be very helpful to many people.

    Barry, hopefully you can get what you were looking for from this information, including the ability to "break" the iteration.

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Solomon Rutzky (1/12/2010)


    Paul, thanks for the great compliment as well as the code and explanation of it. I think it will be very helpful to many people.

    You're most welcome - it was thoroughly deserved.

    @Barry: you can break the iteration by returning false at any point from the MoveNext() call, as you probably know 🙂

  • I would like to take advantage of a chance to join to the discussion with a question which is not directly related to the title but I cannot miss all of eminent professionals on this thread.

    My question is: is it possible to have CLR table-valued function with variable number of return columns?

  • Matjaz Justin (9/10/2010)


    I would like to take advantage of a chance to join to the discussion with a question which is not directly related to the title but I cannot miss all of eminent professionals on this thread.

    My question is: is it possible to have CLR table-valued function with variable number of return columns?

    Probably... but that can also be done in T-SQL quite easily (although it won't be in a function). Please see the following article for how...

    http://www.sqlservercentral.com/articles/Crosstab/65048/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Matjaz Justin (9/10/2010)


    My question is: is it possible to have CLR table-valued function with variable number of return columns?

    All functions (T-SQL or SQLCLR) must return a deterministic set of columns. The same requirement does not apply to stored procedures, so a SQLCLR stored procedure is a better bet if you truly need this functionality.

    If you would like a more detailed reply from more than one person, please repost your question in the CLR forums on this site.

Viewing 15 posts - 46 through 60 (of 60 total)

You must be logged in to reply to this topic. Login to reply