CLR Table-Valued Function Example with Full Streaming (STVF)

  • TimothyAWiseman

    SSCrazy Eights

    Points: 8819

    Jeff Moden (12/23/2009)


    I'm not picking on the author here... it's a general statement that I bring up to anyone who may be considering the use of CLRS... Don't use the excuse of a lack of knowledge of T-SQL functionality as a justification for CLR usage. You'll frequently pay (yes, there are exceptions) for such a mistake in the form of performance and resource usage. There are very few things that can't actually be done in T-SQL and there are very few things where a CLR will actually win the performance foot race if the T-SQL is properly written for performance. Even when T-SQL does lose the performance footrace, it's many times so close to the performance of CLR's that it's just not worth the time and effort to maintain a separate code base outside of T-SQL.

    A good point and well taken. Still there are some times where CLR can make sense, particularly if you want to do things with Regex for instance.

    Also, Itzik Ben-Gan showed that at least for some types of large data sets CLR outperforms T-SQL in running aggregates. There are some details here: http://www.sqlmag.com/articles/index.cfm?articleid=102336 (subscription required.)

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • RBarryYoung

    SSC Guru

    Points: 143327

    Gift Peddie (12/23/2009)


    RBarryYoung (12/23/2009)


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    I already indicated that I knew that, Gift. Despite that fact that there is no deterministic general conversion for C# [font="Courier New"]yield break[/font], it is quite often possible to work out a case-specific method that can enable the same functionality in VB.net. Again, entirely dependent on the specific case, in this case, Solomon's clever method for streaming CLR-TVF data returns. It would be extremely valuable to me to be able to duplicate this performance in VB.net.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung

    SSC Guru

    Points: 143327

    Jeff Moden (12/23/2009)


    Next question, please... What application would have the need for such a thing where it could not be done using T-SQL? Not trying to be a smart guy here... I really want to know where folks have used or will use this type of thing and why they couldn't do it in T-SQL.

    Going back to my previous post... no, I wouldn't use a stored proc if a native TVF would do either (of course, "it depends"). I was just providing a direct method for how to use the output of an existing stored proc.

    Actually, Jeff, the line-splitting functions and (less often)parsing functions are usually best done as TVF's, whether SQL or CLR.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung

    SSC Guru

    Points: 143327

    RBarryYoung (12/24/2009)


    Jeff Moden (12/23/2009)


    Next question, please... What application would have the need for such a thing where it could not be done using T-SQL? Not trying to be a smart guy here... I really want to know where folks have used or will use this type of thing and why they couldn't do it in T-SQL.

    Going back to my previous post... no, I wouldn't use a stored proc if a native TVF would do either (of course, "it depends"). I was just providing a direct method for how to use the output of an existing stored proc.

    Actually, Jeff, the line-splitting functions and (less often)parsing functions are usually best done as TVF's, whether SQL or CLR. Unfortunately, as Matt pointed out, CLR TVF's by default do not want to do a streaming return. Solomon's "trick" here is really quite significant. Up till now, I have not even heard any of the "Big Guns" in CLR's say anything about it other than to complain about the situation.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Gift Peddie

    SSC Guru

    Points: 73570

    RBarryYoung (12/24/2009)


    Gift Peddie (12/23/2009)


    RBarryYoung (12/23/2009)


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    I already indicated that I knew that, Gift. Despite that fact that there is no deterministic general conversion for C# [font="Courier New"]yield break[/font], it is quite often possible to work out a case-specific method that can enable the same functionality in VB.net. Again, entirely dependent on the specific case, in this case, Solomon's clever method for streaming CLR-TVF data returns. It would be extremely valuable to me to be able to duplicate this performance in VB.net.

    I did not say you don't know that, I actually wanted to point out the yield being an iterator dependence on C# before your post.

    Kind regards,
    Gift Peddie

  • RBarryYoung

    SSC Guru

    Points: 143327

    Gift Peddie (12/24/2009)


    RBarryYoung (12/24/2009)


    Gift Peddie (12/23/2009)


    RBarryYoung (12/23/2009)


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    I already indicated that I knew that, Gift. Despite that fact that there is no deterministic general conversion for C# [font="Courier New"]yield break[/font], it is quite often possible to work out a case-specific method that can enable the same functionality in VB.net. Again, entirely dependent on the specific case, in this case, Solomon's clever method for streaming CLR-TVF data returns. It would be extremely valuable to me to be able to duplicate this performance in VB.net.

    I did not say you don't know that, I actually wanted to point out the yield being an iterator dependence on C# before your post.

    Actually you did in fact imply exactly that, Gift:

    Gift Peddie


    RBarryYoung


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    Note your use of the word No. In English that means that you are indicating that something that I said was wrong or incorrect. I am unaware of anything that I said that was incorrect.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Gift Peddie

    SSC Guru

    Points: 73570

    RBarryYoung (12/24/2009)


    Gift Peddie (12/24/2009)


    RBarryYoung (12/24/2009)


    Gift Peddie (12/23/2009)


    RBarryYoung (12/23/2009)


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    I already indicated that I knew that, Gift. Despite that fact that there is no deterministic general conversion for C# [font="Courier New"]yield break[/font], it is quite often possible to work out a case-specific method that can enable the same functionality in VB.net. Again, entirely dependent on the specific case, in this case, Solomon's clever method for streaming CLR-TVF data returns. It would be extremely valuable to me to be able to duplicate this performance in VB.net.

    I did not say you don't know that, I actually wanted to point out the yield being an iterator dependence on C# before your post.

    Actually you did in fact imply exactly that, Gift:

    Gift Peddie


    RBarryYoung


    C# "[font="Courier New"]yield break[/font]"? Yikes!

    I don't suppose there's a way to implement your example in VB.net, is there? (I don't know of any way to directly emulate [font="Courier New"]yield break[/font] in VB.)

    No yield is one those C# only definition because per Microsoft there is no valid reason to define it in VB.

    Note your use of the word No. In English that means that you are indicating that something that I said was wrong or incorrect. I am unaware of anything that I said that was incorrect.

    Thanks for the English lesson but that was not my intention and, English lesson given to a second generation English very funny.

    And I thought this was a SQL Server forum. Note to self proofread before post.

    Kind regards,
    Gift Peddie

  • Solomon Rutzky

    SSCoach

    Points: 16104

    RBarryYoung (12/24/2009)


    Actually, Jeff, the line-splitting functions and (less often)parsing functions are usually best done as TVF's, whether SQL or CLR. Unfortunately, as Matt pointed out, CLR TVF's by default do not want to do a streaming return. Solomon's "trick" here is really quite significant. Up till now, I have not even heard any of the "Big Guns" in CLR's say anything about it other than to complain about the situation.

    Ok, so English lesson's aside ;-), I really cannot lay claim to this being my technique. I was hoping to indicate in the article that this is just something that I came across in my workings with SQLCLR. I think I picked it up from my last article dealing with Streaming INTO SQL Server from an application and figured I would give the "yield return" syntax a shot in a TVF and was very pleasantly surprised when it worked. However, I have since done some searches on "yield return" and have found that a few other people have been doing this so maybe it is documented somewhere. Unfortunately the examples are all in blog posts about something not working correctly and not attempting to show a good way of doing TVFs. Hence nobody looking for "how to do a streaming TVF" (or maybe, "how to do a TVF properly" since I am not sure why anyone would choose to store the entire collection in memory, unless maybe you needed to do secondary logic over the set) will ever find those posts. This is why I felt it was important to do this article: because, as you mentioned, for some odd reason nobody else has.

    Now, I don't really know much about VB.Net but I did mention in my article that there is the standard implementation of IEnumerator that does the same thing. I would have to assume that while the examples are in C# that they would work the same in VB.Net since they do not rely upon the "yield" construct. Here are two:

    1) http://www32.brinkster.com/srisamp/sqlArticles/article_46.htm (bottom half of the page)

    2) http://sqlblog.com/blogs/adam_machanic/archive/2009/04/28/sqlclr-string-splitting-part-2-even-faster-even-more-scalable.aspx

    Hopefully one or both of those helps you do the same thing in VB.Net. True, it won't be a single-line like "yield return", but I think the end-result performance gain / memory reduction will be the same.

    Take care,

    Solomon...

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Jeff Moden

    SSC Guru

    Points: 994555

    timothyawiseman (12/24/2009)


    Jeff Moden (12/23/2009)


    I'm not picking on the author here... it's a general statement that I bring up to anyone who may be considering the use of CLRS... Don't use the excuse of a lack of knowledge of T-SQL functionality as a justification for CLR usage. You'll frequently pay (yes, there are exceptions) for such a mistake in the form of performance and resource usage. There are very few things that can't actually be done in T-SQL and there are very few things where a CLR will actually win the performance foot race if the T-SQL is properly written for performance. Even when T-SQL does lose the performance footrace, it's many times so close to the performance of CLR's that it's just not worth the time and effort to maintain a separate code base outside of T-SQL.

    A good point and well taken. Still there are some times where CLR can make sense, particularly if you want to do things with Regex for instance.

    Also, Itzik Ben-Gan showed that at least for some types of large data sets CLR outperforms T-SQL in running aggregates. There are some details here: http://www.sqlmag.com/articles/index.cfm?articleid=102336 (subscription required.)

    Heh... you apparently haven't read the "running total" article I wrote, have you? 😉

    And Matt Miller and I did a fair amount of test several years back and the T-SQL methods and CLR Regex methods pretty much took turns winning even on some of the Regex Replace stuff. I believe those tests (posted on this forum somewhere) have been lost because both Matt and I have looked for them and can't find them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994555

    Matt Miller (#4) (12/23/2009)


    I had attempted to use something like this to bring in data from an external file. One of those "ugly delimited" files, kind of delimited and kind of not, that I needed to bring in. This would have offered a way to parse it and clean it up in one single shot, and I would simply need to insert the results from the TVF into my destination table, skipping the whole staging table, clean up the data, etc...

    I ended up abandoning it because it was trying to build the entire table in mem before returning it, so the CLR function would crash before getting anything out. So - I ended up going in another direction and got it done that way.

    I had also looked at building the old running totals this way, running into the same problem (order is in fact guaranteed this way so no controversy over documented vs not). I got around that using the CLR SP (since you can stream the results without having to build the entire return set first), but still use the "other way" until I am formally forbidden from doing so , since it is still faster than the CLR SP (although not by a whole lot).

    That's one of the problems that I seem to see popping up a lot... CLR's running the system out of memory. I've not asked any DBA in particular, but I'm thinking that's one of the reasons why many DBA's simply say "No! No CLR's. Not on my box."

    Shifting gears, it would be fun to see the file you were trying to split if it's not proprietary and has no private info in it.

    Heh... it's also good to hear that the "other" method for running aggregates beats the CLR method... thanks for the feedback on that. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994555

    RBarryYoung (12/24/2009)


    Jeff Moden (12/23/2009)


    Next question, please... What application would have the need for such a thing where it could not be done using T-SQL? Not trying to be a smart guy here... I really want to know where folks have used or will use this type of thing and why they couldn't do it in T-SQL.

    Going back to my previous post... no, I wouldn't use a stored proc if a native TVF would do either (of course, "it depends"). I was just providing a direct method for how to use the output of an existing stored proc.

    Actually, Jeff, the line-splitting functions and (less often)parsing functions are usually best done as TVF's, whether SQL or CLR.

    Yep... I agree and because you were involved on the great "splitter" post, you already know I agree. 😉 In SQL Server 2000 and above, I use Inline T-SQL TVF's for the task. We also saw that a super simple CLR could beat that (although the Tally table gave it a really good race on VARCHAR(8000)... One of the few places where CLR's beat T-SQL when there's a solution available in both.

    I do like the idea of someone building a handy "file handler" CLR but it would have the same problems as equivalent code in T-SQL... many OPS folks and many DBA's wouldn't allow it if it could move, delete, create, or rename files unless the code was "protected" from casual use which both can be.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 994555

    Sorry... didn't mean to turn this into a "I hate CLR's" post... I just wanted to know what this "streaming CLR" would be used for and why folks thought it couldn't be done in T-SQL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Solomon Rutzky

    SSCoach

    Points: 16104

    Jeff Moden (12/24/2009)


    That's one of the problems that I seem to see popping up a lot... CLR's running the system out of memory. I've not asked any DBA in particular, but I'm thinking that's one of the reasons why many DBA's simply say "No! No CLR's. Not on my box."

    Hey Jeff. I think the problem of CLR's taking up too much memory is more an issue of technique than of CLR usage in general. Hence why I wrote this article: to help people alleviate memory consumption in the majority of CLR TVFs.

    And I would be curious if you would ask these DBA's about why they say "No". It could just as easily be a case of misinformation and general ignorance regarding CLR usage, especially in terms of security.

    Heh... it's also good to hear that the "other" method for running aggregates beats the CLR method... thanks for the feedback on that. 🙂

    To be fair to the conversation, it is not necessarily that the "other" method was a clear winner, just that he did not try this streaming method (or even the other means of streaming it via full IEnumerator implementation) so it was building the whole collection in memory. A fair test would be for Matt to go back to his code for the CLR TVF and to make a few minor modifications for this new "yield return" syntax and THEN we could see which is better (or maybe just the same).

    And just to be clear since forum posts can easily be misread as overreacting due to no tone of voice (outside of emoticons): I do not think that you are picking on me and I am not being defensive. I am just trying to make sure that this side is being fairly represented.

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • RBarryYoung

    SSC Guru

    Points: 143327

    Jeff Moden (12/24/2009)


    ...That's one of the problems that I seem to see popping up a lot... CLR's running the system out of memory. I've not asked any DBA in particular, but I'm thinking that's one of the reasons why many DBA's simply say "No! No CLR's. Not on my box."...

    I think I saw Paul or Florian talking about that recently: why it happens, how to avoid it, and what version it gets fixed in. I'll see if I can find it...

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung

    SSC Guru

    Points: 143327

    RBarryYoung (12/24/2009)


    Jeff Moden (12/24/2009)


    ...That's one of the problems that I seem to see popping up a lot... CLR's running the system out of memory. I've not asked any DBA in particular, but I'm thinking that's one of the reasons why many DBA's simply say "No! No CLR's. Not on my box."...

    I think I saw Paul or Florian talking about that recently: why it happens, how to avoid it, and what version it gets fixed in. I'll see if I can find it...

    Hmm, now that I think about it, I think Paul (or Flo, can't remember which, sorry...) said pretty much the same things that Matt and Solomon (and Adam) have been saying about it; "Use Streaming".

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

Viewing 15 posts - 16 through 30 (of 61 total)

You must be logged in to reply to this topic. Login to reply