Is there any justification for really using SQL CLR

  • Adam Machanic

    SSCoach

    Points: 15370

    Paul White (2/2/2010)


    The only SQLCLR feature I haven't found a good use for so far is the SQLCLR trigger. Would anyone care if those were removed?

    Agreed there. I think they COULD be useful, however, if some modifications were made:

    https://connect.microsoft.com/SQLServer/feedback/details/265346/provide-properties-to-help-with-common-tasks-in-clr-triggers

    What I would like to see is a SQLCLR analytic operator - something that streams rows in and out - like the Segment and Sequence Project operators used by the likes of RANK and ROW_NUMBER. A sort of combination UDA and STVF...

    Agreed!

    UDAs are already the fastest way to stream rows from SQL Server into a SQLCLR object - blowing away the context connection method.

    Really? I'm surprised to hear that you've seen that much of a difference between the two. What exactly did you test?

    IEnumerable-wrapped SqlDataAdapters are an alternative, but having to use an external connection is a serious downside (as is the fact that 2008 broke this method AFAIK). In an attempt to get as

    I personally don't think the external connection is much of a problem in most common cases. However, I am messing around with some parallel algorithms at the moment and opening four or eight external connections is kind of a pain, plus if I want to operate on a temp table I have to make it global, which totally sucks. Here's a connect item:

    https://connect.microsoft.com/SQLServer/feedback/details/253160/greater-flexibility-when-working-with-sqldatareaders-in-context-connected-table-valued-udfs

    SQL Server 2008 did not break the ability to do the external connection, it simply changed the rules a bit. You're now forced to include a SqlFunctionAttribute with DataAccessKind set to DataAccessKind.Read. In 2005 that option only applied to the context connection. More info:

    https://connect.microsoft.com/SQLServer/feedback/details/442200/sql-server-2008-clr-tvf-data-access-limitations-break-existing-code

    1. Passing LOB streaming input to a SQLCLR TVF always throws a wrong thread exception (!)

    Always?? I have methods that do this that don't throw exceptions; are you referring to UDAs and not TVFs here? I haven't tested there...

    2. UDA output is only produced from the Terminate() method, which is only called when all rows have been passed to the UDA via the Accumulate method (and possibly Merge()). This means that LOB output from a UDA has to be fully built before it can start to stream. Kinda seems to defeat the purpose of providing a streaming interface, really.

    Agreed, but this is necessary because--at least today--we have no guarantee of the order in which the rows will be passed to the UDA.

    3. SQL Server doesn't call Dispose on any stream wrapped by a SqlChars or SqlBytes object. If other operators (like hash join) can spill to disk, why can't I?! 😛

    Whether or not Dispose is called on SqlChars/SqlBytes seems like an implementation detail; what are you doing where you're even noticing this?

    By the way, here's a related issue:

    https://connect.microsoft.com/SQLServer/feedback/details/479611/an-event-or-exception-should-be-raised-in-sqlclr-when-an-attention-event-or-similar-occurs

    As far as I can tell, a T-SQL scalar function that does not access data will always be slower than a SQLCLR equivalent. And, since T-SQL scalar functions should never access data, that's not much of a restriction. T-SQL is interpreted whereas SQLCLR is compiled - so T-SQL seems entirely the wrong tool for scalar functions.

    I think rather than saying that a T-SQL scalar function "will always be slower" it's more accurate to say that the T-SQL version "will never be faster". They can be more or less equivalent, especially when the function isn't doing much (e.g. adding 1 to the input). But the more complex the logic gets, the further ahead SQLCLR versions tend to get. Whether T-SQL is the right or wrong tool is a bigger question and herein lies the problem. The SQL Server team didn't make it especially easy for DBAs and others to work with this stuff and for many groups there is too much of a barrier to entry. Here's a Connect item that attempts to address this:

    https://connect.microsoft.com/SQLServer/feedback/details/265266/add-server-side-compilation-ability-to-sql-clr

    SQLCLR UDTs are very powerful and much underused, though their serialization behaviour could use some work. I was hoping for some optimization in this area for 2008 (which allows UDTs to exceed 8000 bytes) but no. It also amuses me slightly that the Microsoft-provided 2008 SQLCLR UDTs use the UNSAFE permission set...

    Agreed again! And here's another Connect item:

    https://connect.microsoft.com/SQLServer/feedback/details/252228/sqlclr-expose-public-serialization-helper-methods-for-user-defined-udt-serialization

    But what I really want is something along the lines of IComparable. Here's one for that:

    https://connect.microsoft.com/SQLServer/feedback/details/252230/sqlclr-provide-the-ability-to-use-icomparable-or-a-similar-mechanism-for-udts

    --
    Adam Machanic
    whoisactive

  • Adam Machanic

    SSCoach

    Points: 15370

    Adam Machanic (2/2/2010)


    Whether or not Dispose is called on SqlChars/SqlBytes seems like an implementation detail; what are you doing where you're even noticing this?

    By the way, here's a related issue:

    https://connect.microsoft.com/SQLServer/feedback/details/479611/an-event-or-exception-should-be-raised-in-sqlclr-when-an-attention-event-or-similar-occurs

    Paul:

    I saw that you left a note in this item so I added some code to test. Would appreciate any insight you can add with regard to what's going on in this situation. Knowing that my code can leak connections, no matter what I do about it--including following Bob Dorr's advice--is not exactly a wonderful feeling (not that it's keeping me up at night either; it's just one more thing we need to monitor so that it can be corrected when and if it happens).

    --
    Adam Machanic
    whoisactive

  • Paul White

    SSC Guru

    Points: 150442

    Hi Adam, and thanks so much for taking the time to reply so fully! I have looked at the connect items you listed (as you saw) but am struggling for the time to work on a worthwhile reply (in-laws visiting from England...)

    I will certainly take a very interested look at that issue (and the other stuff you wrote about) as soon as I get a little time to myself 😉

    Cheers

    Paul

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/2/2010)


    Paul White


    UDAs are already the fastest way to stream rows from SQL Server into a SQLCLR object - blowing away the context connection method.

    Really? I'm surprised to hear that you've seen that much of a difference between the two. What exactly did you test?

    Here you go...:-)

    Results averaged over ten runs of the procedure and aggregate over the whole SalesOrderDetail table from the AdventureWorks sample database (2008 SR4):

    Procedure : 995ms worker time

    Aggregate: 487ms worker time

    I disabled parallelism since I thought that might give the aggregate an unfair advantage 😉

    My take is that the aggregate is so fast since it is so directly connected to the query plan (inside the stream aggregate operator) whereas the context connection method is a little more indirect and probably requires more marshalling and stuff...anyway, on to the code. Please feel free to suggest improvements or alternatives if you see any flaws.

    -- This sample database is required
    USE AdventureWorks;
    GO

    -- Turn off stuff we don't want to affect the results
    SET NOCOUNT ON;
    SET STATISTICS IO, TIME OFF;
    GO

    -- CLR functionality required
    IF NOT EXISTS(SELECT * FROM sys.configurations WHERE name = N'clr enabled' AND value_in_use = 1)
    BEGIN
        EXECUTE sp_configure 'clr enabled', 1;
        RECONFIGURE;
    END;
    GO

    -- Drop test objects if they weren't dropped on a previous run
    IF OBJECT_ID(N'AggregateTest', N'AF') IS NOT NULL DROP AGGREGATE dbo.AggregateTest;
    IF OBJECT_ID(N'ProcedureTest', N'PC') IS NOT NULL DROP PROCEDURE dbo.ProcedureTest;
    IF EXISTS (SELECT * FROM sys.assemblies WHERE name = N'InputTest') DROP ASSEMBLY InputTest;
    GO

    -- SQLCLR assembly containing the test procedure and aggregate
    CREATE ASSEMBLY [InputTest]
    AUTHORIZATION [dbo]
    FROM 
    WITH PERMISSION_SET = SAFE;
    GO

    -- SQLCLR procedure
    CREATE PROCEDURE dbo.ProcedureTest
    AS EXTERNAL NAME InputTest.StoredProcedures.ProcedureTest;
    GO

    -- SQLCLR aggregate
    CREATE AGGREGATE dbo.AggregateTest
    (
        @SalesOrderID INT,
        @SalesOrderDetailID INT,
        @CarrierTrackingNumber NVARCHAR(25),
        @OrderQty SMALLINT,
        @ProductID INT,
        @SpecialOfferID INT,
        @UnitPrice MONEY,
        @UnitPriceDiscount MONEY,
        @LineTotal DECIMAL(38,6),
        @RowGuid UNIQUEIDENTIFIER,
        @ModifiedDate DATETIME
    )
    RETURNS DECIMAL(38,6)
    EXTERNAL NAME InputTest.AggregateTest;
    GO

    -- Flush dirty buffers to disk
    CHECKPOINT;

    -- Dump the buffer pool
    DBCC DROPCLEANBUFFERS;

    -- Dump the system caches
    DBCC FREESYSTEMCACHE('ALL')
    GO

    -- Variables used as 'bit buckets'
    DECLARE @SalesOrderID INT,
            @SalesOrderDetailID INT,
            @CarrierTrackingNumber NVARCHAR(25),
            @OrderQty SMALLINT,
            @ProductID INT,
            @SpecialOfferID INT,
            @UnitPrice MONEY,
            @UnitPriceDiscount MONEY,
            @LineTotal DECIMAL(38,6),
            @RowGuid UNIQUEIDENTIFIER,
            @ModifiedDate DATETIME;

    -- Warm the buffer pool with all pages from the test table
    SELECT @SalesOrderID = [SalesOrderID],
           @SalesOrderDetailID = [SalesOrderDetailID],
           @CarrierTrackingNumber = [CarrierTrackingNumber],
           @OrderQty = [OrderQty],
           @ProductID = [ProductID],
           @SpecialOfferID = [SpecialOfferID],
           @UnitPrice = [UnitPrice],
           @UnitPriceDiscount = [UnitPriceDiscount],
           @LineTotal = [LineTotal],
           @RowGuid = [rowguid],
           @ModifiedDate = [ModifiedDate]
    FROM AdventureWorks.Sales.SalesOrderDetail;
    GO

    -- Run the aggregate test ten times
    SELECT dbo.AggregateTest(
         [SalesOrderID]
        ,[SalesOrderDetailID]
        ,[CarrierTrackingNumber]
        ,[OrderQty]
        ,[ProductID]
        ,[SpecialOfferID]
        ,[UnitPrice]
        ,[UnitPriceDiscount]
        ,[LineTotal]
        ,[rowguid]
        ,[ModifiedDate])
    FROM AdventureWorks.Sales.SalesOrderDetail
    OPTION (MAXDOP 1); -- Parallelism might give the aggregate an unfair advantage
    GO 10

    -- Test the procedure ten times
    EXECUTE dbo.ProcedureTest;
    GO 10

    -- Results
    SELECT ST.text,
           QS.execution_count,
           avg_elapsed_time_ms = QS.total_elapsed_time / QS.execution_count / 1000,
           avg_logical_reads = QS.total_logical_reads / QS.execution_count,
           avg_cpu_time_ms = QS.total_worker_time / QS.execution_count / 1000
    FROM sys.dm_exec_query_stats QS
    CROSS APPLY sys.dm_exec_sql_text (QS.sql_handle) ST
    WHERE ST.text LIKE '%aggregate test%'
    AND ST.text NOT LIKE '%sys.dm_exec_query_stats%'
    UNION ALL
    SELECT ST.text,
           PS.execution_count,
           avg_elapsed_time_ms = PS.total_elapsed_time / PS.execution_count / 1000,
           avg_logical_reads = PS.total_logical_reads / PS.execution_count,
           avg_cpu_time_ms = PS.total_worker_time / PS.execution_count / 1000
    FROM sys.dm_exec_procedure_stats PS
    CROSS APPLY sys.dm_exec_sql_text (PS.sql_handle) ST
    WHERE ST.text = N'ProcedureTest -- StoredProcedures.ProcedureTest'
    GO

    -- Tidy up
    IF OBJECT_ID(N'AggregateTest', N'AF') IS NOT NULL DROP AGGREGATE dbo.AggregateTest;
    IF OBJECT_ID(N'ProcedureTest', N'PC') IS NOT NULL DROP PROCEDURE dbo.ProcedureTest;
    IF EXISTS (SELECT * FROM sys.assemblies WHERE name = N'InputTest') DROP ASSEMBLY InputTest;
    GO

    The C# code follows for anyone who prefers to compile it for themselves (procedure first, then aggregate):

    using System.Data;
    using System.Data.SqlClient;
    using Microsoft.SqlServer.Server;

    public partial class StoredProcedures
    {
        [Microsoft.SqlServer.Server.SqlProcedure]
        public static void ProcedureTest()
        {
            decimal total = 0M;

            using (SqlConnection conn = new SqlConnection("context connection=true;"))
            {
                SqlCommand comm = new SqlCommand();
                comm.Connection = conn;
                comm.CommandText = @"" +
                    "SELECT [SalesOrderID]" +
                    ",[SalesOrderDetailID]" +
                    ",[CarrierTrackingNumber]" +
                    ",[OrderQty]" +
                    ",[ProductID]" +
                    ",[SpecialOfferID]" +
                    ",[UnitPrice]" +
                    ",[UnitPriceDiscount]" +
                    ",[LineTotal]" +
                    ",[rowguid]" +
                    ",[ModifiedDate]" +
                    " FROM [AdventureWorks].[Sales].[SalesOrderDetail]";

                conn.Open();

                SqlDataReader reader = comm.ExecuteReader();
                int ordinal = reader.GetOrdinal("LineTotal");

                // Sum LineTotal over every row returned on the context connection
                while (reader.Read())
                {
                    total += (decimal)reader[ordinal];
                }

                // Return the total to the caller as a single-row result set
                SqlDataRecord sdr = new SqlDataRecord(new SqlMetaData[] { new SqlMetaData("Total", SqlDbType.Decimal, 38, 6) });
                sdr.SetDecimal(0, total);
                SqlContext.Pipe.Send(sdr);
            }
        }
    };

    using System;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;

    [Serializable]
    [SqlUserDefinedAggregate
        (
            Format.UserDefined,
            MaxByteSize = 16,
            IsInvariantToDuplicates = false,
            IsInvariantToNulls = true,
            IsInvariantToOrder = true,
            IsNullIfEmpty = true
        )
    ]
    public struct AggregateTest : IBinarySerialize
    {
        private decimal _total;

        public void Init()
        {
            _total = 0M;
        }

        public void Accumulate
            (
                SqlInt32 SalesOrderID,
                SqlInt32 SalesOrderDetailID,
                SqlString CarrierTrackingNumber,
                SqlInt16 OrderQty,
                SqlInt32 ProductID,
                SqlInt32 SpecialOfferID,
                SqlMoney UnitPrice,
                SqlMoney UnitPriceDiscount,
                SqlDecimal LineTotal,
                SqlGuid RowGuid,
                SqlDateTime ModifiedDate
            )
        {
            this._total += LineTotal.Value;
        }

        public void Merge(AggregateTest Other)
        {
            this._total += Other._total;
        }

        public SqlDecimal Terminate()
        {
            return new SqlDecimal(this._total);
        }

        #region IBinarySerialize Members

        void IBinarySerialize.Read(System.IO.BinaryReader r)
        {
            this._total = r.ReadDecimal();
        }

        void IBinarySerialize.Write(System.IO.BinaryWriter w)
        {
            w.Write(this._total);
        }

        #endregion
    }

    Paul

    edit: for code formatting, as usual

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/2/2010)


    Paul White (2/2/2010)


    Passing LOB streaming input to a SQLCLR TVF always throws a wrong thread exception

    Always?? I have methods that do this that don't throw exceptions; are you referring to UDAs and not TVFs here? I haven't tested there...

    I should have known better than to say 'always' - I know it wasn't clear, but I was referring to my recent experiences trying to pass the streaming LOB output from one SQLCLR component (an aggregate, as it happens) to a SQLCLR TVF. Passing 'ordinary' LOBs into a SQLCLR TVF is fine, of course 🙂

    Sorry about the confusion there - I am still quite emotional about the amount of time and effort I 'wasted' on pursuing that idea only to be hit by the 'wrong thread' nonsense.

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/2/2010)


    Paul White (2/2/2010)


    ...This means that LOB output from a UDA has to be fully built before it can start to stream. Kinda seems to defeat the purpose of providing a streaming interface, really.

    Agreed, but this is necessary because--at least today--we have no guarantee of the order in which the rows will be passed to the UDA.

    Absolutely - and the sooner IsInvariantToOrder is implemented the better.

    What I was getting at though, is that without this order guarantee, I am quite happy to do a ROW_NUMBER OVER in the outer query to provide an ordering context for the UDA. If I am sensible with indexing, I can engineer it such that the ranking function is an extremely low-cost addition to the plan.

    Now that a UDA can accept multiple parameters, I can write: SELECT dbo.UDA(@value, @row_number) and handle ordering issues inside the Accumulate method. The ranking function will tend to order the rows, and while I can't rely on that order being preserved, I can optimize for it.

    So, I can write the Accumulate method to 'stream' if rows are received in natural or reverse sequence. Worst case, the plan somehow generates a 'random' row order into the UDA (a hash partitioning exchange for example) and my optimization fails. The UDA can then degrade gracefully, falling back to a less efficient algorithm that inevitably consumes more memory.

    This does complicate the code, but the potential benefits made it seem worthwhile.
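To make the idea above concrete, here is a minimal sketch in plain C# (run outside SQL Server) of an accumulator that passes rows straight through when they arrive in row-number order and buffers them only when they do not. `OrderedAccumulator`, `PeakBuffered`, and the `emit` callback are hypothetical names invented for illustration. A real UDA would be a serializable type with Init/Accumulate/Merge/Terminate, and `emit` merely stands in for appending to whatever output the aggregate is building. The sketch assumes row numbers start at 1 with no gaps, and it shows only the natural-order fast path, not the reverse-order case.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any SQLCLR API. It mimics what a UDA's
// Accumulate(value, rowNumber) could do with a caller-supplied row number.
public sealed class OrderedAccumulator
{
    private readonly Action<string> _emit;   // receives values in row-number order
    private readonly Dictionary<long, string> _pending = new Dictionary<long, string>();
    private long _next = 1;                  // next row number that can be emitted

    public OrderedAccumulator(Action<string> emit)
    {
        _emit = emit;
    }

    // Largest number of rows that had to be held in memory at once.
    public int PeakBuffered { get; private set; }

    public void Accumulate(string value, long rowNumber)
    {
        if (rowNumber != _next)
        {
            // Out of sequence: fall back to buffering until the gap is filled.
            _pending[rowNumber] = value;
            PeakBuffered = Math.Max(PeakBuffered, _pending.Count);
            return;
        }

        // Fast path: the row arrived in natural order, so pass it straight on.
        _emit(value);
        _next++;

        // Drain any buffered rows that have now become contiguous.
        string buffered;
        while (_pending.TryGetValue(_next, out buffered))
        {
            _pending.Remove(_next);
            _emit(buffered);
            _next++;
        }
    }
}
```

When rows arrive as 1, 2, 3, ... nothing is ever buffered. A fully shuffled input degrades toward holding most of the set in `_pending`, which is the graceful fallback described above.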

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/2/2010)


    Paul White (2/2/2010)


    SQL Server doesn't call Dispose on any stream wrapped by a SqlChars or SqlBytes object.

    Whether or not Dispose is called on SqlChars/SqlBytes seems like an implementation detail; what are you doing where you're even noticing this?

    Experimenting... :w00t:

    Let me start by saying that this was entirely for my own learning purposes, and never intended for serious use!

    It's the row-ordering input to a UDA problem again. Worst case, as I mentioned, is that rows arrive in an order which maximizes the number of rows I need to cache in Accumulate() - and maybe Merge(). Math is not my strong point, but I think the worst case involves me caching (N/2) - 1 rows for an N-row input set. For a large number of rows, I'm going to hurt the server if I try to do that in memory.

    SqlChars and SqlBytes can wrap any System.IO.Stream - so in the very worst case, with millions of rows arriving in the worst order possible, I'd like my UDA to act a bit like a hash join that exceeds its memory grant: use a bit of disk space, and ultimately bail out completely by writing the whole set to disk via a (buffered) FileStream.

    Bailing to disk is a bit pointless if I just have to read the whole file into memory inside Terminate(), so I wanted to return a SqlBytes wrapping the still-open FileStream from Terminate, and let the TVF deal with it.

    This is where the need for a Dispose() comes in - to release the file handle and enable the file to delete itself (assuming the stream was created with that option). It just would have been nice if SQL Server had been a good citizen and always called Dispose() on classes derived from Stream. Stream does implement IDisposable, after all...

    Talking of implementation details, it might interest/horrify you to know that I did get this to work - passing a FileStream reference inside SqlBytes from a UDA to a TVF.

    It turns out that SQL Server reads the stream inside the SqlBytes asynchronously - via the BeginRead and EndRead methods. By creating a new class that inherited from FileStream, and overriding the EndRead method, I was able to call Dispose() at the correct time. In the overridden method, I just check the Position of the stream against its length, and dispose it if it is at the end.

    Once dispose was called, the file closed and deleted itself, and everything worked. Yes, I know, I know, you don't have to say anything; I'm aware of how crazy this is, but it was, after all, done in the name of Science!
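A minimal sketch of the EndRead trick described above, written as plain C# that runs outside SQL Server. This is not the code from the original experiment. The class name `SelfDeletingFileStream` is hypothetical, and the use of `FileOptions.DeleteOnClose` is an assumption about what "created with that option" refers to. It also assumes the consumer stops reading once it has received `Length` bytes, since any further read would hit a disposed stream.

```csharp
using System;
using System.IO;

// Hypothetical class: a FileStream that disposes itself (and, through
// DeleteOnClose, removes its file) once an asynchronous read reaches the end.
public sealed class SelfDeletingFileStream : FileStream
{
    public SelfDeletingFileStream(string path)
        : base(path, FileMode.Open, FileAccess.Read, FileShare.None,
               4096, FileOptions.DeleteOnClose)
    {
    }

    public override int EndRead(IAsyncResult asyncResult)
    {
        int bytesRead = base.EndRead(asyncResult);

        // Compare the position with the length, as described in the post.
        // At the end of the file, release the handle so the file is deleted.
        if (Position >= Length)
        {
            Dispose();
        }

        return bytesRead;
    }
}
```

A reader that drives the stream with BeginRead/EndRead never has to call Dispose itself. The cost is that the stream's lifetime now depends on the reader's access pattern, which is why this is best treated as an experiment rather than a pattern to copy.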

    Paul

  • Paul White

    SSC Guru

    Points: 150442

    Last post for tonight (which is now very much this morning). I read the Connect items with great interest, and voted on all except the trigger one. I am still mulling that one over. The ThreadAbortException thing will be tomorrow evening's project I think. Nice to have something interesting to look at.

    Paul

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/3/2010)


    I saw that you left a note in this item so I added some code to test. Would appreciate any insight you can add with regard to what's going on in this situation. Knowing that my code can leak connections, no matter what I do about it--including following Bob Dorr's advice--is not exactly a wonderful feeling (not that it's keeping me up at night either; it's just one more thing we need to monitor so that it can be corrected when and if it happens).

    After a false start, I remembered the reason and solution from a few years ago:

    A ThreadAbortException is thrown when the client sends an Attention (to cancel the batch), but it is only received by the foreground thread. When the SqlCommand is executing, it is running on the foreground thread, and totally ignores the exception (which makes sense if you think about it).

    So, what we need to do is get the execution onto a background thread, and wait for it to complete. This way, our code waits on the foreground thread and can catch the ThreadAbortException. Handling it is easy - we just call Cancel on the SqlCommand and tidy up.

    The procedure responds instantly to the Attention, and cleans up the connection correctly.

    I posted code on your Connect item, but I'll reproduce it here just in case you look at this reply first. It's not beautiful code, but it'll have to do.

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;
    using System.Threading;

    public partial class StoredProcedures
    {
        [Microsoft.SqlServer.Server.SqlProcedure]
        public static void start_waiting()
        {
            using (SWImpl swi = new SWImpl())
            {
                swi.start_waiting();
            }
        }

        private class SWImpl : IDisposable
        {
            ManualResetEvent mre;
            SqlConnection conn;
            SqlCommand comm;

            public void start_waiting()
            {
                SqlConnectionStringBuilder sb = new SqlConnectionStringBuilder();
                sb.DataSource = @".\SQL2008";
                sb.IntegratedSecurity = true;
                sb.Enlist = false;

                conn = new SqlConnection(sb.ConnectionString);
                conn.Open();

                comm = new SqlCommand();
                comm.Connection = conn;
                comm.CommandText = "WAITFOR DELAY '00:00:15'";

                // Queue the command execution on a background thread
                // The manual reset event will be signalled when it completes
                mre = new ManualResetEvent(false);
                ThreadPool.QueueUserWorkItem(new WaitCallback(delegate { comm.ExecuteNonQuery(); mre.Set(); }));

                try
                {
                    // Wait for the command to complete
                    mre.WaitOne(30000);
                }
                catch (System.Threading.ThreadAbortException)
                {
                    // Maybe an attention, in any case, cancel the command
                    comm.Cancel();
                }
                finally
                {
                    // Clean up
                    this.Dispose();
                }
            }

            #region IDisposable Members

            public void Dispose()
            {
                if (this.comm != null)
                {
                    this.comm.Dispose();
                }

                if (this.conn != null)
                {
                    SqlConnection.ClearPool(this.conn);
                    this.conn.Dispose();
                }

                this.comm = null;
                this.conn = null;
            }

            #endregion
        }
    };

    Paul

  • Paul White

    SSC Guru

    Points: 150442

    Pedro DeRose [MSFT] (1/22/2010)


    Which, I think, may be a good slogan for SQLCLR: better than cutting boards with a hammer. 🙂

    Do you mind if I adopt that gem into my signature? Love it.

  • Adam Machanic

    SSCoach

    Points: 15370

    Paul White (2/5/2010)


    So, what we need to do is get the execution onto a background thread, and wait for it to complete.

    Hi Paul,

    Might want to re-read what Bob Dorr has to say about this technique:

    Tricky but Stupid - Don't do this.

    😀

    http://blogs.msdn.com/psssql/archive/2009/12/15/how-it-works-are-you-handling-cancels-correctly-in-your-sqlclr-code.aspx

    In Bob's case he was only working with a single background thread; in this example using the background thread is actually 50% worse because we're already spinning up a new thread for the linkback connection. So now there are three threads in play, per request... Not going to work too well if your app has to handle a lot of concurrent requests.

    --
    Adam Machanic
    whoisactive

  • Adam Machanic

    SSCoach

    Points: 15370

    Paul White (2/4/2010)


    Results averaged over ten runs of the procedure and aggregate over the whole SalesOrderDetail table from the AdventureWorks sample database (2008 SR4):

    Procedure : 995ms worker time

    Aggregate: 487ms worker time

    Your example shows that an aggregate can aggregate faster than a stored procedure, but is that really surprising? The stored procedure needs to do data access over the context connection, whereas the aggregate simply takes values directly from the query processor. What I expected when you mentioned this was some kind of streaming example where the procedure and aggregate would each spit out a set of rows that had been manipulated in some way--some kind of scenario where someone might actually consider whether to use an aggregate or a procedure.

    --
    Adam Machanic
    whoisactive

  • Adam Machanic

    SSCoach

    Points: 15370

    Paul White (2/4/2010)


    Now that a UDA can accept multiple parameters, I can write: SELECT dbo.UDA(@value, @row_number) and handle ordering issues inside the Accumulate method. The ranking function will tend to order the rows, and while I can't rely on that order being preserved, I can optimize for it.

    This is certainly an interesting and promising idea and I'll have to play with it a bit. I'm not sure, however, whether it helps to solve windowing problems, which is what I had in mind for ordered aggregates. If you need a 1:1 "aggregation," like a running sum, you have to group by something distinct like the row number. But if you do that the QP will spin up a new instance of the aggregate per row, which means you won't have access to the previous row's value. And if you group by anything else you'll have fewer rows in the output than you will in the input. Have you gotten around that somehow? Or do you have some other class of problem where you're able to leverage this?

    --
    Adam Machanic
    whoisactive

  • Adam Machanic

    SSCoach

    Points: 15370

    Paul White (2/4/2010)


    SqlChars and SqlBytes can wrap any System.IO.Stream - so in the very worst case, with millions of rows arriving in the worst order possible, I'd like my UDA to act a bit like a hash join that exceeds its memory grant: use a bit of disk space, and ultimately bail out completely by writing the whole set to disk via a (buffered) FileStream.

    Agreed, but don't you think the SQLCLR hosting environment should handle this itself and spill to tempdb just like everything else in SQL Server does? Pedro, are you listening? 🙂

    --
    Adam Machanic
    whoisactive

  • Paul White

    SSC Guru

    Points: 150442

    Adam Machanic (2/5/2010)


    Hi Paul,

    Might want to re-read what Bob Dorr has to say about this technique:

    Tricky but Stupid - Don't do this.

    😀

    http://blogs.msdn.com/psssql/archive/2009/12/15/how-it-works-are-you-handling-cancels-correctly-in-your-sqlclr-code.aspx

    In Bob's case he was only working with a single background thread; in this example using the background thread is actually 50% worse because we're already spinning up a new thread for the linkback connection. So now there are three threads in play, per request... Not going to work too well if your app has to handle a lot of concurrent requests.

    Well yes that is true - and I am sorry if I misunderstood what you were trying to achieve.

    When I read Bob's remarks, I did treat that as a separate issue because he was talking about doing rather more esoteric things than just calling Execute on a SqlCommand.

    For the code presented in your connect item, I still think this is the best we can do currently. ThreadPool isn't too bad since its size is fixed and the threads tend to be around anyway. One can always code a manager class to stop things getting too out of hand.
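One possible shape for such a manager class, as a rough sketch only: it caps how many background work items run at once and refuses new work if no slot frees up in time. `BoundedBackgroundRunner` and `TryRun` are hypothetical names, and the sketch is plain C# that has not been tried inside SQL Server, where threading and synchronization code needs an elevated permission set. It also assumes `work` does not throw, since an unhandled exception on a pool thread can bring down the process, and that the runner is not closed while work is still in flight.

```csharp
using System;
using System.Threading;

// Hypothetical manager class: limits the number of concurrently running
// background work items queued through it.
public sealed class BoundedBackgroundRunner : IDisposable
{
    private readonly Semaphore _slots;

    public BoundedBackgroundRunner(int maxConcurrent)
    {
        _slots = new Semaphore(maxConcurrent, maxConcurrent);
    }

    // Queues work on the ThreadPool if a slot becomes free within timeoutMs.
    // Returns an event that is signalled when the work finishes, or null if
    // no slot became available in time.
    public ManualResetEvent TryRun(Action work, int timeoutMs)
    {
        if (!_slots.WaitOne(timeoutMs))
        {
            return null;
        }

        ManualResetEvent done = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem(new WaitCallback(delegate
        {
            try
            {
                work();
            }
            finally
            {
                _slots.Release();   // free the slot for the next caller
                done.Set();         // let the waiting foreground thread continue
            }
        }));
        return done;
    }

    public void Dispose()
    {
        _slots.Close();
    }
}
```

The foreground thread would wait on the returned event much as the procedure above waits on its ManualResetEvent, so the ThreadAbortException handling stays where it is. A null return is the point at which a caller could fail fast instead of queuing yet another thread's worth of work.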

    Paul

