CLR, to be or not to be, that is the question!

  • Hi,

    A developer approached me in regards for sys.fn_varbintohexstr function they are using at the moment saying it is slow in production: 60 seconds for 15k records (though fast in dev: 3 seconds) and the time increases with each run until the server is rebooted. He wanted to replace this unsupported and discontinued in SQL 2008 function with an assembly, but for this CLR has to be enabled on a production server, which is not desirable, as this will open the gate for the whole swarm of assemblies into the production server.

    Just wondering if there are any other options for this function? Could this functionality be developed on an application server using C# or something? Or is there anything available in Integration Services?

    Thanks.

  • Roust i'm still trying to get a working example going, it's early and pre-cafienne for me.

    perforance like this will be slow because you are using a scalar function 15K times; i think the thing to do is to get a copy of the function, change it to be a table value function instead, and then use cross apply against the data;

    i've see this become an order of magnitude faster in some other examples;

    i feel like Amadeus Motzarts protege, where i understand the concept in theory, but cannot do it myself...just smart enough and all that...

    I'll post what i build soon, but this might inspire others.

    sp_helptext fn_varbintohexstr

    --results

    create function sys.fn_varbintohexstr

    (

    @pbinin varbinary(max)

    )

    returns nvarchar(max)

    as

    begin

    return sys.fn_varbintohexsubstring(1,@pbinin,1,0)

    end

    sp_helptext fn_varbintohexsubstring

    --results

    create function sys.fn_varbintohexsubstring (

    @fsetprefix bit = 1 -- append '0x' to the output

    ,@pbinin varbinary(max) -- input binary stream

    ,@startoffset int = 1 -- starting offset

    ,@cbytesin int = 0 -- length of input to consider, 0 means total length

    )

    returns nvarchar(max)

    as

    begin

    declare @pstrout nvarchar(max)

    ,@i int

    ,@firstnibble int

    ,@secondnibble int

    ,@tempint int

    ,@hexstring char(16)

    --

    -- initialize and validate

    --

    if (@pbinin IS NOT NULL)

    begin

    select @i = 0

    ,@cbytesin = case when (@cbytesin > 0 and @cbytesin <= DATALENGTH(@pbinin) ) then @cbytesin else DATALENGTH(@pbinin) end

    ,@pstrout = case when (@fsetprefix = 1) then N'0x' else N'' end

    ,@hexstring = '0123456789abcdef'

    --the output limit for nvarchar(max) is 2147483648 (2^31) bytes, that is 1073741824 (2^30) unicode characters

    if ( ((@cbytesin * 2) + 2 > 1073741824) or ((@cbytesin * 2) + 2 < 1) or ( @cbytesin is null ))

    return NULL

    if ( ( @startoffset > DATALENGTH(@pbinin) ) or ( @startoffset < 1 ) or ( @startoffset is null ))

    return NULL

    --

    -- adjust the length to process based on start offset and

    -- total length

    --

    if ((DATALENGTH(@pbinin) - @startoffset + 1) < @cbytesin)

    select @cbytesin = DATALENGTH(@pbinin) - @startoffset + 1

    --

    -- do for each byte

    --

    while (@i < @cbytesin)

    begin

    --

    -- Each byte has two nibbles

    -- which we convert to character

    --

    select @tempint = cast(substring(@pbinin, @i + @startoffset, 1) as int)

    select @firstnibble = @tempint / 16

    select @secondnibble = @tempint % 16

    --

    -- we need to do an explicit cast with substring

    -- for proper string conversion.

    --

    select @pstrout = @pstrout +

    cast(substring(@hexstring, (@firstnibble+1), 1) as nvarchar) +

    cast(substring(@hexstring, (@secondnibble+1), 1) as nvarchar)

    select @i = @i + 1

    end

    end

    -- All done

    return @pstrout

    end

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Another issue we have with this function is that it fails after few days without SQL instance restart. When we restart it it works until SQL Server uses its maximum memory and then the job fails again. Why this would happen?

  • Roust_m (3/1/2010)


    Hi,

    A developer approached me in regards for sys.fn_varbintohexstr function they are using at the moment saying it is slow in production: 60 seconds for 15k records (though fast in dev: 3 seconds) and the time increases with each run until the server is rebooted. He wanted to replace this unsupported and discontinued in SQL 2008 function with an assembly, but for this CLR has to be enabled on a production server, which is not desirable, as this will open the gate for the whole swarm of assemblies into the production server.

    Just wondering if there are any other options for this function? Could this functionality be developed on an application server using C# or something? Or is there anything available in Integration Services?

    Thanks.

    As you said, you could write a CLR for this and it would probably do well... but you really don't need to. The reason why the MS function is so comparatively slow is because their code uses a bloody While Loop (Lowell's post is a copy of the MS subfunction that makes it work).

    I have a faster solution (I think) but I need to know... what's the maximum number of bytes you're looking for?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Never mind... decided to make a "direct" replacement so you don't necessarily have to change code. Read the comment in the code header below...

    CREATE FUNCTION dbo.fn_varbintohexstr

    /*****************************************************************************************

    Purpose:

    Replacement for undocumented Master.sys.fn_varbintohexstr function.

    Can probably be created as "sys.fn_varbintohexstr" instead of "dbo.fn_varbintohexstr"

    to create a direct replacement that requires no changes to code. Just need to grant

    privs to the public to use it.

    Revision History:

    Rev 00 - 03 Mar 2010 - Initial creation and test

    *****************************************************************************************/

    (@pbinin VARBINARY(MAX))

    RETURNS TABLE AS

    RETURN ( --======= Nested CTE provides base numbers from 1 to 10E16.

    WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1),--10

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000

    E8(N) AS (SELECT 1 FROM E4 a, E4 b), --100,000,000

    E16(N) AS (SELECT 1 FROM E8 a, E8 b), --10,000,000,000,000,000

    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E16)

    SELECT '0x'

    + ( --=== Converts at the character level and concatenates result

    SELECT SUBSTRING('0123456789abcdef',SUBSTRING(@pbinin, t.N,1)/16+1,1)

    + SUBSTRING('0123456789abcdef',SUBSTRING(@pbinin, t.N,1)%16+1,1)

    FROM cteTally t

    WHERE t.N <= LEN(@pbinin)

    FOR XML PATH('')

    ) AS HexString

    )

    ;

    Might as well (short) test it while we're at it... I didn't test it past 8k bytes but this is what Lowell meant when he was talking about the use of CROSS APPLY...

    --=======================================================================================================

    -- This section just sets up a test table. This is NOT a part of the solution.

    --=======================================================================================================

    --===== Conditionally drop the test table to make test reruns easier.

    IF OBJECT_ID('TempDB..#VarbinaryStuff','U') IS NOT NULL

    DROP TABLE #VarbinaryStuff

    ;

    --===== Create the test table

    CREATE TABLE #VarbinaryStuff

    (

    RowNum INT IDENTITY(1,1),

    VarbinaryValue VARBINARY(8000)

    )

    ;

    --===== Populate the test table with some data

    INSERT INTO #VarbinaryStuff

    (VarbinaryValue)

    SELECT CONVERT(VARBINARY(8000),'This is a test of VarBinary to HEX.') UNION ALL

    SELECT CONVERT(VARBINARY(8000),'So is this.') UNION ALL

    SELECT CONVERT(VARBINARY(8000),'Now is the time for all good men to come to the <extreme> aid of their country.')

    ;

    --=======================================================================================================

    -- Show what's in the test table and what the function returns (the solution)

    --=======================================================================================================

    --===== Show what's in the test table and what the function returns

    SELECT vbs.*, hex.*

    FROM #VarbinaryStuff vbs

    CROSS APPLY (SELECT * FROM dbo.fn_varbintohexstr(vbs.VarbinaryValue)) hex

    ;

    Let me know how it works out for you, please.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I forgot to mention that if you don't know what a Tally table is (CTE in this case) or how it works to replace certain While Loops, please read the following article...

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Relatively short test (15k conversions of NEWID()) on my box came in at about 5 seconds on an 8 year old p4 1.8GHz... should do a lot better on your production and dev servers.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah... one final note. If you can live within the bounds of VARBINARY(8000) and change the input parameter datatype from VARCHAR(MAX) to VARCHAR(8000), then you'll get roughly 3 times the speed from the function.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Roust_m (3/2/2010)


    Another issue we have with this function is that it fails after few days without SQL instance restart. When we restart it it works until SQL Server uses its maximum memory and then the job fails again. Why this would happen?

    The function it self won't do that. How you're using it might. For example, if you're using it with sp_OA* routines, there may be a memory leak (dunno if they've fixed that yet or not).

    My question would be, what gives you an indiction that the function is causing the problem?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/3/2010)


    Roust_m (3/1/2010)


    Hi,

    A developer approached me in regards for sys.fn_varbintohexstr function they are using at the moment saying it is slow in production: 60 seconds for 15k records (though fast in dev: 3 seconds) and the time increases with each run until the server is rebooted. He wanted to replace this unsupported and discontinued in SQL 2008 function with an assembly, but for this CLR has to be enabled on a production server, which is not desirable, as this will open the gate for the whole swarm of assemblies into the production server.

    Just wondering if there are any other options for this function? Could this functionality be developed on an application server using C# or something? Or is there anything available in Integration Services?

    Thanks.

    As you said, you could write a CLR for this and it would probably do well... but you really don't need to. The reason why the MS function is so comparatively slow is because their code uses a bloody While Loop (Lowell's post is a copy of the MS subfunction that makes it work).

    I have a faster solution (I think) but I need to know... what's the maximum number of bytes you're looking for?

    Ok, below is the query we use. It uses GetHash function which in turn calls fn_varbintohexstr in the following fashion:

    RETURN (SELECT sys.fn_varbintohexstr(HashBytes('SHA1', @ValueToCompute)))

    SELECT Model.GetHash

    (

    ISNULL(CONVERT(VARCHAR, Model.Party_Address.Party_ID), '')

    + UPPER(LTRIM(RTRIM(ISNULL(Model.Party_Address.Address_Type_Code, ''))))

    + CASE WHEN Model.Party_Address.Address_Line1_Text IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.Address_Line1_Text))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.Address_Line1_Text))) + ' ' END

    + CASE WHEN Model.Party_Address.Address_Line2_Text IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.Address_Line2_Text))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.Address_Line2_Text))) + ' ' END

    + CASE WHEN Model.Party_Address.Suburb_Text IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.Suburb_Text))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.Suburb_Text))) + ' ' END

    + CASE WHEN Model.Party_Address.State_Text IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.State_Text))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.State_Text))) + ' ' END

    + CASE WHEN Model.Party_Address.Post_Code IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.Post_Code))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.Post_Code))) + ' ' END

    + CASE WHEN Model.Party_Address.Country_Code IS NULL OR LEN(RTRIM(LTRIM(Model.Party_Address.Country_Code))) = 0 THEN '' ELSE UPPER(LTRIM(RTRIM(Model.Party_Address.Country_Code))) END

    )

    , Model.Party_Address.Party_Address_ID

    FROM Model.Party_Address WITH (NOLOCK)

    INNER JOIN Translation.Party WITH (NOLOCK) ON Model.Party_Address.Party_ID = Translation.Party.Party_ID

    WHERE (Translation.Party.Source_Code = @SourceCode)

    So the size of the value is pretty big...

  • Jeff Moden (3/3/2010)


    Never mind... decided to make a "direct" replacement so you don't necessarily have to change code. Read the comment in the code header below...

    CREATE FUNCTION dbo.fn_varbintohexstr

    /*****************************************************************************************

    Purpose:

    Replacement for undocumented Master.sys.fn_varbintohexstr function.

    Can probably be created as "sys.fn_varbintohexstr" instead of "dbo.fn_varbintohexstr"

    to create a direct replacement that requires no changes to code. Just need to grant

    privs to the public to use it.

    Revision History:

    Rev 00 - 03 Mar 2010 - Initial creation and test

    *****************************************************************************************/

    (@pbinin VARBINARY(MAX))

    RETURNS TABLE AS

    RETURN ( --======= Nested CTE provides base numbers from 1 to 10E16.

    WITH E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1),--10

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000

    E8(N) AS (SELECT 1 FROM E4 a, E4 b), --100,000,000

    E16(N) AS (SELECT 1 FROM E8 a, E8 b), --10,000,000,000,000,000

    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY N) FROM E16)

    SELECT '0x'

    + ( --=== Converts at the character level and concatenates result

    SELECT SUBSTRING('0123456789abcdef',SUBSTRING(@pbinin, t.N,1)/16+1,1)

    + SUBSTRING('0123456789abcdef',SUBSTRING(@pbinin, t.N,1)%16+1,1)

    FROM cteTally t

    WHERE t.N <= LEN(@pbinin)

    FOR XML PATH('')

    ) AS HexString

    )

    ;

    Might as well (short) test it while we're at it... I didn't test it past 8k bytes but this is what Lowell meant when he was talking about the use of CROSS APPLY...

    --=======================================================================================================

    -- This section just sets up a test table. This is NOT a part of the solution.

    --=======================================================================================================

    --===== Conditionally drop the test table to make test reruns easier.

    IF OBJECT_ID('TempDB..#VarbinaryStuff','U') IS NOT NULL

    DROP TABLE #VarbinaryStuff

    ;

    --===== Create the test table

    CREATE TABLE #VarbinaryStuff

    (

    RowNum INT IDENTITY(1,1),

    VarbinaryValue VARBINARY(8000)

    )

    ;

    --===== Populate the test table with some data

    INSERT INTO #VarbinaryStuff

    (VarbinaryValue)

    SELECT CONVERT(VARBINARY(8000),'This is a test of VarBinary to HEX.') UNION ALL

    SELECT CONVERT(VARBINARY(8000),'So is this.') UNION ALL

    SELECT CONVERT(VARBINARY(8000),'Now is the time for all good men to come to the <extreme> aid of their country.')

    ;

    --=======================================================================================================

    -- Show what's in the test table and what the function returns (the solution)

    --=======================================================================================================

    --===== Show what's in the test table and what the function returns

    SELECT vbs.*, hex.*

    FROM #VarbinaryStuff vbs

    CROSS APPLY (SELECT * FROM dbo.fn_varbintohexstr(vbs.VarbinaryValue)) hex

    ;

    Let me know how it works out for you, please.

    Thanks, I will check it out.

  • Jeff Moden (3/3/2010)


    Roust_m (3/2/2010)


    Another issue we have with this function is that it fails after few days without SQL instance restart. When we restart it it works until SQL Server uses its maximum memory and then the job fails again. Why this would happen?

    The function it self won't do that. How you're using it might. For example, if you're using it with sp_OA* routines, there may be a memory leak (dunno if they've fixed that yet or not).

    My question would be, what gives you an indiction that the function is causing the problem?

    It runs for few days, then the job using it fails while executing this query with the function. We restart the instance and it starts working again...

  • Roust_m (3/3/2010)


    Jeff Moden (3/3/2010)


    Roust_m (3/2/2010)


    Another issue we have with this function is that it fails after few days without SQL instance restart. When we restart it it works until SQL Server uses its maximum memory and then the job fails again. Why this would happen?

    The function it self won't do that. How you're using it might. For example, if you're using it with sp_OA* routines, there may be a memory leak (dunno if they've fixed that yet or not).

    My question would be, what gives you an indiction that the function is causing the problem?

    It runs for few days, then the job using it fails while executing this query with the function. We restart the instance and it starts working again...

    It might be the query that's using the function but I don't believe it's the function itself that's causing the problem. Just out of curiosity, can you post the query?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've posted it already: Post #876546

  • Roust_m (3/3/2010)


    I've posted it already: Post #876546

    I saw that and didn't see the function being used in the code and figured it was an incomplete listing. I don't see anything in the code you posted nor the return statement you posted that would cause what looks to be a memory leak unless the hash function has one in it.

    I'm also curious why you would convert a perfectly good hash-code to text.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply