CLR, to be or not to be, that is the question!

  • Jeff Moden (3/4/2010)


    Roust_m (3/3/2010)


    I've posted it already: Post #876546

    I saw that and didn't see the function being used in the code and figured it was an incomplete listing. I don't see anything in the code you posted nor the return statement you posted that would cause what looks to be a memory leak unless the hash function has one in it.

    I'm also curious why you would convert a perfectly good hash-code to text.

    Ok, as I mention in my post, the query uses GetHash function which in turn calls fn_varbintohexstr in the following fashion:

    RETURN (SELECT sys.fn_varbintohexstr(HashBytes('SHA1', @ValueToCompute)))

    I did not quite understand the question about the perfect hash code conversion. Where am I converting this?

  • Hey Jeff,

    There's a faster method than that in 2005 (and I don't mean CLR):

    USE tempdb;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    CREATE FUNCTION dbo.ToHex

    (

    @ToConvert VARBINARY(MAX)

    )

    RETURNS TABLE

    WITH SCHEMABINDING

    AS RETURN

    SELECT converted =

    '0x' +

    CONVERT(XML, N'').value('xs:hexBinary(sql:variable("@ToConvert"))', 'VARCHAR(MAX)');

    GO

    CREATE TABLE #Binary

    (

    row_id INTEGER IDENTITY(1,1) NOT NULL,

    value VARBINARY(MAX) NOT NULL

    );

    -- Test data (50,000 GUIDs)

    INSERT #Binary

    (value)

    SELECT CONVERT(VARBINARY(MAX), Random.data)

    FROM (

    SELECT TOP (100000)

    NEWID()

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3

    ) Random (data);

    -- TEST

    SET STATISTICS TIME, IO ON;

    DECLARE @HexBucket VARCHAR(MAX);

    SELECT @HexBucket = Hex.converted

    FROM #Binary InputSet

    CROSS

    APPLY dbo.ToHex(InputSet.value) Hex

    -- SQL SERVER 2008 ONLY

    IF CONVERT(SYSNAME, SERVERPROPERTY('ProductVersion')) LIKE N'10.%'

    BEGIN

    SELECT @HexBucket = CONVERT(VARCHAR(MAX), InputSet.value, 1)

    FROM #Binary InputSet;

    END;

    SET STATISTICS TIME, IO OFF;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    My tests:

    Jeff's 2005 function = 16 seconds

    2005 XML function = 05 seconds

    SQL2K8 native = 0.2 seconds

    Paul

  • Paul White (3/6/2010)


    Hey Jeff,

    There's a faster method than that in 2005 (and I don't mean CLR):

    USE tempdb;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    CREATE FUNCTION dbo.ToHex

    (

    @ToConvert VARBINARY(MAX)

    )

    RETURNS TABLE

    WITH SCHEMABINDING

    AS RETURN

    SELECT converted =

    '0x' +

    CONVERT(XML, N'').value('xs:hexBinary(sql:variable("@ToConvert"))', 'VARCHAR(MAX)');

    GO

    CREATE TABLE #Binary

    (

    row_id INTEGER IDENTITY(1,1) NOT NULL,

    value VARBINARY(MAX) NOT NULL

    );

    -- Test data (50,000 GUIDs)

    INSERT #Binary

    (value)

    SELECT CONVERT(VARBINARY(MAX), Random.data)

    FROM (

    SELECT TOP (100000)

    NEWID()

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3

    ) Random (data);

    -- TEST

    SET STATISTICS TIME, IO ON;

    DECLARE @HexBucket VARCHAR(MAX);

    SELECT @HexBucket = Hex.converted

    FROM #Binary InputSet

    CROSS

    APPLY dbo.ToHex(InputSet.value) Hex

    -- SQL SERVER 2008 ONLY

    IF CONVERT(SYSNAME, SERVERPROPERTY('ProductVersion')) LIKE N'10.%'

    BEGIN

    SELECT @HexBucket = CONVERT(VARCHAR(MAX), InputSet.value, 1)

    FROM #Binary InputSet;

    END;

    SET STATISTICS TIME, IO OFF;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    My tests:

    Jeff's 2005 function = 16 seconds

    2005 XML function = 05 seconds

    SQL2K8 native = 0.2 seconds

    Thanks, Paul. I guess I need to spend a bit more time with the XML functionality of SQL Server.

    Paul

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul White (3/6/2010)


    Hey Jeff,

    There's a faster method than that in 2005 (and I don't mean CLR):

    USE tempdb;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    CREATE FUNCTION dbo.ToHex

    (

    @ToConvert VARBINARY(MAX)

    )

    RETURNS TABLE

    WITH SCHEMABINDING

    AS RETURN

    SELECT converted =

    '0x' +

    CONVERT(XML, N'').value('xs:hexBinary(sql:variable("@ToConvert"))', 'VARCHAR(MAX)');

    GO

    CREATE TABLE #Binary

    (

    row_id INTEGER IDENTITY(1,1) NOT NULL,

    value VARBINARY(MAX) NOT NULL

    );

    -- Test data (50,000 GUIDs)

    INSERT #Binary

    (value)

    SELECT CONVERT(VARBINARY(MAX), Random.data)

    FROM (

    SELECT TOP (100000)

    NEWID()

    FROM master.sys.columns C1,

    master.sys.columns C2,

    master.sys.columns C3

    ) Random (data);

    -- TEST

    SET STATISTICS TIME, IO ON;

    DECLARE @HexBucket VARCHAR(MAX);

    SELECT @HexBucket = Hex.converted

    FROM #Binary InputSet

    CROSS

    APPLY dbo.ToHex(InputSet.value) Hex

    -- SQL SERVER 2008 ONLY

    IF CONVERT(SYSNAME, SERVERPROPERTY('ProductVersion')) LIKE N'10.%'

    BEGIN

    SELECT @HexBucket = CONVERT(VARCHAR(MAX), InputSet.value, 1)

    FROM #Binary InputSet;

    END;

    SET STATISTICS TIME, IO OFF;

    GO

    IF OBJECT_ID(N'tempdb..#Binary', N'U')

    IS NOT NULL

    DROP TABLE #Binary;

    GO

    IF OBJECT_ID(N'dbo.ToHex', N'IF')

    IS NOT NULL

    DROP FUNCTION dbo.ToHex;

    GO

    My tests:

    Jeff's 2005 function = 16 seconds

    2005 XML function = 05 seconds

    SQL2K8 native = 0.2 seconds

    Paul

    Thanks a lot, this works great. Just one more question though. does this method guarantee uniqueness? Lets say if I convert 10 million different values, can some Hex values be the same for different input values?

    Sorry, forgot to mention, I made a small change. The way I use it is:

    INSERT #Binary (value)

    SELECT HashBytes('SHA1', col1 + col2 + col3 + ... + colN)

    FROM mytable

    SELECT Hex.converted

    FROM #Binary InputSet

    CROSS

    APPLY dbo.ToHex(InputSet.value) Hex

  • Roust_m (3/9/2010)


    Just one more question though. Does this method guarantee uniqueness? Let's say I convert 10 million different values, can some Hex values be the same for different input values?

    The output will be as unique as the input - hex is just another way of representing the same 'number' after all.

    Using HashBytes with the SHA-1 algorithm makes it monumentally unlikely to produce duplicates. Not impossible, though 😉

    The chance of producing two inputs with the same output is 1 in 1.4615 quindecillion.

    That number is 1.4615E+048.

    Or, written out the long way:

    1,461,500,000,000,000,000,000,000,000,000,000,000,000,000,000,000

    Paul

Viewing 5 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply