I need some help with a test, please.

  • gah (9/20/2010)


    Jeff Moden (9/20/2010)

    Thank you one an all for your help. 🙂

    Jeff...you are more than welcome, it was the least I could do to repay you for the amount of help you have provided to so many others like myself in the past.

    Kind regards Graham

    I'm humbled, Graham. Thank you for your kind words.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/20/2010)


    P.S. I really want to know when you took up mind-reading, Ron. I was about to post and ask for a full run like that. Thanks.

    Jeff thank yourself - unknown to you I have learned a vast amount of good, solid things to do in SQL Server - having come from a background of an interface (GUI) programmer where everything is done RBAR - reading your articles / posts was the dawn of a new day for me.

    And thanks to all the others who have contributed to my learning.

    That and learning to test and understand what was posted by others, when I see something new I have to do just that. Reading Brad Schulz post to this forum got me to wanting to learn and test, and once tested thought why trash it, Jeff might be able to use. The old saying "curiosity killed the cat, satisfaction bought it back"

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • bitbucket-25253 (9/20/2010)


    Jeff Moden (9/20/2010)


    P.S. I really want to know when you took up mind-reading, Ron. I was about to post and ask for a full run like that. Thanks.

    Jeff thank yourself - unknown to you I have learned a vast amount of good, solid things to do in SQL Server - having come from a background of an interface (GUI) programmer where everything is done RBAR - reading your articles / posts was the dawn of a new day for me.

    And thanks to all the others who have contributed to my learning.

    That and learning to test and understand what was posted by others, when I see something new I have to do just that. Reading Brad Schulz post to this forum got me to wanting to learn and test, and once tested thought why trash it, Jeff might be able to use. The old saying "curiosity killed the cat, satisfaction bought it back"

    That's mighty kind of you to say, Ron. Looking forward to seeing you this February. Thanks again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • rob.symonds (9/20/2010)


    Anyone else a little concerned about the 2k8 results for the inline Tally?

    Sorry, I would obviously try this myself but I don't have a 2k8 instance to hand, could someone try a multi-line tally version on 2k8?

    Below is Jeff's original just with the ML Tally inserted

    Don't worry Jeff, I didn't forget to attach the results of the query in the original post.

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    GO

    --===================================================================

    -- Create a Tally table from 1 to 11000

    --===================================================================

    --===== If the Tally table already exists here, drop it to make reruns easier.

    IF OBJECT_ID('tempdb.dbo.Tally','U') IS NOT NULL

    DROP TABLE dbo.Tally;

    --===== Create and populate the Tally table on the fly.

    -- This ISNULL function makes the column NOT NULL

    -- so we can put a Primary Key on it

    SELECT TOP 11000

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    --===== Add a CLUSTERED Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    ;

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.Tally TO PUBLIC

    ;

    GO

    --=====================================================================================================================

    -- Create the various functions to test

    --=====================================================================================================================

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    --===== Tally Table (Split8KTally iTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KTally') IS NOT NULL

    DROP FUNCTION dbo.Split8KTally

    GO

    CREATE FUNCTION dbo.Split8KTally

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS TABLE

    WITH SCHEMABINDING AS

    RETURN

    SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,

    SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue

    FROM dbo.Tally

    WHERE N BETWEEN 1 AND LEN(@Parameter)

    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma

    GO

    --===== Tally Table (Split8KTally iTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KTally') IS NOT NULL

    DROP FUNCTION dbo.Split8KTally

    GO

    CREATE FUNCTION dbo.Split8KTally

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS TABLE

    WITH SCHEMABINDING AS

    RETURN

    SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,

    SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue

    FROM dbo.Tally

    WHERE N BETWEEN 1 AND LEN(@Parameter)

    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma

    GO

    --===== Tally Table (Split8KTally mlTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KTally_mlTVF') IS NOT NULL

    DROP FUNCTION dbo.Split8KTally_mlTVF

    GO

    Create FUNCTION [dbo].[Split8KTally_mlTVF]

    (@Parameter varchar(8000), @Delimiter varchar(5))

    RETURNS @RtnValue table (Id int identity(1,1), [Value] varchar(8000))

    AS BEGIN

    SET @Parameter = @Delimiter + @Parameter + @Delimiter

    DECLARE @SplitCharLength int;

    SET @SplitCharLength = len(@Delimiter);

    INSERT INTO @RtnValue (Value)

    SELECT SUBSTRING(@Parameter, (N + @SplitCharLength), CHARINDEX(@Delimiter, @Parameter, N+1) - (N + @SplitCharLength))

    FROM dbo.Tally

    WHERE N < (LEN(@Parameter) - @SplitCharLength)

    AND SUBSTRING(@Parameter, N, @SplitCharLength) = @Delimiter

    RETURN

    END

    GO

    --===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KL1') IS NOT NULL

    DROP FUNCTION dbo.Split8KL1

    GO

    CREATE FUNCTION dbo.Split8KL1

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))

    WITH SCHEMABINDING AS

    BEGIN

    --===== Declare a variable to remember the position of the current comma

    DECLARE @N INT;

    --===== Add start and end commas to the Parameter so we can handle

    -- single elements

    SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,

    --===== Preassign the current comma as the first character

    @N = 1;

    --===== Loop through and find each comma, then insert the string value

    -- found between the current comma and the next comma. @N is

    -- the position of the current comma.

    WHILE @N < LEN(@Parameter) --Don't include the last comma

    BEGIN

    --==== Do the insert using the value between the commas

    INSERT INTO @Result (ItemValue)

    SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);

    --==== Find the next comma

    SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);

    END; --END While

    RETURN;

    END; --END Function

    GO

    --===== XML-Brad (Split8KXMLBrad iTVF) ======================================================================================

    IF OBJECT_ID('dbo.Split8KXMLBrad') IS NOT NULL

    DROP FUNCTION dbo.Split8KXMLBrad

    GO

    CREATE FUNCTION dbo.Split8KXMLBrad

    (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))

    RETURNS TABLE

    WITH SCHEMABINDING AS

    RETURN

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,

    x.i.value('(./text())[1]','varchar(8000)') AS ItemValue

    from (select XMLList=cast('<i>'+replace(@Parameter,@Delimiter,'</i><i>')+'</i>' as xml).query('.')) a

    cross apply XMLList.nodes('i') x(i)

    ;

    GO

    --=====================================================================================================================

    -- Build some test data from the table that is actually using in the tests

    --=====================================================================================================================

    --===== Build a thousand item CSV row without it being dependent on other test tables

    DECLARE @CSVList VARCHAR(8000);

    SELECT @CSVList = STUFF(

    (

    SELECT TOP 1000 ','+ CAST(SalesOrderDetailID AS VARCHAR(10))

    FROM AdventureWorks.Sales.SalesOrderDetail

    ORDER BY NEWID()

    FOR XML PATH('')

    )

    ,1,1,'');

    --===== Show what we have for a parameter now...

    PRINT @CSVList;

    --=====================================================================================================================

    -- Run the tests using Brad's code

    --=====================================================================================================================

    --===== Commence testing ...

    SET STATISTICS TIME,IO ON;

    --And use it in a query:

    PRINT '--===== Split8KTally ==========================================================================================='

    declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KTally(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    PRINT '--===== Split8KTally mlTVF ==========================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KTally_mlTVF(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.Value as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    --declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    PRINT '--===== Split8KXMLBrad ==========================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KXMLBrad(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    --declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    PRINT '--===== Split8KL1 ==========================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KL1(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    SET STATISTICS TIME,IO OFF;

    --=====================================================================================================================

    -- Houskeeping

    --=====================================================================================================================

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    GO

    --===== Drop all the object that we just created

    DROP FUNCTION dbo.Split8KTally, dbo.Split8KL1, dbo.Split8KXMLBrad;

    DROP TABLE dbo.Tally;

    GO

    So I'm sitting in the corner sucking my thumb, twiddling my hair, nervously clutching my favorite blanky while rocking back and forth, occasionally taking a lick from a beer lollipop and muttering a constant stream of "WTF over?". The "reflection" back into the Tally table based function that caused a full Cartesian Join was still fresh in my mind and I asked myself why the "reflection" was occurring in 2008 and not 2005? Then I said "Who cares... let's try something different!" I knew it was worth thinking about because even Brad managed to squeak out a "Whoa!" when it came to the speed of the Tally Table solution in 2005. There had to be a way in 2008.

    Then it dawned on me to try an mlTVF instead of an iTVF so, still clutching my favorite blanky and the beer lollipop, I built one... and it smokes all the code anyone has tried on 2008 yet. Or, at least I thought "yet". That's when I remembered seeing something about a post in this thread and went looking for it. Obviously, I found it (quoted above) and couldn't understand why we didn't actually try it and the answer is that Rob was having problems uploading the results and I flat forgot about it.

    So, here it is... the Tally Table solution that beats the XML and the While Loop in 2008 without any fancy flags or option settings. It's a very old solution normally frowned upon by wiz-kids. Sorry it took so long. I'm not as old as BitBucket but I will blame it all on a bit of CRS. I'm also a wee bit embarassed that I let these kids whip my hiney with XML and While Loops. 😉

    Here are the results from that second bit of code with the 3 original players and the new (old) Tally Table solution...

    Routine

    Split8KTally CPU time = 101573 ms, elapsed time = 57478 ms.

    Split8KXMLBrad CPU time = 732 ms, elapsed time = 706 ms.

    Split8KL1 CPU time = 63 ms, elapsed time = 81 ms.

    Split8KTallyMulti CPU time = 31 ms, elapsed time = 27 ms.

    Here's the test code for anyone who wants to do their own runs.

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    GO

    --===================================================================

    -- Create a Tally table from 1 to 11000

    --===================================================================

    --===== If the Tally table already exists here, drop it to make reruns easier.

    IF OBJECT_ID('tempdb.dbo.Tally','U') IS NOT NULL

    DROP TABLE dbo.Tally;

    --===== Create and populate the Tally table on the fly.

    -- This ISNULL function makes the column NOT NULL

    -- so we can put a Primary Key on it

    SELECT TOP 11000

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM master.sys.all_columns ac1

    CROSS JOIN master.sys.all_columns ac2

    ;

    --===== Add a CLUSTERED Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    ;

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.Tally TO PUBLIC

    ;

    GO

    --=====================================================================================================================

    -- Create the various functions to test

    --=====================================================================================================================

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    --===== Tally Table (Split8KTally iTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KTally') IS NOT NULL

    DROP FUNCTION dbo.Split8KTally

    GO

    CREATE FUNCTION dbo.Split8KTally

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS TABLE

    WITH SCHEMABINDING AS

    RETURN

    SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,

    SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue

    FROM dbo.Tally

    WHERE N BETWEEN 1 AND LEN(@Parameter)+1

    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma

    GO

    --===== XML-Brad (Split8KXMLBrad iTVF) ================================================================================

    IF OBJECT_ID('dbo.Split8KXMLBrad') IS NOT NULL

    DROP FUNCTION dbo.Split8KXMLBrad

    GO

    CREATE FUNCTION dbo.Split8KXMLBrad

    (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))

    RETURNS TABLE

    WITH SCHEMABINDING AS

    RETURN

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,

    x.i.value('(./text())[1]','varchar(8000)') AS ItemValue

    from (select XMLList=cast('<i>'+replace(@Parameter,@Delimiter,'</i><i>')+'</i>' as xml).query('.')) a

    cross apply XMLList.nodes('i') x(i)

    ;

    GO

    --===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KL1') IS NOT NULL

    DROP FUNCTION dbo.Split8KL1

    GO

    CREATE FUNCTION dbo.Split8KL1

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))

    WITH SCHEMABINDING AS

    BEGIN

    --===== Declare a variable to remember the position of the current comma

    DECLARE @N INT;

    --===== Add start and end commas to the Parameter so we can handle

    -- single elements

    SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,

    --===== Preassign the current comma as the first character

    @N = 1;

    --===== Loop through and find each comma, then insert the string value

    -- found between the current comma and the next comma. @N is

    -- the position of the current comma.

    WHILE @N < LEN(@Parameter) --Don't include the last comma

    BEGIN

    --==== Do the insert using the value between the commas

    INSERT INTO @Result (ItemValue)

    SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);

    --==== Find the next comma

    SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);

    END; --END While

    RETURN;

    END; --END Function

    GO

    --===== Tally Table (Split8KTallyMulti iTVF) ==========================================================================

    IF OBJECT_ID('tempdb.dbo.Split8KTallyMulti') IS NOT NULL

    DROP FUNCTION dbo.Split8KTallyMulti

    ;

    GO

    CREATE FUNCTION dbo.Split8KTallyMulti

    (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))

    RETURNS @Split TABLE (ItemNumber INT, ItemValue VARCHAR(8000))

    WITH SCHEMABINDING AS

    BEGIN

    INSERT INTO @Split

    (ItemNumber,ItemValue)

    SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,

    SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue

    FROM dbo.Tally

    WHERE N BETWEEN 1 AND LEN(@Parameter)+1

    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma

    RETURN;

    END;

    GO

    --=====================================================================================================================

    -- Build some test data from the table that is actually using in the tests

    --=====================================================================================================================

    --===== Build a thousand item CSV row without it being dependent on other test tables

    DECLARE @CSVList VARCHAR(8000);

    SELECT @CSVList = STUFF(

    (

    SELECT TOP 1000 ','+ CAST(SalesOrderDetailID AS VARCHAR(10))

    FROM AdventureWorks.Sales.SalesOrderDetail

    ORDER BY NEWID()

    FOR XML PATH('')

    )

    ,1,1,'');

    --===== Show what we have for a parameter now...

    PRINT @CSVList;

    --=====================================================================================================================

    -- Run the tests using Brad's code

    --=====================================================================================================================

    --===== Commence testing ...

    SET STATISTICS TIME,IO ON;

    --And use it in a query:

    PRINT '--===== Split8KTally (iTVF) ==========================================================================================='

    declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KTally(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    --declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    PRINT '--===== Split8KXMLBrad (XML)==========================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KXMLBrad(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    --declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    PRINT '--===== Split8KL1 (Loop mlTVF)========================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KL1(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    --declare @ProductID int, @ProductName varchar(50), @SalesOrderID int, @OrderDate datetime, @TerritoryName varchar(50);

    PRINT '

    --===== Split8KTallyMulti (mlTVF )====================================================================================='

    select @ProductID=d.ProductID

    ,@ProductName=p.Name

    ,@SalesOrderID=h.SalesOrderID

    ,@OrderDate=h.OrderDate

    ,@TerritoryName=t.Name

    from dbo.Split8KTallyMulti(@CSVList,',') a

    join AdventureWorks.Sales.SalesOrderDetail d on cast(a.ItemValue as int)=d.ProductID

    join AdventureWorks.Production.Product p on d.ProductID=p.ProductID

    join AdventureWorks.Sales.SalesOrderHeader h on d.SalesOrderID=h.SalesOrderID

    join AdventureWorks.Sales.SalesTerritory t on h.TerritoryID=t.TerritoryID

    order by d.ProductID,h.SalesOrderID ;

    SET STATISTICS TIME,IO OFF;

    --=====================================================================================================================

    -- Houskeeping

    --=====================================================================================================================

    --===== Do this in a nice safe place that everyone has

    USE tempdb;

    GO

    --===== Drop all the object that we just created

    DROP FUNCTION dbo.Split8KTally, dbo.Split8KL1, dbo.Split8KXMLBrad, dbo.Split8KTallyMulti;

    DROP TABLE dbo.Tally;

    GO

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (9/20/2010)


    Jeff

    Would you like a run on 2008 R2? Will have to be this weekend at earliest, cause I'm not planning to upgrade before then.

    Hi Gail... now that I actually have something decent for you to test, I'd love it if you would run both the code from the original post and the code in the post right above this one whenever you get the chance. Thank you much! 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (9/21/2010)


    Then it dawned on me to try an mlTVF instead of an iTVF so, still clutching my favorite blanky and the beer lollipop, I built one... and it smokes all the code anyone has tried on 2008 yet. Or, at least I thought "yet". That's when I remembered seeing something about a post in this thread and went looking for it. Obviously, I found it (quoted above) and couldn't understand why we didn't actually try it and the answer is that Rob was having problems uploading the results and I flat forgot about it.

    I mentioned this approach on Sunday night as well, though I didn't post any hard numbers:

    Brad Schulz (9/19/2010)[/b]


    I tried also converting the Tally approach and the XML approach to multi-line functions as opposed to inline functions (based on a suggestion by Adam Machanic), and also put a "wrapper" around the CLR function to force it to act like a multi-line function, and all approaches came out pretty much on par, though the CLR approach required some more reads than the other approaches because of the "wrapper".

    This brings all the approaches (Tally, XML, RBAR, CLR) to a fairly even playing field, because it creates the same query plan for each of them, and doesn't require any crazy memory grant requests. Essentially the query plan is pretty much as written... in other words, execute the function to get the set of items you want to lookup, and LOOP JOIN that with the table you want to look it up in. The optimizer assumes 1 row as the output of the function, and thus it does a Nested Loops join. The problem is that this can cause a slowdown if your list of items is reeeeeaaallllyyyy looooonnnnnggggg.

    So it brings it down again to: which function will strip the item list fastest? So CLR comes in first (though it requires more reads), then the Tally approach (though the RBAR approach is actually better for a small list of items... Tally better for loooong list of items), and, sadly, XML comes in last (because it requires more CPU).

    Again, as I mentioned on Sunday, I intended to write a follow-up post about this, but I keep putting it off.

    --Brad

  • I must admit I'm a little confused about the conclusions being drawn overall.

    At the risk of stating the obvious, using a multi-statement TVF is exactly equivalent to splitting the input value into a table variable, and then running a separate join query.

    In any case, the net effect is to produce a better plan by counteracting certain costing model limitations by introducing another - namely that table variables are estimated as producing one row (unless OPTION(RECOMPILE) is used).

    There's nothing wrong with that of course - but it does mean that one has to choose between Tally implementations depending on how large the input set is.

  • This seems a good time to highlight some other issues too:

    The Tally solution does not handle trailing blank records or Unicode characters correctly:

    DECLARE @CSV NVARCHAR(MAX) = N'A,B&C,,C,?,',

    @Delim NVARCHAR(255) = N',';

    SELECT S.ItemNumber,

    S.ItemValue

    FROM dbo.Split8KTally(@CSV, @Delim) S;

    There should be six records, not five; and the fifth record contains a '?'.

    The XML solution needs to be modified (as shown on Brad's blog) to handle entitization correctly:

    DECLARE @CSV NVARCHAR(MAX) = N'A,B&C,,C,?,',

    @Delim NVARCHAR(255) = N',';

    SELECT S.ItemNumber,

    S.ItemValue

    FROM dbo.Split8KXMLBrad(@CSV, @Delim) S;

    Other character values are invalid in XML and would cause the method to fail completely. The XML solution also emit NULLs instead of empty strings.

    The SQLCLR solution works correctly:

    DECLARE @CSV NVARCHAR(MAX) = N'A,B&C,,C,?,',

    @Delim NVARCHAR(255) = N',';

    SELECT S.sequence,

    S.item

    FROM dbo.SplitString_Multi(@CSV, @Delim) S;

  • Brad Schulz (9/20/2010)


    CLR is certainly fastest in and of itself, but it requires a big memory grant (almost a 100MB) and brings about more reads than necessary (because of the optimizer kind of treating it like a "black box" that spits out an estimated 1000 rows and therefore constructs a plan with hash matches because of the estimated 360,000 rows that the final plan expects to produce).

    The SQLCLR solution isn't always fastest, but it does have consistently good performance, and significantly out-performs the other methods on non-trivial tests.

    It's not cardinality estimation that's to blame for the memory grant, it's the lack of statistical information. The test rig happens to produce exactly the estimated 1000 rows, but the costing component has no statistical information to reason about the distribution of values. The resulting guesses at join cardinality are responsible for the excessive memory grant.

  • steve-893342 (9/19/2010)


    Good one:-)

    I shall remember that for next time I need to bcp and split at the same time;-)

    :laugh: Well there is some reasoning behind the example that might interest you.

    The SQLCLR solution does not perform anywhere near as well when the results of the split need to be written to a table.

    When used with an INSERT statement, the reason is that the QO needs to provide Halloween protection, since there is a possibility that the TVF reads from the same table that is being inserted to. Even if the TVF is marked as performing no data access, QO does not take that into consideration; or if it does, it isn't prepared to believe us 🙂

    So, when columns are sourced from a SQLCLR TVF, you'll see an Eager Spool (or another blocking operator) between the read and write cursors. Sad, but true.

    The situation with SELECT INTO is much less clear, because the query plan contains nothing to indicate why throughput is so slow.

    The reason for using bcp was to demonstrate that the throughput of the TVF itself is not the issue. Bulk copying rows in this way is much faster from a SQLCLR TVF than from a normal row source (such as a Tally or XML splitter).

    Happily, the bcp-style interface (SqlBulkCopy) is directly accessible from SQLCLR, so I can write a function/procedure to bulk-copy split rows directly to a target table (not a file!). This is immensely quick, as you might imagine.

  • Paul White NZ (9/22/2010)


    steve-893342 (9/19/2010)


    Good one:-)

    I shall remember that for next time I need to bcp and split at the same time;-)

    :laugh: Well there is some reasoning behind the example that might interest you.

    The SQLCLR solution does not perform anywhere near as well when the results of the split need to be written to a table.

    When used with an INSERT statement, the reason is that the QO needs to provide Halloween protection, since there is a possibility that the TVF reads from the same table that is being inserted to. Even if the TVF is marked as performing no data access, QO does not take that into consideration; or if it does, it isn't prepared to believe us 🙂

    So, when columns are sourced from a SQLCLR TVF, you'll see an Eager Spool (or another blocking operator) between the read and write cursors. Sad, but true.

    The situation with SELECT INTO is much less clear, because the query plan contains nothing to indicate why throughput is so slow.

    The reason for using bcp was to demonstrate that the throughput of the TVF itself is not the issue. Bulk copying rows in this way is much faster from a SQLCLR TVF than from a normal row source (such as a Tally or XML splitter).

    Happily, the bcp-style interface (SqlBulkCopy) is directly accessible from SQLCLR, so I can write a function/procedure to bulk-copy split rows directly to a target table (not a file!). This is immensely quick, as you might imagine.

    Paul

    That's a very timely response indeed and extremely well explained. I was just about to show you results comparing the peformance of INSERT queries for the CLR splitter versus a virtual tally in-line splitter, but it looks like you've got all bases covered. Using SqlBulkCopy to circumvent the conventional INSERT query is a good trick indeed and very well thought out. Thanks very much for this info and your other contributions to this thread which have been most informative:-)

    Cheers

    Steve

  • Paul White NZ (9/22/2010)


    The SQLCLR solution works correctly:

    DECLARE @CSV NVARCHAR(MAX) = N'A,B&C,,C,?,',

    @Delim NVARCHAR(255) = N',';

    SELECT S.sequence,

    S.item

    FROM dbo.SplitString_Multi(@CSV, @Delim) S;

    6 rows returned eh. Looks like my persistence paid off after all:-)

    http://www.sqlservercentral.com/Forums/FindPost979750.aspx

  • steve-893342 (9/22/2010)


    6 rows returned eh. Looks like my persistence paid off after all:-) http://www.sqlservercentral.com/Forums/FindPost979750.aspx

    Yes indeed - and you still owe me a pint 😎

  • Jeff remember this article?

    The "Numbers" or "Tally" Table: What it is and how it replaces a loop.

    By Jeff Moden, 2008/05/07

    Total article views: 41560 Views in the last 30 days: 985

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    Unless my arithmetic is wrong that is 47 reads per day, 1,484 per month,

    a vast audience to be taught.

    Time for a rewrite or an addendum? ? ?

    (And please do not hate me for bringing it up)

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • Paul White NZ (9/22/2010)


    This seems a good time to highlight some other issues too:

    The Tally solution does not handle trailing blank records or Unicode characters correctly:

    Heh... Of course it doesn't handle Unicode correctly... the variables are all VARCHAR. 😉

    I am, however, surprised by the trailing comma problem. I'll take a look at that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 151 through 165 (of 214 total)

You must be logged in to reply to this topic. Login to reply