Sending a Variable Number of Elements in a Parameter

  • David McKinney (2/6/2014)


    Also, sorry but I find the allusion to plagiarism a little facile. This code is pretty much public domain by now, ...{snip}..., I really don't feel that this is in the same ballpark as proper examples of the P word.

I agree. In fact, since it uses a method originally conceived by Itzik Ben-Gan for creating the numbers table, we'd all be in danger of plagiarism if it were a proper example of the P word (and it certainly isn't). A lot of people also forget that there is the opportunity for independent discovery of something already discovered. It happens with all sorts of code.

    As someone that's followed SSC for a number of years, I think there's a danger of the articles becoming a private members club, where the uninitiated are put off from submitting their ideas. I know I haven't written for quite a while, in part for these reasons.

    That's actually a huge loss for the rest of us. You used to be all over these forums a long time ago with some pretty good ideas. I'd love to see you write again.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Alex Grinberg-230981 (2/6/2014)


I have tested CLR to split a delimited string; it was much slower and took more resources. That's why I didn't include a CLR option in my posting.

    It would be interesting to see the source code for whichever CLR you are using for the split. The one that Paul White wrote for me is lightning quick. I'd love to see a test between the SQLCLR splitter he wrote and the performance of the UDTT method you very nicely explained in your article.

    Speaking of the article, it would have been nice to see a simple performance test but you did a great job in simplifying the explanation of how to use a UDTT. It was written well enough to keep old salts interested and simple enough to help newbies. That's not such an easy thing to pull off so I definitely have to say "Well Done!"

It doesn't matter for this article but, since you brought it up in the article, you actually can pass "tables" of information as delimited information a couple of different ways. The methods to do so aren't so hot for efficiency on the SQL Server side, though.
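A minimal sketch of one such way (my own illustration, not code from the article): delimit the "rows" with one character and the "columns" with another, then shred in two passes on the server, here using the thread's dbo.DelimitedSplit8K and assuming each row piece contains exactly one comma.

DECLARE @Tbl VARCHAR(8000) = 'Condiments,1;Dairy Products,2;Grains/Cereals,3';

-- pass 1 splits the "rows" on ';'; pass 2 carves each row into two "columns"
SELECT CategoryName = LEFT(r.Item, CHARINDEX(',', r.Item) - 1),
       SortOrder    = SUBSTRING(r.Item, CHARINDEX(',', r.Item) + 1, 8000)
FROM dbo.DelimitedSplit8K(@Tbl, ';') AS r;

Every extra delimiter level adds another round of string carving, which is where the server-side efficiency goes.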

I can't speak for anyone else but I wanted to assure you that when I read the article, the "p" word never entered my thoughts. Splitters of similar nature are quite common on this and many other forums. It's been written about many times. Be advised, though, that the one you used in the article concatenates the delimiters (which is why the CSV/XML splitter that someone else posted shares the same fate), starts getting slow, and is beaten to death even by a While Loop as you approach 8K characters.

    I also think that people made the mistake of judging the article by the formatting that it came out in. They made formatting easier on the submittal site by adding a "Paste from Word" option but it's not perfect. It's still a bit fickle when pasting images or trying to use the "code" formatting boxes and catches even me by surprise. I'll send an email to Steve and see if he can get someone to make some repairs for you. It's a good article with some good code and it shouldn't be judged by formatting problems that even old salts have with submittals.

You're a decent writer with well-timed graphics and I encourage you to continue writing. Things said in the discussions can get a little out of hand. You should see the hell I caught on my original and rewrite of the running totals articles. 😀 HOOOOOIIEEEE!!! They gave me the skin of a rhino! :hehe: Despite the (extremely, in some cases) harsh reception I got from many, it actually did help me a lot because I studied the "WHY" of the harshness and I think I'm a better writer for it now.

    Keep up the good work. I look forward to your next article.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Phil Factor (2/6/2014)


    Alex,

I've gone back to have a closer look at the article. It could be that I've misunderstood something, but I really think the first example could be improved. I reckon the string-splitting routine was fine before XML was stuffed into SQL Server in 2005, but an XML-based string-splitting solution is really twice the speed of the string split and far more economical, because it doesn't need a supporting function and a numbers table.

    Here is my solution.

    [font="Courier New"]CREATE PROC ListProductNames

         @CategoryList NVARCHAR(500),

         @Separator VARCHAR(10) =','

    /*

    List all products that are of a list of categories.

    Execute ListProductNames @CategoryList='Condiments,Grains/Cereals,Dairy Products'

    */

    AS

    DECLARE @XMLList XML

    SELECT @XMLlist='<list><y>'+REPLACE(@CategoryList,@Separator,'</y><y>')+'</y></list>';

    SELECT dbo.Categories.CategoryName, dbo.Products.ProductName, dbo.Products.QuantityPerUnit, dbo.Products.UnitPrice

    FROM dbo.Categories

         JOIN dbo.Products ON dbo.Categories.CategoryID = dbo.Products.CategoryID

         JOIN (   SELECT x.y.value('.','varchar(80)') AS IDs

                  FROM @XMLList.nodes('/list/y/text()') AS x ( y )

               )  AS Chosen_Categories(item)

             ON Categories.CategoryName = Chosen_Categories.Item;

    GO       [/font]

You can see the timings of the string-splitting techniques in Using XML to pass lists as parameters in SQL Server. I've just noticed that some of the SQL got mangled when the blog post was converted to WordPress, but it is easy to unmangle.

    I really think there isn't any further need for the old number-table version of the string-split.

Considering the pipe-clogging tag bloat even of attribute-based XML compared to a nice little TSV, I think there is. 😉
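To put a rough number on the bloat (a quick illustration using the tag names from the proc above, not from either article):

DECLARE @csv VARCHAR(200) = 'Condiments,Grains/Cereals,Dairy Products';
DECLARE @xml VARCHAR(400) = '<list><y>' + REPLACE(@csv, ',', '</y><y>') + '</y></list>';

SELECT csv_bytes = DATALENGTH(@csv),   -- 40 bytes
       xml_bytes = DATALENGTH(@xml);   -- 72 bytes for the same three values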

Gosh... I have to test that. I would think the limiting factor would be the expensive REPLACE, much like the concatenation of a delimiter is for Tally-based splitters. For passing real XML instead of a CSV list and converting it, I agree that the XML method is going to be faster (as advertised in your good article).

As to your article itself, the methods you came up with, and that other method that appeared in the discussion that followed, I again have to say "Absolutely Brilliant", ol' friend. What a great piece of research, culminating in a huge change in how things can be done. I still haven't done any personal testing there but your performance chart pretty much says it all. Now, all I have to do is get you to stop using While Loops to build test data. 😛

    What would be interesting now is to have a couple of drag races between your very much improved XML method, the UDTT method identified in this article, and Paul White's simple SQLCLR splitter.

    Anyway... very good "seeing" you again. Sorry I missed PASS 2013. Were you able to make it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I just reported the formatting problems for this article. Hopefully, someone at SQLServerCentral will pick up on it and fix it. Again, these problems aren't caused by the author. The article submission site can be a little tough to use.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/6/2014)


    What would be interesting now is to have a couple of drag races between your very much improved XML method, the UDTT method identified in this article, and Paul White's simple SQLCLR splitter.

    I have run some tests using the same strings and a simplified UDTT as shown here.

USE [AdminDB]
GO
CREATE TYPE TblParam AS TABLE(
    [id] int IDENTITY(1,1),
    [item] [varchar](80) NOT NULL
)
GO
CREATE PROCEDURE usp_TableTypeTest @tbl dbo.TblParam READONLY
AS
SELECT id, item
FROM @tbl AS TblType;
GO

I tested the XML attributes version, the 8K splitter, Paul White's CLR SplitterC, and this table type method. The CLR wins hands down. The UDTT comes in second, then the 8K splitter, and then the XML version.

For this string (so yes, it is limited), I see the following average timings over 10 executions.

    First the string

    @Items VARCHAR(1024)='Condiments,Grains/Cereals,Dairy Products,Condiments,Grains/Cereals,Dairy Products,Condiments,Grains/Cereals,Dairy Products,Condiments,Grains/Cereals,Dairy Products,Condiments,Grains/Cereals,Dairy Products,Condiments,Grains/Cereals,Dairy Products'

XML ..... 133.6ms
8K ...... 105.4ms
UDTT ....   1ms
CLR .....   0ms

    This is great across the board but definitely shows an advantage to the CLR splitter.

A thought on the UDTT is that it really isn't splitting anything. It's reading from a table where the data has already been "split" into separate values somewhere else and then inserted for consumption. The UDTT method did have a 14.8ms average to fill that table, while the XML had an average of 0ms to build the XML string.
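For context, a minimal sketch of the kind of fill-and-call step being timed (my reconstruction, not the actual test harness):

DECLARE @Items VARCHAR(1024) = 'Condiments,Grains/Cereals,Dairy Products';
DECLARE @tbl dbo.TblParam;

-- the "split" happens out here; this is the ~14.8ms fill step
INSERT INTO @tbl (item)
SELECT Item FROM dbo.DelimitedSplit8K(@Items, ',');

-- the proc then just reads rows that arrive pre-split
EXEC dbo.usp_TableTypeTest @tbl;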

    In any case, the CLR method is a clear cut winner in my testing, which continues to support the findings of Jeff, Paul, and Phil.

I have used the UDTT in the past when the application could take advantage of it. It is a pretty straightforward implementation.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

It seems that the harsh critic has been universally condemned for his harsh remarks. You know what they say about movie critics: if you can't act, become a critic! :hehe:

    I'll apologize yet again for my harsh criticisms, especially for my unfortunate use of the P-word. I did not mean that usage to imply that I considered this article to fall in that category.

    Just so you know, I've received some criticism on the articles I've written too. I think that comes with the territory and I do my best to take it in stride. I can only hope that Alex will be able to do so as well. I try to think of things like that as a learning experience that will help me do better the next time around. Or maybe I'm just really thick-skinned. 🙂

    SSC is a really big pond and it includes some pretty big fish (and I don't count myself among them). I agree that newbies need to be welcomed with open arms. For the most part, I think they are. Fresh viewpoints are always useful in one way or another.

    I'll try to do better on my tone if I ever have negative comments to convey in the future. Lesson learned.

    Just like science, building on SQL concepts is often the result of the efforts of many individual contributors. It is a good thing to acknowledge that when you can. Very little can be done, given the level of maturity in the SQLverse, that is 100% new and fresh. So do your homework and be kind to those that you build upon, and they'll be kind to you in return (or at least I hope they will).


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.

  • dwain.c (2/6/2014)


    It seems that the harsh critic has been universally condemned for his harsh remarks. You know what they say about movie critics, if you can't act become a critic! :hehe:

    I'll apologize yet again for my harsh criticisms, especially for my unfortunate use of the P-word. I did not mean that usage to imply that I considered this article to fall in that category.

Let he who is without sin cast the first stone. I should have qualified my remarks with the fact that I've already found myself committing the exact same crime, on more than one occasion. We can't be perfect every day 😉

    I'm glad my initial post found an echo - that I wasn't a lone voice - but also pleased that it was well taken on board by those concerned.

  • Jeff Moden (2/6/2014)


    As someone that's followed SSC for a number of years, I think there's a danger of the articles becoming a private members club, where the uninitiated are put off from submitting their ideas. I know I haven't written for quite a while, in part for these reasons.

    That's actually a huge loss for the rest of us. You used to be all over these forums a long time ago with some pretty good ideas. I'd love to see you write again.

    Thanks Jeff,

That's a massive compliment. I'd encourage anyone to write for SSC. On a personal level, it's very rewarding. It gave me some of the positive recognition that at the time I felt I wasn't getting from my employer. It also gave me something tangible I could point to at an interview, to show a prospective employer where I was coming from. Finally, it opened the door to a technical reviewer role for one of the Simple Talk publications - which was a hugely enriching experience.

These days I'm much less directly involved with SQL in my job (having become a business analyst!!). But I still like to keep up to date. And above all, I hugely admire the endless patience of people such as yourself on the forums.

I certainly don't rule out picking up the pen again when the right idea comes along. The only thing better than having a great idea is sharing it with the world!

In my blog post 'Using XML to pass lists as parameters in SQL Server', which compared string-splitting methods, I didn't include the actual REPLACE operation because I was assuming the lists would already be in XML format, since you'd presumably have taken the decision to store such structures as lists, 'documents', and arrays in XML (I advise it, as it allows easier datatype checking and is generally safer; XML lists are also generally easier to generate automatically from SQL code). I actually converted the data into delimited lists for the number-table-based version, but didn't include the conversion time. That seems unfair to the delimited-list splitter as compared with the XML-based methods, but then it's unfair whichever way around you do it: they're doing slightly different jobs, in that one is splitting a delimited list and the other is splitting an XML list. I certainly didn't notice a performance problem with using REPLACE, but I'll recheck when I get a moment.

The timings I was getting on the three really good methods were so low, even for huge lists, that using a CLR would seem a lot of extra effort and added complexity. Usually, for performance tuning, one must chase the aspects of performance that are causing the most problems, and I'd suspect that splitting lists isn't going to be high up anyone's list. This is why I'm saying that simple solutions are likely to be best.

For a list of 3000 elements, I was getting:

Using Split function           46ms
Bad Element-based XML        7703ms
Using Values script           160ms
XML Attributes                 26ms
Element-based XML (text())     20ms
Cross apply XML                30ms
Attribute-based XML (@)        20ms

    Best wishes,
    Phil Factor

  • dwain.c (2/6/2014)


    It seems that the harsh critic has been universally condemned for his harsh remarks. You know what they say about movie critics, if you can't act become a critic! :hehe:

    I'll apologize yet again for my harsh criticisms, especially for my unfortunate use of the P-word. I did not mean that usage to imply that I considered this article to fall in that category.

I am right there with you, Dwain.

    It is impossible to "unsay" something once it has been said. My comments are ones I wish I had worded differently. I sincerely hope that Alex will continue writing and contributing to SSC.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

I have a number of .Net applications utilizing UDTTs to pass variable numbers of "parameters" to stored procedures. I haven't run into any problems, other than someone forgetting to script the type before the proc when creating a new DB instance.

    It's a very clean way to send multiple name value pairs, or even more complex data in a single parameter.

    And of course you have the benefit of an actual type that can be shared across procs.

    String manipulation inside stored procedures should be avoided like cursors, tables without primary keys, etc... It has a way of being propagated over time and becoming the death of a thousand cuts. Best to nip it in the bud.
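A minimal sketch of that shared-type pattern (the names here are illustrative, not from any posted code):

-- one type, shared by any number of procedures
CREATE TYPE dbo.NameValuePairList AS TABLE
(
    [name]  VARCHAR(128) NOT NULL PRIMARY KEY,
    [value] VARCHAR(256) NOT NULL
);
GO
CREATE PROCEDURE dbo.usp_SaveSettings @pairs dbo.NameValuePairList READONLY
AS
    SELECT [name], [value] FROM @pairs;   -- a real proc would MERGE these
GO
CREATE PROCEDURE dbo.usp_ValidateSettings @pairs dbo.NameValuePairList READONLY
AS
    SELECT [name] FROM @pairs WHERE [value] = '';
GO

Change the type and every proc that consumes it sees the new shape, which is exactly the single point of definition you lose with ad hoc delimited strings.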

From my experience, and as any experienced DBA/programmer would know, you should test everything at a statistically relevant level, i.e., average over multiple tests and eliminate unexplained outliers (if you can find a consistent explanation for a better-performing outlier, restructure your code around that technique!).

    I can't speak for anyone else but in all the shops I've worked performance is king, period. Sure, we talk about coding standards and maintainability and implement them time permitting, but they are secondary in the SDLC to performance. When I'm asked to optimize a query the first things I look at are (in no particular order):

• SQL Server version-specific use of T-SQL and SQL statements (stuff comes [new functionality like MERGE, INTERSECT, EXCEPT, APPLY, etc.] and goes [deprecations] with each major SQL release)
• Use of UDFs in row set operations (usually a major boat anchor to performance; however, there are some exceptions in the case of CLR UDFs)
• Indexes (appropriate primary keys, compound indexes, index inclusions, index options, possible clustering for priority scan-type queries, partitions, etc.)
• Isolation levels (if they are having excessive blocking or any deadlocks)

    The performance killers I see the most are (once again, in no particular order):

    • UDFs in row set operations
    • Inappropriate or missing indexes

As an example, here's a hypothetical comparison of an actual UDF vs. intrinsic functions (all examples are run on SQL 2005 (x64)):

Declare @s VarChar(max), @Top Int;
Set @Top = 1000000;
Set Statistics Time Off;

Set Statistics Time On;
Select top (@Top)
    @s =
        Common.dbo.fnPadLeft(sc1.name, 30, ' ') +
        Common.dbo.fnPadLeft(sc2.name, 30, ' ') +
        Common.dbo.fnPadLeft(sc3.name, 30, ' ')
from sys.columns sc1
cross join sys.columns sc2
cross join sys.columns sc3;
Set Statistics Time Off;

Set Statistics Time On;
Select top (@Top)
    @s =
        Right(Space(30) + sc1.name, 30) +   -- left-pad to width 30 with spaces
        Right(Space(30) + sc2.name, 30) +
        Right(Space(30) + sc3.name, 30)
from sys.columns sc1
cross join sys.columns sc2
cross join sys.columns sc3;
Set Statistics Time Off;

    Which, on my development server, returns:

    SQL Server Execution Times:

    CPU time = 8970 ms, elapsed time = 8974 ms.

    SQL Server Execution Times:

    CPU time = 2761 ms, elapsed time = 2758 ms.

In the past I've used stacked sub-queries to transform more complex deterministic UDF code into intrinsic function calls, by percolating all result columns up through the sub-query stack and performing different intrinsic functions on them as they pass by. You may think that this complex-looking query would not perform very well; you'd be wrong. Microsoft does a very good job with their query optimizer (did I just say that out loud? :w00t:). And with the advent of APPLY this technique has become even more useful, as it eliminates the percolation of all result columns (you only need to percolate the function-specific columns). Here's an example of this technique:

Set NoCount On;
Declare @s VarChar(max), @Top Int;
Set @Top = 100000;
Set Statistics Time Off;

Declare @Tbl table
(
    RId Int Identity Primary Key,
    Not1NF VarChar(256)
);

Insert @Tbl
Select top (@Top)
    sc3.name + '|' + sc2.Name + '|' + sc1.Name
from sys.columns sc1
cross join sys.columns sc2
cross join sys.columns sc3
where sc1.name < sc2.name and sc1.name < sc3.name and sc2.name < sc3.name;

Set Statistics Time On;
Select
    @s = Common.dbo.fnDelimit(Not1NF, '|', 1),
    @s = Common.dbo.fnDelimit(Not1NF, '|', 2),
    @s = Common.dbo.fnDelimit(Not1NF, '|', 3)
from @Tbl;
Set Statistics Time Off;

Set Statistics Time On;
-- sub-query stack: at each level, s and e track the start and end
-- positions of the current token within Not1NF
Select
    --RId,
    --Not1NF,
    @s = t1,
    @s = t2,
    @s = t3
from @Tbl tv
cross apply
(
    Select
        t1,
        t2,
        SubString(tv.Not1NF, s, Case when e - s < 0 then 0 else e - s End)[t3],
        e + 1 [s],
        Case CharIndex('|', tv.Not1NF, e + 1)
            when 0 then Len(tv.Not1NF) + 1
            else CharIndex('|', tv.Not1NF, e + 1)
        End[e]
    from
    (
        Select
            t1,
            SubString(tv.Not1NF, s, Case when e - s < 0 then 0 else e - s End)[t2],
            e + 1 [s],
            Case CharIndex('|', tv.Not1NF, e + 1)
                when 0 then Len(tv.Not1NF) + 1
                else CharIndex('|', tv.Not1NF, e + 1)
            End[e]
        from
        (
            Select
                SubString(tv.Not1NF, s, Case when e - s < 0 then 0 else e - s End)[t1],
                e + 1 [s],
                Case CharIndex('|', tv.Not1NF, e + 1)
                    when 0 then Len(tv.Not1NF) + 1
                    else CharIndex('|', tv.Not1NF, e + 1)
                End[e]
            from
            (
                Select
                    1 [s],
                    Case CharIndex('|', tv.Not1NF, 1)
                        when 0 then Len(tv.Not1NF) + 1
                        else CharIndex('|', tv.Not1NF, 1)
                    End[e]
            ) t1
        ) t2
    ) t3
) ca;
Set Statistics Time Off;

    Which, on my development server, returns:

    SQL Server Execution Times:

    CPU time = 32464 ms, elapsed time = 33034 ms.

    SQL Server Execution Times:

    CPU time = 5647 ms, elapsed time = 5646 ms.

Observations:

• Don't bust my chops about 1NF-violating data columns. They're out there more than you'd think.
• The UDF method has to parse through subsequent tokens each time it is called (it uses a recursive CTE to find beginning and end indexes, returning the specified token with a SubString() function); plus it's a UDF - Unrealistically Delayed Function...
• The sub-query stack only passes through each token once.
• Look at the actual execution plan if you lack faith in the query optimizer.
• I don't have the time now, but it would be interesting if someone could post other technique comparisons to this one.

I have a request: does anyone know of a way that a CTE could be used in a CROSS/OUTER APPLY so that non-deterministic "things" could be done? I'm not asking in a vacuum; I've spent a bit of time attempting this, and the problem is that the T-SQL compiler doesn't do late binding (unless you consider dynamic SQL late binding), so it pitches an error about not being able to bind the column passed into the APPLY sub-query containing the CTE call. That is, it can't bind the column tv.Not1NF hard-coded in the CTE definition, because it doesn't yet exist at definition time (early binding)! You can also call a CTE passing in a defined @ variable, but I couldn't get that idea to work either (how do you get column tv.Not1NF into a predefined variable so that the CTE referencing it sees a different value for each row it is called on?).
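The closest workaround I can sketch (illustrative only; dbo.fnTokens is a made-up name) is wrapping the recursive CTE in an inline table-valued function, since CROSS APPLY will bind the outer column as a function parameter at execution time:

CREATE FUNCTION dbo.fnTokens (@List VARCHAR(256), @Delim CHAR(1))
RETURNS TABLE
AS RETURN
WITH Bounds AS
(   -- recursive CTE walking the delimiter positions: s = token start, e = token end
    SELECT s = 1,
           e = ISNULL(NULLIF(CHARINDEX(@Delim, @List, 1), 0), LEN(@List) + 1)
    UNION ALL
    SELECT s = e + 1,
           e = ISNULL(NULLIF(CHARINDEX(@Delim, @List, e + 1), 0), LEN(@List) + 1)
    FROM Bounds
    WHERE e <= LEN(@List)
)
SELECT token = SUBSTRING(@List, s, e - s) FROM Bounds;
GO
-- each row's Not1NF arrives as the parameter, so the CTE sees a per-row value;
-- long lists may need OPTION (MAXRECURSION 0) on the outer SELECT
SELECT tv.RId, t.token
FROM @Tbl tv
CROSS APPLY dbo.fnTokens(tv.Not1NF, '|') AS t;

Being inline, the function should expand into the plan rather than execute per call like a scalar UDF, though it does reintroduce the recursive token walk.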

    Any ideas would be helpful (sorry MarbryHardin, I'm not trying to co-opt your post!).



    PeteK
    I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.

I've just run a re-test on my system of the two rival splits, dbo.DelimitedSplit8K and 'Element-based XML (text())', but this time including the REPLACE-based conversion of the list to XML in the timing. It turns out that the conversion hardly affects the timings at all. If the list contains strings with one of the few 'banned' characters, such as < or >, then the conversion would take longer, but even then it is dwarfed by the time taken to shred the list into rows.

    [font="Courier New"]

    --in order to record the timings, we prepare a log.

    DECLARE @log TABLE (

                Log_Id INT IDENTITY(1, 1),TheEvent VARCHAR( 2000 ),

                [Values] INT, CPU FLOAT DEFAULT  @@CPU_BUSY,

                DateAndTime DATETIME DEFAULT GETDATE()) ;

    --define working variables

    DECLARE @List VARCHAR(MAX),@XMLList XML,@AttributeBasedXMLList XML,

            @ii INT, @Values INT;

    --and table variables to receive the results/relations

    DECLARE @ByteBucket TABLE (TheNumeral INT)

    DECLARE @ByteBucket1 TABLE (TheNumeral INT)

    SET NOCOUNT ON

    SELECT @Values=1--start with one item in the list

    WHILE @Values<=2800

      BEGIN

    --build up the list with random integers

      SELECT @List='1',@ii=@Values

      WHILE @ii>1

        BEGIN

        SELECT @List=@List+','+CONVERT(VARCHAR(3),CONVERT(INT,RAND()*100))

        SELECT @ii=@ii-1--  .. to the required length

        END

    --try doing the delimited list function

      INSERT INTO @log (TheEvent, [Values]) SELECT 'Using Split function',@Values

      INSERT INTO @ByteBucket (TheNumeral)

      SELECT item FROM dbo.DelimitedSplit8K (@list,',')

    --use the Element-based XML (text())

      INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML (text())',@Values

      SELECT @XMLList='<list><i>'+REPLACE(@List,',','</i><i>')+'</i></list>'

      INSERT INTO @ByteBucket1 (TheNumeral)

      SELECT x.y.value('.','int') AS IDs

             FROM @XMLList.nodes('/list/i/text()') AS x ( y )

      INSERT INTO @log (TheEvent, [Values]) SELECT 'Element-based XML (text())',@Values

      INSERT INTO @log (TheEvent, [Values]) SELECT 'finished',@Values

      SELECT @Values=@Values+100

    END      

    --Yes, we need to check that they all agree!

    SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket

    UNION ALL

    SELECT COUNT(*), SUM(TheNumeral) FROM @ByteBucket1

    SELECT

      TheStart.[Values],

      MAX(CASE WHEN TheStart.TheeVent ='Using Split function'

               THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

               ELSE 0

          END) AS [Using Split function],

      MAX(CASE WHEN TheStart.TheeVent ='Element-based XML (text())'

               THEN DATEDIFF( ms, TheStart.DateAndTime, Theend.DateAndTime )

               ELSE 0

          END) AS [Element-based XML (text())]

          FROM   @log TheStart

      INNER JOIN @log Theend

        ON Theend.Log_Id = TheStart.Log_Id + 1

    WHERE TheStart.TheEvent<>'finished'

    GROUP BY TheStart.[Values];

    [/font]

I'd have liked to test it on much longer lists, but my copy of dbo.DelimitedSplit8K wouldn't tackle anything longer.

The results (in ms) on my system were...

No. Values  Using Split function  Element-based XML (text())
----------  --------------------  --------------------------
1           0                     0
101         0                     0
201         3                     3
301         0                     3
401         0                     6
501         3                     3
601         3                     6
701         10                    3
801         6                     6
901         10                    6
1001        10                    6
1101        10                    10
1201        13                    10
1301        13                    10
1401        16                    10
1501        20                    13
1601        23                    13
1701        16                    13
1801        23                    16
1901        26                    16
2001        26                    16
2101        30                    16
2201        33                    20
2301        33                    20
2401        33                    23
2501        36                    23
2601        40                    23
2701        43                    26

    Best wishes,
    Phil Factor

I'd honestly take XML and XQuery over the other two methods any day. TVPs still seem to be in the infant stage of their development, meaning they have to be predefined as a data type and are read-only. Using a delimited splitter has always been a big headache for the data sets we deal with at work, because some of the fields are free-form and we've had times where a package blew up because someone actually entered a pipe! Yes, a pipe in a text field, which was also our delimiter. XML avoids all these complications from the get-go.
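A quick illustration of that last point (my example, not from the thread): in an XML list the pipe rides along inside the element text, with no escaping or delimiter collision to worry about:

DECLARE @x XML = N'<list><i>free|form text, pipe and all</i><i>another value</i></list>';

SELECT item = x.i.value('.', 'NVARCHAR(100)')
FROM @x.nodes('/list/i') AS x(i);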

    Just my two cents. I'm curious if any of you have constructive criticism.

    🙂

One of my first attempts at splitting delimited data in a column used the XML shred method. Even on relatively small tables (by row count) the queries took forever due to the poor performance of Microsoft's XML implementation and the "shred" operation. A CTE-based UDF performed way better; however, the CROSS APPLY sub-query stack method I ended up with (see my previous comment) beat them all, hands down. You may not care for the apparent code complexity, but my employers sure appreciated the significant performance gains.



    PeteK
    I have CDO. It's like OCD but all the letters are in alphabetical order... as they should be.
