Split string using XML

  • peter (6/29/2009)


    I have a set of comma delimited strings which have between 1 and 100's of invoice records per string (with each invoice record having the same 10 data fields). Your code turns the data into rows with admirable efficiency. However, I can't work out how to transform the results into columns with one row per invoice record. Can anyone help?

    Yes... please see the following.

    http://www.sqlservercentral.com/articles/T-SQL/63681/

    http://www.sqlservercentral.com/articles/cross+tab/65048/

    Also, I'm not sure to whom you are speaking but, just to advise, XML splitters are anything but effecient.

    I'll also caution that once you've "normalized" a CSV, rotating or pivoting it into columns is usually a bad idea for performance reasons if all the columns are basically the same "thing".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I did a slight modification

    select @xml = cast(( '') as xml)

    SELECT N.value('.', 'nvarchar(max)')

    so if there are tags in the string it handles them

  • Hello all,

    I have placed the XML string split logic into a user defined function and I think running the split within the function results better in performance than keeping the split login inline.

    You can check the result with an other example at Split String using XML - How to Convert or Split a Delimited String Values to Rows using T-SQL XML Commands

    Eralper

    SQL Programming

  • Eralper (7/10/2009)


    Hello all,

    I have placed the XML string split logic into a user defined function and I think running the split within the function results better in performance than keeping the split login inline.

    You can check the result with an other example at Split String using XML - How to Convert or Split a Delimited String Values to Rows using T-SQL XML Commands

    Eralper

    SQL Programming

    Can I ask why you are using this function and passing it along to others?

    Performance is pretty horrible compared to other ways to split a string.

  • Hello Goldie,

    Do you mean using the XML logic or placing it into a udf causes the performance decrease?

    Using the XML inline is a little bit difficult for whom do not use XML features of the SQL2005.

    This format is not easy to implement.

    Placing the logic in a udf makes the implementation easier.

    Actually, I made a simple test to see that the xml function is quicker. But I do not say that this is always works in better performance. May be slower, and since you disagree, it is probably slower perhaps while working with high number of rows.

  • Eralper (7/10/2009)


    Hello Goldie,

    Do you mean using the XML logic or placing it into a udf causes the performance decrease?

    Using the XML inline is a little bit difficult for whom do not use XML features of the SQL2005.

    This format is not easy to implement.

    Placing the logic in a udf makes the implementation easier.

    Actually, I made a simple test to see that the xml function is quicker. But I do not say that this is always works in better performance. May be slower, and since you disagree, it is probably slower perhaps while working with high number of rows.

    I am talking about the XML logic. Not the UDF.

    Using a loop is faster, and using the Tally method is fastest.

    Why use XML if there are much faster ways?

  • Hi,

    You know this discussion is originally started on an article which is describing this method. Because of that I had given an example on splitting by XML.

    Eralper

    http://www.kodyaz.com

  • Eralper (7/10/2009)


    Hi,

    You know this discussion is originally started on an article which is describing this method. Because of that I had given an example on splitting by XML.

    Eralper

    http://www.kodyaz.com

    My objection is to the article your site which states:

    ...splitting and parsing can also be implemented by using the new XML improvements in t-sql...So this is somehow a little bit more advanced than the regular parsing methods we used to code during our t-sql splitting tasks...

    In fact this method is not better, and you do not state that anywhere in your article.

  • Hi Goldie,

    Thanks for your concern that you have read my article.

    I do not want to make the conversation too long about my article here since we should discuss the article by Divya here. I'd like to talk about your objections in detail at kodyaz.com.

    Since I love the new t-sql improvements like XML improvements, CTEs, Window functions, Merge, Hierarchy, etc and I got impressed by the style and the variety in code the t-sql provides I like to work with the new methods in tsql solutions.

    So you are right that it may have performance issues, the developers would take the performance of their solution while implementing it on a system. But my main motivation is just providing an additional way of solution to an existing problem.

    But I'd take your objection into account since you are right at performance. And I'll alter the article by adding a note on that topic.

    Thanks again Goldie,

    Eralper

  • Thanks very much for discussing on the article regarding performance issues. I have just shown a new approach to split strings with the help of XML and a very good use of new T-sql feature 'CROSS APPLY'.

    Good to know that XML should not be preferred when performance come into picture.

    --Divya

  • yes

    --Divya

  • Kindred spirit, Goldie. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Eralper (7/10/2009)


    Actually, I made a simple test to see that the xml function is quicker.

    Actually, let's see THAT test. 😉 I already posted mine.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi all,

    sorry for the late reply...

    I agreed with performance issue of SQL function incase of too large string needs to be split.

    hence, i have slightly modified the SQL function which makes better performance.

    Please try the below code...

    CREATE function Split_fn

    (

    @split_string varchar(8000),

    @deli_char varchar(3)

    )

    returns @list table

    (

    SeqNo int,

    SplitString varchar(8000)

    Primary Key(SeqNo)

    )

    as

    begin

    declare @from_loc int

    declare @to_loc int

    if charindex(@deli_char,@split_string,0) <= 0

    begin

    insert into @list(seqno, SplitString) values (1, @split_string)

    return

    end

    if charindex(@deli_char,@split_string,0) > 0

    begin

    select @from_loc = 0

    select @to_loc = charindex(@deli_char,@split_string,0)

    end

    if charindex(@deli_char,@split_string,0) <= 0

    begin

    select @to_loc = null

    end

    while @to_loc is not null

    begin

    if substring(@split_string,@from_loc, @to_loc - @from_loc) <> ''

    begin

    insert into @list(seqno, SplitString)

    select isnull(max(seqno),0) + 1, substring(@split_string,@from_loc, @to_loc - @from_loc)

    from @list

    end

    select @from_loc = charindex(@deli_char,@split_string,@from_loc+len(@deli_char)) + len(@deli_char)

    select @to_loc = charindex(@deli_char,@split_string,@from_loc)

    if @to_loc = 0

    begin

    if substring(@split_string,@from_loc, (len(@split_string) - @from_loc) + len(@deli_char)) <> ''

    begin

    insert into @list(seqno, SplitString)

    select isnull(max(seqno),0) + 1, substring(@split_string,@from_loc, (len(@split_string) - @from_loc) + len(@deli_char))

    from @list

    end

    select @to_loc = null

    end

    end

    return

    end

    With regards,

    Rafidheen.M

Viewing 15 posts - 31 through 45 (of 48 total)

You must be logged in to reply to this topic. Login to reply