String to rows

  • Performance Guard (Shehap)

    Ten Centuries

    Points: 1297

    More Simple+ specific code for that business case ,you could use the below TVF

    alter function dbo.Returnvalues (@string VARCHAR (50))

    returns @TEMPTBALE table(newstring varchar (50))

    as

    begin

    declare @value varchar (50)

    DECLARE @newstring varchar (50)=@string

    while (LEN(@newstring)>0)

    begin

    set @value= isnull (SUBSTRING (@newstring , 1,CHARINDEX ('#', @newstring)-1),'')

    insert into @TEMPTBALE values (@value)

    select @newstring=SUBSTRING (@newstring, CHARINDEX ('#', @newstring)+1,LEN(@newstring)- CHARINDEX ('#', @newstring)+1)

    end

    return

    end

    However , you have to pay high cautions while using such TVF that within joins columns or even select columns for much data entity ..

    Hence now , you could have 2 workable solutions :

    1. Either to create temp table to insert all results in a temp table created with Create table command not declare table commands to get advantage of temp clustered /non clustered indexes there as well and so we could avoid table scans exist.

    2. Using persisted deterministic computed columns ,but it should be enumerated the required number of delimiters to estimate no of computed columns as well.

    [font="Arial Black"]Performance Guard[/font]

    [font="Courier New"]Just ask me to get 0 sec for all queries[/font]

  • yubo1

    SSChasing Mays

    Points: 612

    HI Jeff.Thanks a lot

    Jeff Moden (10/8/2011)


    yubo1 (10/8/2011)


    create table test

    (

    string varchar(max)

    )

    insert into test

    select 'A#B#C'

    select SUBSTRING(string+'#',number,charindex('#',string+'#',number)-number) from

    test,

    (select number from master..spt_values where number>=1 and type='P')t

    where SUBSTRING('#'+string,number,1)='#'

    Yubo1,

    That method is the "old" method of splitting that uses concatenated delimiters. It's very fast for a small number of elements in the string but rapidly degrades with string length. Please see the following article for why you shouldn't use the method you posted and a much faster (closest yet to a decent CLR) methods as an alternative.

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

  • Jeff Moden

    SSC Guru

    Points: 995608

    Performace Guard (Shehap) (10/9/2011)


    More Simple+ specific code for that business case ,you could use the below TVF

    alter function dbo.Returnvalues (@string VARCHAR (50))

    returns @TEMPTBALE table(newstring varchar (50))

    as

    begin

    declare @value varchar (50)

    DECLARE @newstring varchar (50)=@string

    while (LEN(@newstring)>0)

    begin

    set @value= isnull (SUBSTRING (@newstring , 1,CHARINDEX ('#', @newstring)-1),'')

    insert into @TEMPTBALE values (@value)

    select @newstring=SUBSTRING (@newstring, CHARINDEX ('#', @newstring)+1,LEN(@newstring)- CHARINDEX ('#', @newstring)+1)

    end

    return

    end

    However , you have to pay high cautions while using such TVF that within joins columns or even select columns for much data entity ..

    Hence now , you could have 2 workable solutions :

    1. Either to create temp table to insert all results in a temp table created with Create table command not declare table commands to get advantage of temp clustered /non clustered indexes there as well and so we could avoid table scans exist.

    2. Using persisted deterministic computed columns ,but it should be enumerated the required number of delimiters to estimate no of computed columns as well.

    I never mind learning something new... Please show us how the use of a While Loop in a Scalar UDF or even an mTVF would be faster than a properly formed "cteTally/Table" based splitter in an iTVF. You can find the testing I did at the following URL... http://www.sqlservercentral.com/articles/Tally+Table/72993/

    For those too busy to actually read the article, here's the performance footprint... The skinny little black line labeled as "????" is the new cteTally/Table based splitter...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 995608

    yubo1 (10/9/2011)


    HI Jeff.Thanks a lot

    You bet... thank you for taking it the right way.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Iulian -207023

    SSCertifiable

    Points: 7508

    Jeff Moden (10/8/2011)


    Iulian -207023 (10/8/2011)


    Thank you all,

    Do you have at hand a sample code on how to create the CLR function for splitting a string into rows / fields? I am not familiar with CLR functions, yet 🙂

    I need to parse each line using regular expression, at least this is what I have in mind now.

    After reading your recommendations I have the feeling regular expression using CLR is the key of my exercise. I appreciate any help / sample on this.

    A sample:

    If the input is:

    A # 2 # 09/20/2011

    X # 09/20/2011 & 123.00

    A # 3 # 10/07/2011

    the result will have 3 rows, each row having 4 fields:

    Record_Name ID Date Value

    A 2 09/20/2011 0

    X 0 09/20/2011 123.00

    A 3 10/07/2011 0

    Thanks,

    Iulian

    You need to read the code attached to the article I pointed you to... it's in there. And, no... using regular expressions for split probably won't be faster than a CLR written specifically to do splits.

    Personally, I think enabling CLR just for the sake of a splitter is overkill. If it's already enabled, the the CLR splitter code I've just pointed you to will do the job nicely. You do have to make up your mind what you want to use for a delimiter, though. 😉

    Now I see it, they are attached at the bottom of the article.

    Thanks a lot,

    Iulian

Viewing 5 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply