Parse a string help

  • I have a string that I need to parse. The text will vary but the format always the same.

    Example Text:    Hello5-E-100

    I need this parsed into two variables.

    First being Hello5-E  and the second simply 100

    Can someone please assist with this? I have the following code that creates  (HELLO5  and second as E100)

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'

    -- HOTEL5-E  100

    SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'')))
    ,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

    <hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

  • A little Closer now just need to move the 'E' over and I will be set... HELP!  lol

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'

    -- HOTEL5-E  100

    SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
    ,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

    <hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

  • Jeffery Williams - Tuesday, March 6, 2018 1:02 PM

    A little Closer now just need to move the 'E' over and I will be set... HELP!  lol

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'

    -- HOTEL5-E  100

    SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
    ,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

    Something like this?

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100';

    SELECT LEFT(@VendItemNumber,LEN(@VendItemNumber) - CHARINDEX('-',REVERSE(@VendItemNumber))), RIGHT(@VendItemNumber,CHARINDEX('-',REVERSE(@VendItemNumber)) - 1);

  • Lynn Pettis - Tuesday, March 6, 2018 1:55 PM

    Jeffery Williams - Tuesday, March 6, 2018 1:02 PM

    A little Closer now just need to move the 'E' over and I will be set... HELP!  lol

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100'

    -- HOTEL5-E  100

    SELECT LTRIM(RTRIM(REPLACE(LEFT(@VendItemNumber, CHARINDEX('-', @VendItemNumber)),'-' ,'-')))
    ,LTRIM(RTRIM(REPLACE(SUBSTRING(@VendItemNumber, CHARINDEX('-', @VendItemNumber), LEN(@VendItemNumber) - CHARINDEX('-', @VendItemNumber) + 1),'-' ,'')))

    Something like this?

    DECLARE @VendItemNumber varchar(100) = 'HELLO5-E-100';

    SELECT LEFT(@VendItemNumber,LEN(@VendItemNumber) - CHARINDEX('-',REVERSE(@VendItemNumber))), RIGHT(@VendItemNumber,CHARINDEX('-',REVERSE(@VendItemNumber)) - 1);

    Thank you so much

    <hr noshade size=1 width=250 color=#BBC8E5> Regards,Jeffery Williams http://www.linkedin.com/in/jwilliamsoh

  • My pleasure.

  • Some code formatting food for thought...
    For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
    CHARINDEX('-',REVERSE(@VendItemNumber)

    You can simplify the code using the Table Value constructor like this:
    SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1)
    FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);
     

    For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.

    There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:

    -- sample data 
    CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2));
    INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4);
    GO

    DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90;
    -- version 1 with repeated formula
    SELECT TOP (@topPct) PERCENT
    someid,
        somevalue,
        someCalculation =
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END,
    someRank = dense_rank() OVER (ORDER BY
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)
    FROM #sometable
    WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END
    BETWEEN 900 AND 2000
    ORDER BY -- simulate another event that causes a sort
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;

    -- version 2 with inline alias
    SELECT TOP (@topPct) PERCENT
    someid,
    somevalue,
    someCalculation = itvf.result,
    someRank   = dense_rank() OVER (ORDER BY itvf.result)
    FROM #sometable
    CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3
         ELSE @var3+somevalue END)) itvf(result)
    WHERE itvf.result between 900 and 2000
    ORDER BY itvf.result;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B - Wednesday, March 7, 2018 9:11 AM

    Some code formatting food for thought...
    For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
    CHARINDEX('-',REVERSE(@VendItemNumber)

    You can simplify the code using the Table Value constructor like this:
    SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1)
    FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);
     

    For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.

    There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:

    -- sample data 
    CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2));
    INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4);
    GO

    DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90;
    -- version 1 with repeated formula
    SELECT TOP (@topPct) PERCENT
    someid,
        somevalue,
        someCalculation =
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END,
    someRank = dense_rank() OVER (ORDER BY
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)
    FROM #sometable
    WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END
    BETWEEN 900 AND 2000
    ORDER BY -- simulate another event that causes a sort
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;

    -- version 2 with inline alias
    SELECT TOP (@topPct) PERCENT
    someid,
    somevalue,
    someCalculation = itvf.result,
    someRank   = dense_rank() OVER (ORDER BY itvf.result)
    FROM #sometable
    CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3
         ELSE @var3+somevalue END)) itvf(result)
    WHERE itvf.result between 900 and 2000
    ORDER BY itvf.result;

    Actually, thanks for showing this as it goes right along with the DRY principle.  I just need to work harder and using it.

  • Lynn Pettis - Wednesday, March 7, 2018 9:48 AM

    Alan.B - Wednesday, March 7, 2018 9:11 AM

    Some code formatting food for thought...
    For cases where I need to repeat the same expression multiple times you can create on "on-the-fly" inline function to simplify your code. In Lynn's code this expression is repeated a couple times:
    CHARINDEX('-',REVERSE(@VendItemNumber)

    You can simplify the code using the Table Value constructor like this:
    SELECT LEFT(f.vi,LEN(f.vi) - f.pos), RIGHT(f.vi, f.pos - 1)
    FROM (VALUES (CHARINDEX('-',REVERSE(@VendItemNumber)), @VendItemNumber)) f(pos,vi);
     

    For situations with longer expressions that are called several times you will see much cleaner code and without any performance penalty. In fact, there are times where you will see a performance improvement in cases where the optimizer is not smart enough to realize that it only needs to calculate the same expression once.

    There is no performance penalty for doing this, actually - there are times where you will see a performance improvement. For example, run this will "include actual execution plan" turned on:

    -- sample data 
    CREATE TABLE #sometable(someid int identity, somevalue decimal(10,2));
    INSERT #sometable(somevalue) VALUES (100),(1050),(5006),(111),(4);
    GO

    DECLARE @var1 int = 100, @var2 int = 50, @var3 int = 900, @topPct tinyint = 90;
    -- version 1 with repeated formula
    SELECT TOP (@topPct) PERCENT
    someid,
        somevalue,
        someCalculation =
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END,
    someRank = dense_rank() OVER (ORDER BY
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END)
    FROM #sometable
    WHERE CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END
    BETWEEN 900 AND 2000
    ORDER BY -- simulate another event that causes a sort
      CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3 ELSE @var3+somevalue END;

    -- version 2 with inline alias
    SELECT TOP (@topPct) PERCENT
    someid,
    somevalue,
    someCalculation = itvf.result,
    someRank   = dense_rank() OVER (ORDER BY itvf.result)
    FROM #sometable
    CROSS APPLY (VALUES(CASE WHEN @var3 < somevalue THEN (@var1/(@var2*2.00))+@var3
         ELSE @var3+somevalue END)) itvf(result)
    WHERE itvf.result between 900 and 2000
    ORDER BY itvf.result;

    Actually, thanks for showing this as it goes right along with the DRY principle.  I just need to work harder and using it.

    😀

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jeffery Williams - Tuesday, March 6, 2018 12:59 PM

    I have a string that I need to parse. The text will vary but the format always the same.

    Example Text:    Hello5-E-100

    I need this parsed into two variables.

    First being Hello5-E  and the second simply 100

    You said the format will always be the same.
    Does that mean it will always be 6 a/n chars, a dash, a char, another dash, 3 numbers?
    Or it will always be a string of varying length containing 2 dashes somewhere?
    If the latter, simply using LEFT and RIGHT would suffice.  But I suspect that's not the answer or you would have thought of it yourself - so see the other replies.
  • If the format will always be delimited by a dash - then you can easily utilize a string split utility.  A normal string split will split the results into separate rows - but a modified utility can be built to return a specific element from a string.  Using something like that - this becomes:


    Declare @testTable Table (InputString varchar(100));

    Insert Into @testTable
    Values ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001');

     Select concat(p1.Item, '-', p2.Item) As Part1
          , p3.Item As Part2
       From @testTable  t
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 2) As p2
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 3) As p3;

    Here is the function...


     Create Function [dbo].[fnGetStringElement] (
            @pString varchar(8000)
          , @pDelimiter char(1)
          , @pElement int)
    Returns Table
       With Schemabinding
         As
     Return

       With e1(n)
         As ( --=== Create Ten 1s
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1       --10
            )
          , e2(n) As (Select 1 From e1 a, e1 b)  -- 100
          , e3(n) As (Select 1 From e2 a, e2 b)  -- 10,000
          , cteTally (Number)
         As (
     Select Top (datalength(isnull(@pString, 0)))
            row_number() over(Order By (Select Null))
       From e3
            )
          , cteStart(n1)
         As (
     Select 1
      Union All
     Select t.Number + 1
       From cteTally t
      Where substring(@pString, t.Number, 1) = @pDelimiter
            )
          , cteEnd (n1, l1)
         As (
     Select s.n1
          , coalesce(nullif(charindex(@pDelimiter, @pString, s.n1), 0) - s.n1, 8000)
       From cteStart s
            )
          , cteSplit   --==== Do the split
         As (
     Select row_number() over(Order By e.n1) As ItemNumber
          , substring(@pString, e.n1, e.l1) As Item
       From cteEnd e
            )
     Select ltrim(rtrim(Item)) As Item
       From cteSplit
      Where ItemNumber = @pElement;
    Go

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Jeffrey Williams 3188 - Friday, March 9, 2018 1:20 PM

    If the format will always be delimited by a dash - then you can easily utilize a string split utility.  A normal string split will split the results into separate rows - but a modified utility can be built to return a specific element from a string.  Using something like that - this becomes:


    Declare @testTable Table (InputString varchar(100));

    Insert Into @testTable
    Values ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001');

     Select concat(p1.Item, '-', p2.Item) As Part1
          , p3.Item As Part2
       From @testTable  t
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 2) As p2
    Cross Apply dbo.fnGetStringElement(t.InputString, '-', 3) As p3;

    Here is the function...


     Create Function [dbo].[fnGetStringElement] (
            @pString varchar(8000)
          , @pDelimiter char(1)
          , @pElement int)
    Returns Table
       With Schemabinding
         As
     Return

       With e1(n)
         As ( --=== Create Ten 1s
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1 Union All
     Select 1 Union All Select 1       --10
            )
          , e2(n) As (Select 1 From e1 a, e1 b)  -- 100
          , e3(n) As (Select 1 From e2 a, e2 b)  -- 10,000
          , cteTally (Number)
         As (
     Select Top (datalength(isnull(@pString, 0)))
            row_number() over(Order By (Select Null))
       From e3
            )
          , cteStart(n1)
         As (
     Select 1
      Union All
     Select t.Number + 1
       From cteTally t
      Where substring(@pString, t.Number, 1) = @pDelimiter
            )
          , cteEnd (n1, l1)
         As (
     Select s.n1
          , coalesce(nullif(charindex(@pDelimiter, @pString, s.n1), 0) - s.n1, 8000)
       From cteStart s
            )
          , cteSplit   --==== Do the split
         As (
     Select row_number() over(Order By e.n1) As ItemNumber
          , substring(@pString, e.n1, e.l1) As Item
       From cteEnd e
            )
     Select ltrim(rtrim(Item)) As Item
       From cteSplit
      Where ItemNumber = @pElement;
    Go

    No. 

    Using a splitter for this is kind of like pole vaulting over mouse turds. Next, if you understand what DelimitedSplit8K.ItemNumber does then you don't need a function named fnGetStringElement based on DelimitedSplit8K. If you were to use a splitter you could just do this:
    SELECT
      Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)),
      Part2 = Item
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    WHERE ItemNumber = 3

    What Lynn posted is the way to go. Another efficient way (just for fun):
    SELECT
      part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)),
      part2 = PARSENAME(clean.string,1)
    FROM @testTable t
    CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan.B - Friday, March 9, 2018 5:53 PM

    No. 

    Using a splitter for this is kind of like pole vaulting over mouse turds. Next, if you understand what DelimitedSplit8K.ItemNumber does then you don't need a function named fnGetStringElement based on DelimitedSplit8K. If you were to use a splitter you could just do this:
    SELECT
      Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)),
      Part2 = Item
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    WHERE ItemNumber = 3

    What Lynn posted is the way to go. Another efficient way (just for fun):
    SELECT
      part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)),
      part2 = PARSENAME(clean.string,1)
    FROM @testTable t
    CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

    Here is a thought - change the specification slightly and how does your solution work?  For example, instead of the first 3 elements of the string let's say we want to return the 2nd, 3rd and 5th elements.  Or maybe we want the 1st, 3rd, 5th or any other elements that are non-contiguous?  Your first solution won't work because you cannot eliminate the 1st or 4th elements of the string without additional code (charindex to locate the first dash but then how do you eliminate the 4th element?) - your second solution won't work because PARSENAME only works with 4 elements in a string.

    How would you code for this: Return the 2nd through 10th pipe-delimited elements as separate fields from the 1st underscore element, the first and second caret (^) elements from the 6th pipe-delimited element of the second underscore element and the 4th caret element of the 11th pipe-delimited element of the second underscore element?  Replace the underscores with CHAR(13) and add a whole lot more segments and you have a standard HL7 message.

    Lynn's solution works - but has the same problem as it only addresses one specific requirement and cannot easily be adapted to resolve a slightly different requirement.

    DelimitedSplit8K could be utilized to return single elements - but then you are doing the exact same thing I am doing in fnGetStringElement so I really see no difference other than the fact that for fnGetStringElement I don't need to specify the ItemNumber in the where clause to eliminate the other rows returned.

    SELECT ...
    FROM Table t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p1
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p2
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') As p3
    WHERE p1.ItemNumber = 2
    AND p2.ItemNumber = 3
    AND p3.ItemNumber = 5

    SELECT ...
    FROM Table t
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p2
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 5) As p3

    With that said - I really don't understand why you had to be so rude with your response...

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • My apologies if I came off as rude. I was just trying to point out that what you posted was overkill for the requirement. I write must faster code than I used to because people on this forum have, for years, pointed out what my code could be optimized. The OP stated: 

    Example Text: Hello5-E-100
    I need this parsed into two variables.
    First being Hello5-E and the second simply 100

    Let's assume we're dealing with two hyphens - everything before the last hyphen is part1, the text after is part2. In that scenario, the four solutions presented thusfar work: 

    Sample data
    IF OBJECT_ID('tempdb..#testTable') IS NOT NULL DROP TABLE #testTable;
    CREATE TABLE #testTable(InputString varchar(100));
    INSERT #testTable VALUES ('HELLO5-E-100'), ('Hotel5-E-101'), ('BYE5-F-001');
    GO

    -- fnGetStringElement solution
    SELECT
    part1 = concat(p1.Item, '-', p2.Item),
    Part2 = p3.Item
    FROM #testTable t
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 2) As p2
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p3;

    -- Alan Splitter Solution
    SELECT
    Part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)),
    Part2 = Item
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    WHERE ItemNumber = 3

    -- Lynn
    SELECT
    part1 = LEFT(t.InputString,LEN(t.InputString) - CHARINDEX('-',REVERSE(t.InputString))),
    part2 = RIGHT(t.InputString,CHARINDEX('-',REVERSE(t.InputString)) - 1)
    FROM #testTable t

    -- PARSENAME
    SELECT
    part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)),
    part2 = PARSENAME(clean.string,1)
    FROM #testTable t
    CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);

    To understand where I was coming from let's start with the execution plans:

    fnGetStringElement solution

    This solution does three itvf function calls then three nested loop joins to merge the data back together.

    Alan Splitter Solution

    The est. subtree cost of this soution is roughtly 1/3rd (0.022 vs 0.06 on my PC). This is due to the fact that I am making only one itvf call vs. three. Lynn's solution and my PARSENAME solution produce the same plan:

    Now a 1,000,000 row test harness
    INSERT #testTable
    SELECT TOP (1000000) LEFT(newid(),17)
    FROM sys.all_columns a, sys.all_columns b;

    PRINT 'fnGetStringElement solution'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100);
    SELECT
    @part1 = concat(p1.Item, '-', p2.Item),
    @Part2 = p3.Item
    FROM #testTable t
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 2) As p2
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p3;
    PRINT datediff(MS, @st, getdate());
    GO 3

    PRINT 'Alan Splitter Solution'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100);
    SELECT
    @part1 = SUBSTRING(t.inputString, 0, LEN(t.inputString)-LEN(s.item)),
    @part2 = Item
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    WHERE ItemNumber = 3;
    PRINT datediff(MS, @st, getdate());
    GO 3

    PRINT 'LYNN'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100);
    SELECT
    @part1 = LEFT(t.InputString,LEN(t.InputString) - CHARINDEX('-',REVERSE(t.InputString))),
    @part2 = RIGHT(t.InputString,CHARINDEX('-',REVERSE(t.InputString)) - 1)
    FROM #testTable t
    PRINT datediff(MS, @st, getdate());
    GO 3

    PRINT 'Parsename'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @part1 varchar(100), @part2 varchar(100);
    SELECT
    @part1 = CONCAT(PARSENAME(clean.string,3),'-', PARSENAME(clean.string,2)),
    @part2 = PARSENAME(clean.string,1)
    FROM #testTable t
    CROSS APPLY (VALUES (REPLACE(t.inputString,'-','.'))) clean(string);
    PRINT datediff(MS, @st, getdate());
    GO 3

    Results;
    fnGetStringElement solution
    --------------------------------------------------
    Beginning execution loop
    4424
    2470
    2820
    Batch execution completed 3 times.
    Alan Splitter Solution
    --------------------------------------------------
    Beginning execution loop
    990
    960
    956
    Batch execution completed 3 times.
    LYNN
    --------------------------------------------------
    Beginning execution loop
    637
    630
    626
    Batch execution completed 3 times.
    Parsename
    --------------------------------------------------
    Beginning execution loop
    1510
    1520
    1547
    Batch execution completed 3 times.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan - I agree for this specific scenario the other options will perform better.  However, you didn't answer the questions I asked - what if the requirements or data change?

    The problem with Lynn's solution is that it cannot be utilized for anything other than splitting into 2 columns.  If the requirement changes and the users now want the data split into 3 columns this solution doesn't work.  If the users decide to add another element to the beginning - or add additional elements to the end but they still want the original 2 columns - this solution will not work.  If anything changes (and it generally does) - this will require a complete code rewrite and testing.

    The problem with PARSENAME is that it only works as long as there are no more than 4 elements in the string.  As soon as we get a fifth element in the string PARSENAME no longer works at all.  The same applies to this solution - once the requirements change it no longer works and will require a complete rewrite and testing of the new process.

    What is very interesting here is that using DelimitedSplit8K actually performs worse than using fnGetStringElement.  This test does not include Lynn's solution or PARSENAME - but includes 2 versions of DelimitedSplit8K.  The first version uses 3 CROSS APPLY's to get each individual element - the second version uses one CROSS APPLY and a CROSS TAB to get the results.

    SET NOCOUNT ON;
     
    IF OBJECT_ID('tempdb..#testTable', 'U') IS NOT NULL
     DROP TABLE #testTable;

    SELECT TOP (1000000) cast(newid() As varchar(255)) As InputString
    INTO #TESTTABLE
    FROM sys.all_columns a, sys.all_columns b;

    PRINT 'fnGetStringElement solution'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100);
    SELECT
    @Input = t.InputString,
    @part1 = p1.Item,
    @part3 = p2.Item,
    @Part5 = p3.Item
    FROM #testTable t
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 1) As p1
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 3) As p2
    CROSS APPLY dbo.fnGetStringElement(t.InputString, '-', 5) As p3;
    PRINT datediff(MS, @st, getdate());
    GO 3

    PRINT 'Alan Splitter Solution'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100);
    SELECT
    @Input = t.InputString,
    @part1 = s.Item,
    @part3 = s3.Item,
    @part5 = s5.Item
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s3
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s5
    WHERE s.ItemNumber = 1
    AND s3.ItemNumber = 3
    AND s5.ItemNumber = 5;
    PRINT datediff(MS, @st, getdate());
    GO 3

    PRINT 'Modified Alan Splitter Solution'+char(10)+replicate('-',50);
    GO
    DECLARE @st datetime = getdate(), @Input varchar(255), @part1 varchar(100), @part3 varchar(100), @part5 varchar(100);
    SELECT
    @Input = t.InputString,
    @part1 = max(Case When s.ItemNumber = 1 Then s.Item End),
    @part3 = max(Case When s.ItemNumber = 3 Then s.Item End),
    @part5 = max(Case When s.ItemNumber = 5 Then s.Item End)
    FROM #testTable t
    CROSS APPLY dbo.DelimitedSplit8K(t.InputString, '-') s
    GROUP BY t.InputString;
    PRINT datediff(MS, @st, getdate());
    GO 3

    The results of my test are:

    fnGetStringElement solution
    --------------------------------------------------
    Beginning execution loop
    4570
    4913
    4743
    Batch execution completed 3 times.
    Alan Splitter Solution
    --------------------------------------------------
    Beginning execution loop
    7453
    7480
    7466
    Batch execution completed 3 times.
    Modified Alan Splitter Solution
    --------------------------------------------------
    Beginning execution loop
    Warning: Null value is eliminated by an aggregate or other SET operation.
    24540
    Warning: Null value is eliminated by an aggregate or other SET operation.
    24810
    Warning: Null value is eliminated by an aggregate or other SET operation.
    24503
    Batch execution completed 3 times.

    All I can say is - wow...the cross tab version is extremely bad and the DelimitedSplit8K version is a bit slower than the fnGetStringElement version.  So - given the specific scenario from the OP then either Lynn's solution of PARSENAME would be best - when and only when it is guaranteed that the requirements will never change.  If the requirements change - even slightly - then another solution would be required.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • If the requirements change then obviously the code has to change.  Sometimes you just need what works rather than trying to code to a bunch of what ifs that may never occur.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply