SQL - SUBSTRING combined with CHARINDEX not returning desired results.

  • Hello Friends,

    I'm in need of some help here please. I'm trying to extract a certain part of a string.

    I can extract the desired part from first example below by using the using.

    RIGHT(SUBSTRING((ORGPATHTXT),CHARINDEX('/',ORGPATHTXT) - 8, 
    LEN(ORGPATHTXT) - CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',ORGPATHTXT ) - CHARINDEX('/',REVERSE(ORGPATHTXT))),8)

    The problem is that if there is a location before the last segment in the string, it throws of the results.

    FYI that example 1 works fine but the next two do not.

    Example 1: ./Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill

    Example 2: ./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook

    Example 3: ./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/77990993/0077990993/Location6/Food- Worker

    Below is the script and the results it's producing, the last screen is what I'm hoping to get.

    The third column in the results is for the extract code aka LOC.

    WITH SampleData2 (PERSONNUM, [ORGPATHTXT]) AS

    (
    SELECT 231089,'./Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill'
    UNION ALL SELECT 121564,'./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook'
    UNION ALL SELECT 999998, './Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/77990993/0077990993/Location6/Food-Worker'
    )
    SELECT PERSONNUM,ORGPATHTXT
    ,RIGHT(SUBSTRING((ORGPATHTXT),CHARINDEX('/',ORGPATHTXT) - 8,
    LEN(ORGPATHTXT) - CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',ORGPATHTXT ) - CHARINDEX('/',REVERSE(ORGPATHTXT))),8) AS LOC
    FROM SampleData2;

    CURRENT RESULTS

    1

    DESIRED RESULTS

    2

    Thank you in advance and god bless !

     

    DS

  •  

    SELECT PERSONNUM,ORGPATHTXT,LOC    
    FROM SampleData2
    CROSS APPLY (
    SELECT TOP (1) LOC
    FROM (
    SELECT TOP (2) LOC
    FROM (
    SELECT TOP (4) REVERSE(Item) AS LOC
    FROM dbo.DelimitedSplit8K(REVERSE(ORGPATHTXT), '/')
    ORDER BY ItemNumber
    ) AS derived
    WHERE LOC NOT LIKE '%[^0-9]%'
    ) AS derived2
    ORDER BY LOC DESC
    ) AS ca1

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Hi Scott,

    Thank you for replying.

    I'm getting an Invalid Object Name for dbo.DelimitedSplit8k, did I miss an step?

    Thank you Sir !

    DS

  • An alternate solution that doesn't rely on the DelimitedSplit8K:

    WITH [SampleData2] ([PERSONNUM], [ORGPATHTXT])
    AS
    (
    SELECT
    231089
    , './Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill'
    UNION ALL
    SELECT
    121564
    , './Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook'
    UNION ALL
    SELECT
    999998
    , './Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/77990993/0077990993/Location6/Food-Worker'
    )
    , [sampledata3] ([PERSONNUM], [ORGPATHTXT], [NoLocation], [ContainsLocation])
    AS
    (
    SELECT
    [SampleData2].[PERSONNUM]
    , [SampleData2].[ORGPATHTXT]
    , REPLACE( [SampleData2].[ORGPATHTXT]
    , '/Location'
    , ''
    )
    , CASE
    WHEN [SampleData2].[ORGPATHTXT] LIKE '%Location%'
    THEN 1
    ELSE 0
    END
    FROM[SampleData2]
    )
    SELECT
    [sampledata3].[PERSONNUM]
    , [sampledata3].[ORGPATHTXT]
    , CASE
    WHEN [sampledata3].[ContainsLocation] = 0
    THEN LEFT(RIGHT(SUBSTRING(([sampledata3].[ORGPATHTXT])
    , CHARINDEX( '/'
    , [sampledata3].[ORGPATHTXT]
    )
    , LEN([sampledata3].[ORGPATHTXT]) - CHARINDEX( '/'
    , REVERSE([sampledata3].[ORGPATHTXT])
    )
    ), 9), 8)
    ELSE LEFT(RIGHT(SUBSTRING( ([sampledata3].[NoLocation])
    , CHARINDEX( '/'
    , [sampledata3].[NoLocation]
    )
    , LEN([sampledata3].[NoLocation]) - CHARINDEX('/'
    , REVERSE([sampledata3].[NoLocation])
    )
    ), 10), 8)
    END AS [LOC]
    FROM[sampledata3];

    With this, I looked at the source data and saw that rows 2 and 3 had an additional "Location" field on them, so I stripped that out in the CTE.  As the number after location does not seem to be consistent, that is why I am grabbing the right 10 characters and then the left 8 of them which would strip off the number after location and the / after that number.  If location doesn't exist, then I only need to grab the right 9 characters which are the LOC number plus a / and get the left 8 of them, so I remove the /.

    The above works with the input you provided, but may need tweaks if the text "location" MAY appear elsewhere or may be a different value.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Sorry, I left out a step.

    dbo.DelimitedSplit8k is a function commonly used on this site to efficiently split a string into multiple parts based on a single-character delimiter.

    Here's a version of it you can use to create the function so that code above will run:

    SET ANSI_NULLS ON
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE FUNCTION [dbo].[DelimitedSplit8K] (
    @pString varchar(8000),
    @pDelimiter char(1)
    )
    RETURNS TABLE WITH SCHEMABINDING
    AS
    /*SELECT * FROM dbo.DelimitedSplit8K('ab/c/def/ghijklm/no/prq/////st/u//', '/')*/
    RETURN
    /*Inline CTE-driven "tally table" produces values from 0 up to 10,000: enough to cover varchar(8000).*/
    WITH E1(N) AS (SELECT N FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS Ns(N)),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    ctetally(N) AS (/* This provides the "zero base" and limits the number of rows right up front,
    for both a performance gain and prevention of accidental "overruns". */
    SELECT 0 UNION ALL
    SELECT TOP (DATALENGTH(ISNULL(@pString, 1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    ), cteStart(N1) AS ( /* This returns N+1 (starting position of each "element" just once for each delimiter). */
    SELECT t.N+1
    FROM ctetally t
    WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
    )
    /* Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found. */
    SELECT ROW_NUMBER() OVER(ORDER BY s.N1) AS ItemNumber,
    SUBSTRING(@pString, s.N1, ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1), 0) - s.N1, 8000)) AS Item
    FROM cteStart s;
    /*end of function*/
    GO

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • maybe ?

    WITH SampleData2 (PERSONNUM, [ORGPATHTXT]) AS
    (
    SELECT 231089,'./Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill'
    UNION ALL
    SELECT 121564,'./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook'
    UNION ALL
    SELECT 999998, './Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/77990993/0077990993/Location6/Food-Worker'
    )

    SELECT *
    ,CAST(N'<x>' + REPLACE(REPLACE(ORGPATHTXT,'/','~'),N'~',N'</x><x>') + N'</x>' AS XML).value('/x[8]','nvarchar(max)')
    from SampleData2

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Hi Scott,

    This works beautifully on my own system however I'm afraid to implement this on my client's Prod system unless I can back this up with enough knowledge about this solution in case their DBA starts to inquire more about the function.

    Since I know that the following query brings me the code that I need when there is no location added in the string, can't I just modify the following SQL to bring me the code when there is 'Location' added?

    RIGHT(SUBSTRING((ORGPATHTXT),CHARINDEX('/',ORGPATHTXT) - 8, 
    LEN(ORGPATHTXT) - CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',REVERSE(ORGPATHTXT))),8)

    The above SQL returns the highlighted code below from example 1.

    Example 1: ./Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill

    Can I edit the same SQL to bring the desired code whenever there is a 'Location' in the string? So I can use a CASE statement to say if ORGPATHTXT LIKE ('%Location%') then use similar looking SQL query as shown above else use

    Example 2: ./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook.

    In other words, what changes can I make to the following SQL to give me 68990001 from this string:

    ./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook

    RIGHT(SUBSTRING((ORGPATHTXT),CHARINDEX('/',ORGPATHTXT) - 8, LEN(ORGPATHTXT) - CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',ORGPATHTXT )- CHARINDEX('/',REVERSE(ORGPATHTXT))),8)

    Sorry to make this so complicated 🙁

    Than you !

    DS

  • WITH SampleData2 (PERSONNUM, [ORGPATHTXT]) AS
    (
    SELECT 231089,'./Company/20-00001/120---30-00006/40-00506/50-99506/60-29145/97455001/0097455001/Fire-Grill'
    UNION ALL
    SELECT 121564,'./Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/68990001/0068990001/Location3/Burger-Cook'
    UNION ALL
    SELECT 999998, './Company/20-00001/120---30-00006/40-00073/50-00075/60-26275/77990993/0077990993/Location6/Food-Worker'
    )
    SELECT PERSONNUM,ORGPATHTXT, T.LOC
    FROM SampleData2
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,1)+1)) P1(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P1.Pos)+1)) P2(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P2.Pos)+1)) P3(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P3.Pos)+1)) P4(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P4.Pos)+1)) P5(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P5.Pos)+1)) P6(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P6.Pos)+1)) P7(Pos)
    CROSS APPLY(VALUES (CHARINDEX('/',ORGPATHTXT,P7.Pos)+1)) P8(Pos)
    CROSS APPLY(VALUES (SUBSTRING(ORGPATHTXT,P7.Pos,P8.Pos-P7.Pos-1))) T(LOC);
  • Sure, you can do that.  But I can't really follow the logic of the the original statement -- particularly the logic of starting from a negative starting byte, so I can't help you with that.  I also strongly prefer not to hard-code restrictions for certain "magic" strings in my code, but instead to allow for any reasonable data that comes in to be correctly processed.

    If you want to limit your logic to check for the string 'location' and counting on that always staying true, then you can adjust the code to use that to pull the SUBSTRING.  J Livingston SQL posted a proposed solution along exactly those lines.  You should review it and see it helps you.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply