SubstringIndex

  • Hi,

    Can someone guide me on the best possible way to remove all the characters after a 3rd repetition of a character?

    For Example:

    I want 10.0.1600.22 to be 10.0.1600

    Everything after and including the '.' to be removed.

    I undersand Substring_Index() is not available whats the other options?

  • All values have the same format as in your sample?

    Always "dot" is a character you are after?

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • If the above assumptions are right, then you can do this:

    ;with sd(v)

    as (select '10.0.1600.22'

    union select '233.38550.10.23423.2'

    union select '2353.345.4543'

    union select '456433')

    SELECT v, LEFT(v, ISNULL(NULLIF(p,-1),LEN(v)))

    FROM sd

    CROSS APPLY (SELECT CHARINDEX('.',v,CHARINDEX('.',v,CHARINDEX('.',v)+1)+1) - 1 p) f

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • This wont perform as well as what Eugene posted but is a solid, set-based way to solve this problem dynamically...

    Using my Ngrams8k function:

    ALTER FUNCTION [dbo].[nGrams8K]

    (

    @string VARCHAR(8000),

    @n TINYINT,

    @pad BIT=0

    )

    /*--

    Created by:Alan Burstein

    Created on: 3/10/2014

    Updated on: 5/20/2014 (changed the logic to use an "inline tally table")

    9/10/2014 Added some more code examples in the comment section

    Use:Outputs a stream of tokens based on an input string.

    Works just like mdq.nGrams; see http://msdn.microsoft.com/en-us/library/ff487027(v=sql.105).aspx.

    To better understand N-Grams

    see: http://en.wikipedia.org/wiki/N-gram

    */

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    WITH

    E1(n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(n)),

    E2(n) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),

    E4(n) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b),

    iTally(n) AS

    (

    SELECT TOP (LEN(@string)+@n) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM E4

    ),

    NewString(NewString) AS

    (

    SELECTREPLICATE(CASE @pad WHEN 0 THEN '' ELSE ' ' END,@n-1)+@string+

    REPLICATE(CASE @pad WHEN 0 THEN '' ELSE ' ' END,@n-1)

    )

    SELECT n AS sequence,

    SUBSTRING(NewString,n,@n) AS token

    FROM iTally

    CROSS APPLY NewString

    WHERE n < ((@n)+LEN(@string));

    you could do something like this:

    DECLARE @string varchar(1000)='10.0.1600.22',

    @substring_index tinyint = 3;

    WITH get_pos AS

    (

    SELECT rn = row_number() over (order by sequence), substring_index = sequence

    FROM dbo.nGrams8K(@string,1,1)

    WHERE token='.'

    )

    SELECT substring(@string,1,substring_index-1)

    FROM get_pos

    WHERE rn=@substring_index;

    The cool thing about this is that you can change the @substring_index variable.

    Note: No logic is built in for when @substring_index is out of scope (e.g. @substring_index = 4 would return nothing)

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • ... I was thinking, using the logic in my previous post you could create a dbo.SubstringBeforeSSI function (examples included in the code)...

    ALTER FUNCTION dbo.SubstringBeforeSSI

    (

    @string varchar(1000),

    @substring varchar(100),

    @substring_index tinyint

    )

    /*

    DECLARE @string varchar(1000)='10.0.1600.22',

    @searchPattern varchar(100)='.',

    @substring_index tinyint = 3;

    SELECT * FROM dbo.SubstringBeforeSSI(@string,@searchPattern,@substring_index);

    GO

    DECLARE @string varchar(1000)='Line 1 text...<br/>Line 2 text...<br/>Line 3 text...<br/>Line 4 text...',

    @searchPattern varchar(100)='<br/>',

    @substring_index tinyint = 2;

    SELECT * FROM dbo.SubstringBeforeSSI(@string,@searchPattern,@substring_index);

    */

    RETURNS TABLE

    WITH SCHEMABINDING

    AS

    RETURN

    WITH get_pos AS

    (

    SELECT rn = row_number() over (order by sequence), substring_index = sequence

    FROM dbo.nGrams8K(@string,len(@substring),1)

    WHERE token=@substring

    )

    SELECT newstring = substring(@string,1,substring_index-len(@substring))

    FROM get_pos

    WHERE rn=@substring_index;

    GO

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply