Search within a string and return value(s)

  • Hello,

    I have this test table:

    CREATE TABLE #T2 ( VARCHAR(100))
    INSERT INTO #T2 ()
    VALUES
    ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATOOBS&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])&filter=(a_dimmable_t_fq:"Yes")',

    ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"Euro%20Litny")&filter=(a_voltage_t_fq:"120")'),

    ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"LifeWerks")&filter=(a_lens_t_fq:"Frosted%20carbonate")'),

    ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"GEICO")&filter=(a_dimmable_t_fq:"No")'),

    ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])')

    ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science"%20OR%20a_brand_t_fq:"MaxLite")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])')

    ------
    *There maybe a slight syntax error in above create values..

    What I'm wanting if possible, is look at each string, only in section  (
    where a_brand_t_fq:  , and return the unique value by parsing out the brand section, below is a sample expected result:

    i.e.
    Euro Litny
    BrandLoonry
    Lifewerks
    GEICO
    Lighting Science
    MaxLite

    Thanks

  • Quick suggestion
    😎

    SELECT
    CHARINDEX('a_brand_t_fq:"',U.,1) + 14 AS PAT_START
    ,CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14) AS PAT_END
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.,14) AS PAT_LEN
    ,SUBSTRING(U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14
     ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14))
      - (CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) AS PAT_STR
    ,U.
    FROM #T2 U;

  • Thanks Eirikur Eiriksson. I should have said this is sample data, have millions of records, with various values in that section. Will this work across all, it appears it should:

    The actual table I have is below. I've highlighted in bold the column where data resides (address),

    but get the following error:Incorrect syntax near 'a_brand_t_fq:"'.

    SELECT [Address]
    FROM [dbo].[test_parse_brand_from_url]
    where
    CHARINDEX('a_brand_t_fq:"',U.,1) + 14 AS PAT_START
    ,CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14) AS PAT_END
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.,14) AS PAT_LEN
    ,SUBSTRING(U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14))
     - (CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) AS PAT_STR
    ,U.

  • VegasL - Wednesday, April 18, 2018 6:59 AM

    Thanks Eirikur Eiriksson. I should have said this is sample data, have millions of records, with various values in that section. Will this work across all, it appears it should:

    The actual table I have is below. I've highlighted in bold the column where data resides (address),

    but get the following error:Incorrect syntax near 'a_brand_t_fq:"'.

    SELECT [Address]
    FROM [dbo].[test_parse_brand_from_url]
    where
    CHARINDEX('a_brand_t_fq:"',U.,1) + 14 AS PAT_START
    ,CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14) AS PAT_END
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.,14) AS PAT_LEN
    ,SUBSTRING(U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14))
     - (CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) AS PAT_STR
    ,U.

    Put the query I posted in a CTE and then do the predicates.
    😎

  • noob on sql. no idea on cte. can you perhaps shed more info or perhaps put those values in sql query? also not sure what the original error in temp table i created.. thx

  • VegasL - Wednesday, April 18, 2018 8:16 AM

    noob on sql. no idea on cte. can you perhaps shed more info or perhaps put those values in sql query? also not sure what the original error in temp table i created.. thx

    What kind of predicates are you using (where clause conditions)?
    😎

    The CTE syntax is like this:

    ;WITH [CTE NAME] ([OUTPUT COLUMN NAMES]) AS
    (
      [SELECT STATEMENT]
    )
    SELECT 
        [COLUMN LIST]
    FROM [CTE NAME] [ALIAS]

  • ok this is what i came up with:

    ;with [my cte test] ([Address]) AS
    (SELECT [Address]
    FROM [dbo].[test_parse_brand_from_url]
    where
    CHARINDEX('a_brand_t_fq:"',U.,1) + 14 AS PAT_START
    ,CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14) AS PAT_END
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.,14) AS PAT_LEN
    ,SUBSTRING(U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14
    ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14))
    - (CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) AS PAT_STR
    ,U.
    )

    select [address]
    from [my cte test] as [sample cte test]

    --- get error... Msg 156, Level 15, State 1, Line 5
    Incorrect syntax near the keyword 'AS'.

  • I'm not sure why you are adding Eirikur's script as the WHERE clause, because it's a SELECT as opposed to any kind of filter.   I corrected the sample table data syntax errors and ran it, and it appears you want to find ALL the possible values found for the a_brand_t_fq designation within each URL string.   The script Eirikur provided isn't going to do that.   You'd probably need an inline table-valued function to parse each string and provide values, from which you could then SELECT DISTINCT.   I don't have the time right now to work on such a function, but it needs to take a similar approach to Jeff Moden's DelimitedSplit8K function.   Search the Articles section of this site for "Tally Oh", and the first article that pops up with that in the Title is the one.  The code is at the very end of the article, but you'll want to read the whole thing so that you can become familiar with its methodology.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Here's the cleaned up sample data for anyone looking to work on the function:CREATE TABLE #T2 (
         varchar(1000)
    );
    INSERT INTO #T2 ()
    VALUES    ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATOOBS&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])&filter=(a_dimmable_t_fq:"Yes")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"Euro%20Litny")&filter=(a_voltage_t_fq:"120")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"LifeWerks")&filter=(a_lens_t_fq:"Frosted%20carbonate")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"GEICO")&filter=(a_dimmable_t_fq:"No")'),
            ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])'),
            ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science"%20OR%20a_brand_t_fq:"MaxLite")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])');

    SELECT DISTINCT
        --CHARINDEX('a_brand_t_fq:"',U.,1) + 14 AS PAT_START
        --,CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14) AS PAT_END
        --,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.,14) AS PAT_LEN
        SUBSTRING(U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14
            ,(CHARINDEX('"',U.,CHARINDEX('a_brand_t_fq:"',U.,1) + 14))
            - (CHARINDEX('a_brand_t_fq:"',U.,1) + 14)) AS PAT_STR
        --,U.
    FROM #T2 U;

    DROP TABLE #T2;

    I also included a SELECT DISTINCT based on Eirikur's code, just so I could see what it was returning, and it was missing some values.   As there may not be any consistency to how many of them there might be in a given URL, that's why you'll need an inline table-valued function.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thanks Erik / Steve,

    I used your query from and it definitely worked in temp table, but when i replaced values in actual table, i got

    Error: Invalid length parameter passed to the LEFT or SUBSTRING function.

    Here is what I have:

    SELECT DISTINCT
      --CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14 AS PAT_START
      --,CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14) AS PAT_END
      --,(CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.[Address],14) AS PAT_LEN
      SUBSTRING(U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14
       ,(CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14))
       - (CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14)) AS PAT_STR
      --,U.[Address]
    FROM [dbo].[test_parse_brand_from_url] U;

    *My column name is : Address
    The data resides in this column.

    Table Name is: test_parse_brand_from_url

    Any additional help would be most appreciated.

  • VegasL - Wednesday, April 18, 2018 12:43 PM

    Thanks Erik / Steve,

    I used your query from and it definitely worked in temp table, but when i replaced values in actual table, i got

    Error: Invalid length parameter passed to the LEFT or SUBSTRING function.

    Here is what I have:

    SELECT DISTINCT
      --CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14 AS PAT_START
      --,CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14) AS PAT_END
      --,(CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14)) - CHARINDEX('a_brand_t_fq:"',U.[Address],14) AS PAT_LEN
      SUBSTRING(U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14
       ,(CHARINDEX('"',U.[Address],CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14))
       - (CHARINDEX('a_brand_t_fq:"',U.[Address],1) + 14)) AS PAT_STR
      --,U.[Address]
    FROM [dbo].[test_parse_brand_from_url] U;

    *My column name is : Address
    The data resides in this column.

    Table Name is: test_parse_brand_from_url

    Any additional help would be most appreciated.

    Tried to create a custom function, but quickly abandoned the idea when I came across a way to make it work with Jeff Moden's DelimitedSplit8K function.   Here's the code:CREATE TABLE #T2 (
        ID int IDENTITY(1,1) NOT NULL,
      varchar(1000)
    );
    INSERT INTO #T2 ()
    VALUES    ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATOOBS&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])&filter=(a_dimmable_t_fq:"Yes")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"Euro%20Litny")&filter=(a_voltage_t_fq:"120")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"LifeWerks")&filter=(a_lens_t_fq:"Frosted%20carbonate")'),
            ('https://www.abc123.com/search?facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"GEICO")&filter=(a_dimmable_t_fq:"No")'),
            ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])'),
            ('https://www.abc123.com/search?breadcrumbs%5B0%5D=red-big-widget&breadcrumbs%5B1%5D=red-long-big-widgets&cc=LUNERATUBES&facet.multiselect=true&page=1&q=*&rows=15&son=0&sort=price+asc&start=0&filter=(category:"12897")&filter=(a_brand_t_fq:"BrandLoonry"%20OR%20a_brand_t_fq:"Lighting%20Science"%20OR%20a_brand_t_fq:"MaxLite")&filter=(a_length_d_fq:[12%20TO%2024]%20OR%20a_length_d_fq:[24%20TO%2036]%20OR%20a_length_d_fq:[36%20TO%2048])');

    SELECT DISTINCT
        STUFF(
            S3.ItemStart,
            CHARINDEX('"', S3.ItemStart),
            LEN(S3.ItemStart) - CHARINDEX('"', S3.ItemStart) + 1,
            '') AS Brand
    FROM #T2 AS T
        CROSS APPLY SSA_Master.dbo.DelimitedSplit8K(T., '&') AS S
        CROSS APPLY SSA_Master.dbo.DelimitedSplit8K(S.Item, ':') AS S2
        CROSS APPLY (SELECT REPLACE(STUFF(S2.Item, 1, 1, ''), '%20', ' ') AS ItemStart) AS S3
    WHERE S.Item LIKE '%a_brand_t_fq:%'
        AND S2.ItemNumber > 1;

    DROP TABLE #T2;

    You can get the code for that function at the end of his article, here:  http://www.sqlservercentral.com/articles/Tally+Table/72993/

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • thanks Steve. i opened "The New Splitter Functions.zip" and ran CREATE FUNCTION [dbo].[DelimitedSplit8K]  all the way to go   at end of script.  Now will try you're script.

    again appreciate your help

  • VegasL - Thursday, April 19, 2018 4:02 PM

    thanks Steve. i opened "The New Splitter Functions.zip" and ran CREATE FUNCTION [dbo].[DelimitedSplit8K]  all the way to go   at end of script.  Now will try you're script.

    again appreciate your help

    Glad I could help.   Let me know if there are any rows that aren't getting processed correctly...  and I'll see if I can fix...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply