Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Trying to iteratively use PATINDEX commands to comb through a large amount of text Expand / Collapse
Author
Message
Posted Friday, December 17, 2010 1:24 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Friday, December 24, 2010 3:58 PM
Points: 59, Visits: 155
Here's the scenario.

I have a 10000 record table where each row contains a large amount of text. I want to do a PATINDEX search to determine where a specific word occurs (let's say "dog" as an example). Since the patindex command only finds the first instance, I want to iterate using substrings to keep finding additional hits.

Right now this is the method I know how to do it (there may be a few little details missing, its mainly to show my thought process):

1) Create a field that contains my current patindex position.
2) Run the patindex command, update the table with the current patindex position for each row.
3) Store the patindex position in another table (or wherever, not really important for this exercise).
4) Create a dynamic sql query to do a substring on the text, starting at the current patindex position +1.
5) Run another patindex command on this new substring. Update the current patindex position with the new position.
6) Rinse and Repeat.


This method will work, but I was wondering if anyone had a cleaner and better way to do this (perhaps that didn't involve dynamic sql).
Post #1036728
Posted Sunday, December 19, 2010 1:24 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, October 15, 2014 4:57 AM
Points: 141, Visits: 857
First things that pop in to my head:
1. CTE to identify occurrences of the desired string part
2. Table valued function using the CTE and then using CROSS APPLY

Perhaps others have better ideas?
Post #1036951
Posted Tuesday, December 21, 2010 11:28 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, August 25, 2014 12:52 PM
Points: 22, Visits: 249
I'm dreaming up a solution using tally tables... What is your desired output? Number of occurrences, or indexes of occurrences?
Post #1038078
Posted Wednesday, December 22, 2010 12:02 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, August 25, 2014 12:52 PM
Points: 22, Visits: 249
Ok here we go...
You'll need to build a tally table. Jeff Moden has an excellent article about tally tables here.

After you have a tally table, create a test table with test data: (or just plug in your own table I suppose)

select top 10000 identity(int,1,1) as id, Dog
into DogTable
from
(
select 'dogcatdogdogbirdape' as Dog
union all
select 'fishbirdnotadogoopsthatwasadog'
union all
select 'caninecanislupisdomesticus'
)x
cross join
(select sc1.name from master.dbo.syscolumns sc1
cross join
master.dbo.syscolumns sc2
) y


Now for the solution:
select id, count(id)
from
(
select d.id, d.dog, substring(d.dog, case when t.N > len(d.dog) then len(d.dog) else t.n end, 3) as test, n as foundindex
from dogtable d
cross join
tally t
where t.n < (select max(len(dog)) from dogtable)
and substring(d.dog, case when t.N > len(d.dog) then len(d.dog) else t.n end, 3) = 'dog'
)x
group by id

Run the inner view if you just want a list of indexes of the tested string.
This could be easily turned into a table valued function, just read the length of your input string into the last parameter of substring
Post #1038085
Posted Monday, August 5, 2013 12:38 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:26 PM
Points: 8, Visits: 277
Sorry for necro'ing an old thread, but I'm trying to do something similar. The difference is that I'm looking for multiple strings (matching a pattern) of varying lengths. I've tried to modify the "dog" query, but am falling short of accomplishing what I want to accomplish. Can anyone help shed light?

FYI, I'm trying to search through stored procs and other objects for instances of linked servers, which follow a standard of "LS-%".

select object_name(s.id) Object, substring(s.text, n, charindex(' ', text, n) - n) as Target
from syscomments s
cross join
tally t
where t.n < (select max(len(text)) from syscomments)
and lower(substring(s.text, case when t.N > len(s.text) then len(s.text) else t.n end, case when t.N > len(s.text) then 5 else len(s.text) - n end)) like 'ls-__'
order by object, target

Post #1481044
Posted Tuesday, August 13, 2013 3:13 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 4:18 PM
Points: 2,192, Visits: 3,299
That code's written for SQL 2000; I suggest completely ignoring it.

Instead, the code below should give you a list of object names and their full text reasonably quickly.

If you want to add in the part about showing the specific text that matches, let me know. Just be aware that'll be slower -- perhaps much slower?!


IF OBJECT_ID('tempdb..#search_strings') IS NOT NULL
DROP TABLE #search_strings
CREATE TABLE #search_strings (
search_id int IDENTITY(1, 1) NOT NULL,
search_string nvarchar(40) NOT NULL
)
INSERT INTO #search_strings
SELECT 'linked_server_name1'
--UNION ALL SELECT 'linked_server_name2'
--UNION ALL ...

SELECT
OBJECT_NAME(object_id) AS Object_Name,
search_string,
(SELECT definition FROM sys.sql_modules WHERE object_id = derived.object_id) AS Object_Text
FROM (
SELECT DISTINCT
sm.object_id,
ss.search_string
FROM sys.sql_modules sm
CROSS JOIN #search_strings AS ss
WHERE
CAST(OBJECTPROPERTYEX(sm.object_id, 'IsProcedure') AS int) = 1 AND
sm.definition LIKE '%' + ss.search_string + '%'
) AS derived
ORDER BY
Object_Name




SQL DBA,SQL Server MVP('07, '08, '09)

Carl Sagan said: "There is no such thing as a dumb question." Sagan obviously never watched a congressional hearing!
Post #1483994
Posted Wednesday, August 14, 2013 11:56 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:26 PM
Points: 8, Visits: 277
Thanks for the code! It works great if I know what I'm searching for, but unfortunately to be useful in my current situation it would have to return the actual match. So I could search for "ls-" and it would return "ls-app1" and "ls-app2", etc.
Post #1484431
Posted Wednesday, August 14, 2013 12:56 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 4:18 PM
Points: 2,192, Visits: 3,299
The code uses LIKE to compare the string, so just construct your string accordingly.

For example, you could use 'ls-' as the search string. Or, if you want a closer match, then:

ls-app[0-9]

or something similar.

I'll add code to see the actual part(s) of the text that match if you want.


SQL DBA,SQL Server MVP('07, '08, '09)

Carl Sagan said: "There is no such thing as a dumb question." Sagan obviously never watched a congressional hearing!
Post #1484458
Posted Wednesday, August 14, 2013 9:38 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 9:03 PM
Points: 35,347, Visits: 31,884
ScottPletcher (8/13/2013)
That code's written for SQL 2000; I suggest completely ignoring it.


I wouldn't ignore it because it was written for SQL Server 2000... I'd ignore it because it solves the wrong problem for the latest request.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1484577
Posted Wednesday, August 21, 2013 11:20 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, September 3, 2014 1:26 PM
Points: 8, Visits: 277
ScottPletcher (8/14/2013)
The code uses LIKE to compare the string, so just construct your string accordingly.
<snip>
I'll add code to see the actual part(s) of the text that match if you want.</snip>


Yes, please!
I don't even know where to begin for that...
Post #1486819
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse