April 13, 2007 at 8:44 am
I'm trying to find a way to count how many times a string 'xyz' occurs in a ntext column. The column is full-text indexed but that just gives whether the string is present or not. Also, I can't use CHARINDEX to evaluate the length as it doesn't work with this datatype. Any suggestions ?
We use this column to store MS Word documents in a SQL Server 7 database (alas ... no iFilter for Word before SQL 2000).
April 13, 2007 at 9:58 am
One way
SELECT COUNT(*)
FROM
a
INNER JOIN [Number] n
ON SUBSTRING(a.[ntextcolumn],n.N,3) = N'xyz'
AND n.N < (DATALENGTH(a.x) / 2)
providing number table contains numbers upto max len of column
not sure how efficient it would be on large data
Far away is close at hand in the images of elsewhere.
Anon.
April 13, 2007 at 10:05 am
I'm confused when you say Charindex can't be used on this datatype. The following script returns 15.
CREATE TABLE tblNText
(
ntxColumn ntext
)
DECLARE
@vchSearchString varchar(50)
,@bitIsDone bit
,@intPosition int
,@intOccurrences int
SET @vchSearchString = 'SQL'
INSERT INTO
tblNText
VALUES
(
'This SQL insert, which is a SQL command, into a SQL column in a SQL table is an SEO gaming SQL string. SQL is as SQL does, which is something that all SQL experts, which are people who know SQL well, and even SQL moderates, as well as SQL neophytes knows. SQL SQL SQL, I can''t seem to get enough SQL.'
)
SET @bitIsDone = 0
SET @intOccurrences = 0
SET @intPosition = 0
WHILE @bitIsDone = 0
BEGIN
SET @intPosition = (SELECT Charindex(@vchSearchString,ntxColumn,@intPosition + 1) FROM tblNText)
IF @intPosition > 0
SET @intOccurrences = @intOccurrences + 1
ELSE
SET @bitIsDone = 1
END
SELECT @intOccurrences
April 13, 2007 at 10:19 am
I'm confused when you say Charindex can't be used on this datatype |
Because CHARINDEX only works with short 'character data types' (char,varchar,nchar,nvarchar) and SQL will do an implicit CAST to nvarchar and truncate the data at 8000 bytes (4000 chars)
Far away is close at hand in the images of elsewhere.
Anon.
April 13, 2007 at 10:29 am
Ah, I wasn't aware that that conversion takes place (and that's one reason that I hate implicit conversions), and BOL doesn't note that fact in the CHARINDEX entry. I did do testing, and you are indeed correct. My count stops at 199 no matter how many times I repeat my sample string.
April 16, 2007 at 12:34 am
Thank you all for your help. I think I'm going to abandon the idea of implementing this new criteria in our search engine as it proves to be to slow (the search runs through 80'000 word documents which can contain up to 10 pages) and would, in conjunction with the other criteria, cause a timeout when the query runs.
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply