Full Text Search and words with symbols


Boris Pazin
Hi,

Is there a way to make full-text search support search terms that contain symbols, such as C# or C++?

Even on this forum, a search for C++ returns everything that contains the letter c. So I guess it's not possible?


Regards
Arno Ho
Hi,

I used CONTAINSTABLE(myTable, MyColumn, 'C++').

It works fine with 'C++', 'C#', and so on.

See the documentation:
http://technet.microsoft.com/en-us/library/ms189760.aspx

Best regards
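
For illustration, here is a minimal sketch of how the CONTAINSTABLE call above might be joined back to its base table. The table myTable and column MyColumn come from the post; the key column ID and the dbo schema are assumptions made for this example only. Wrapping each term in double quotes keeps the symbols inside the search term.

```sql
-- Assumptions: dbo.myTable has a full-text index on MyColumn,
-- and ID is the column behind the full-text KEY INDEX.
SELECT t.ID, t.MyColumn, ft.[RANK]
FROM dbo.myTable AS t
INNER JOIN CONTAINSTABLE(dbo.myTable, MyColumn, N'"C++" OR "C#"') AS ft
    ON t.ID = ft.[KEY]          -- [KEY] holds the full-text key value of each match
ORDER BY ft.[RANK] DESC;        -- most relevant rows first
```

Whether the symbols are actually honoured depends on the word breaker in use, so this is worth testing on the target server version.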
andyscott
There are lots of posts suggesting you can't do this, but it seems to work fine in SQL Server 2012.

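E.g.:
USE IMPORT;
-- CREATE FULLTEXT CATALOG ft AS DEFAULT;  -- run once if the database has no default catalog
IF OBJECT_ID('test', 'U') IS NOT NULL DROP TABLE test;  -- avoid an error on the first run
CREATE TABLE test (ID int NOT NULL, string varchar(max));
INSERT INTO test VALUES
    (1, 'this is a bit of text with the term C# somewhere in it'),
    (2, 'and this one uses C++ instead.'),
    (3, 'whilst this just has a c in it somewhere');
CREATE UNIQUE INDEX ixID ON test (ID);  -- a full-text index needs a unique, single-column, non-nullable key index
CREATE FULLTEXT INDEX ON test (string) KEY INDEX ixID;
-- The full-text index is populated asynchronously; give it a moment before querying.

SELECT * FROM test WHERE CONTAINS(string, 'C#');       -- full-text search
SELECT * FROM test WHERE CHARINDEX('C#', string) > 0;  -- plain substring search, for comparison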

Returns:

ID string
1 this is a bit of text with the term C# somewhere in it
ID string
1 this is a bit of text with the term C# somewhere in it


However (BIG however)...
If you change the above script and replace every C# with, say, X# and every C++ with X++, you'll find that it no longer works. What this means is that Microsoft must have a list of "words" that full-text indexing is able to pick up on: if your search terms, as in the OP, are on that list then you're fine, but if they are not then you're no further forward. Someone with more experience of full-text search can probably tell you if and how you might control the "word list".
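
One way to see what the word breaker does with these terms, rather than inferring it from query results, is the sys.dm_fts_parser function (available from SQL Server 2008 onwards; it requires sysadmin membership). The sketch below assumes US English (LCID 1033), the system stoplist, and accent-insensitive parsing; adjust these to match your index.

```sql
-- Arguments: search string, LCID (1033 = US English, assumed),
-- stoplist id (0 = system stoplist), accent sensitivity (0 = insensitive).
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(N'"C# X# C++ X++"', 1033, 0, 0);
```

Each returned row is one token as the full-text engine sees it, so you can compare how C# and X# come out on your own server.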
Boris Pazin
I just tried a query like this:

SELECT ID, string
FROM test
INNER JOIN CONTAINSTABLE(test, string, 'C#') AS KEY_TBL
    ON test.ID = KEY_TBL.[KEY];

It is very interesting that it returns correct results on SQL Server 2012, but ignores # on SQL Server 2008.


andyscott (2/10/2014)

However (BIG however)...
If you change the above script and replace every C# with, say, X# and every C++ with X++, you'll find that it no longer works. What this means is that Microsoft must have a list of "words" that full-text indexing is able to pick up on: if your search terms, as in the OP, are on that list then you're fine, but if they are not then you're no further forward. Someone with more experience of full-text search can probably tell you if and how you might control the "word list".


Maybe there is some internal English dictionary which contains all "real" words.

It looks like all words that I need are included in that list, so I am fine with it.

Thanks guys!
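
To compare the 2008 and 2012 servers, it may help to check which word breaker each one has registered for the language of the indexed column. This is only a sketch: LCID 1033 (US English) is an assumption, and the output will differ between installations.

```sql
-- Server version, to confirm which instance you are connected to.
SELECT SERVERPROPERTY('ProductVersion') AS product_version;

-- Language each full-text indexed column was created with.
SELECT OBJECT_NAME(object_id)         AS table_name,
       COL_NAME(object_id, column_id) AS column_name,
       language_id
FROM sys.fulltext_index_columns;

-- Word breaker registered for LCID 1033 (assumed US English).
EXEC sp_help_fulltext_system_components 'wordbreaker', 1033;
```

Running this on both servers shows whether they use different word breaker components or versions for the same language.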
Luis Cazares
Maybe this can help:
Creating Custom Dictionaries for special terms to be indexed 'as-is' in SQL Server 2008 Full-Text Indexes
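
If you do add a custom dictionary as that article describes, the word breakers presumably have to be reloaded and the index repopulated before the new terms take effect. Below is a minimal sketch of that follow-up, assuming the test table used earlier in this thread. The dictionary file itself (its name, location and encoding) is not covered in this thread, so follow the article for that part.

```sql
-- Assumption: the custom dictionary file is already in place, as per the article.
-- Restart the filter daemon hosts so that word breakers are reloaded.
EXEC sp_fulltext_service 'restart_all_fdhosts';

-- Repopulate the index so that existing rows are tokenized again.
ALTER FULLTEXT INDEX ON test START FULL POPULATION;

-- Population is asynchronous: a status of 0 means the index is idle again.
SELECT OBJECTPROPERTYEX(OBJECT_ID('test'), 'TableFulltextPopulateStatus') AS populate_status;
```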


Phil Haselden
I suspect that on SQL Server 2008 (R2) the # is treated as a wildcard for numeric characters, which might explain why C# is handled differently by 2008 and 2012. By default, SQL Server strips leading zeros off numbers when word breaking, so searching for 00123 would find documents containing 123. To get around this in 2008 you can use a custom dictionary that includes "0#" (without quotes); 00123 would then be considered a word rather than a number, and the word breaker would not strip the leading zeros. However, this behaviour has changed in 2012.

Sorry, I have no real references to back this up; it's mostly from trial and error. :-( It would be nice if this was all documented somewhere, but I haven't been able to find much at all.
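
The leading-zero behaviour can be checked directly rather than by trial and error against an index. A small sketch, again assuming US English (LCID 1033) and the system stoplist; sys.dm_fts_parser requires sysadmin membership.

```sql
-- Shows the tokens the word breaker emits for a number with leading zeros.
-- 1033 = US English LCID (assumed); 0 = system stoplist; 0 = accent-insensitive.
SELECT occurrence, special_term, display_term
FROM sys.dm_fts_parser(N'"00123"', 1033, 0, 0);
```

Comparing the display_term values on a 2008 and a 2012 instance should show whether the two versions treat the number differently.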