Full Text Search and words with symbols
Posted Saturday, February 8, 2014 3:30 PM
SSC Rookie

Hi,

Is there a way to make full-text search support search terms that contain symbols, such as C# or C++?

I see that even on this forum, if I search for C++, it returns everything that contains the letter c. So I guess it's not possible?


Regards
Post #1539533
Posted Monday, February 10, 2014 2:08 AM
Mr or Mrs. 500

Hi,

I used CONTAINSTABLE(myTable, MyColumn, 'C++').

It works fine with 'C++', 'C#', and so on.

See the documentation:
http://technet.microsoft.com/en-us/library/ms189760.aspx

Best regards
Post #1539658
Posted Monday, February 10, 2014 2:35 AM (answer marked as solution)
SSC Rookie

There are lots of posts suggesting you can't do this but it seems to work fine in 2012.

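For example:
USE IMPORT;
-- CREATE FULLTEXT CATALOG ft AS DEFAULT;  -- uncomment if the database has no default full-text catalog
IF OBJECT_ID('test', 'U') IS NOT NULL DROP TABLE test;  -- avoids an error on the first run
CREATE TABLE test (ID int NOT NULL, string varchar(max));
INSERT INTO test VALUES (1,'this is a bit of text with the term C# somewhere in it'),(2,'and this one uses C++ instead.'),(3,'whilst this just has a c in it somewhere');
CREATE UNIQUE INDEX ixID ON test (ID);
CREATE FULLTEXT INDEX ON test (string) KEY INDEX ixID;
-- The full-text index is populated in the background, so give it a moment before querying.

SELECT * FROM test WHERE CONTAINS(string,'C#');    -- full-text search
SELECT * FROM test WHERE CHARINDEX('C#',string)>0; -- plain substring search, for comparison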

Returns (one result set per query):

ID  string
1   this is a bit of text with the term C# somewhere in it

ID  string
1   this is a bit of text with the term C# somewhere in it


However (and it's a BIG however):
If you change the script above and replace every C# with, say, X#, and every C++ with X++, it no longer works. This suggests that the full-text word breaker has a built-in list of "words" with symbols that it recognises. If your search terms, like those in the original post, are on that list, you're fine; if not, you're no further forward. Someone with more full-text indexing experience can probably say whether, and how, you can control that word list.
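
One way to check this guess without building an index is to ask the word breaker directly. The sketch below uses sys.dm_fts_parser, which returns the tokens the full-text engine produces for a search string. It assumes the English (US) word breaker (LCID 1033), the system stoplist (stoplist_id 0), accent-insensitive parsing, and that you are a member of sysadmin, which this function requires. The "input" column alias is just a label added for readability.

```sql
-- Sketch: compare how the English word breaker tokenizes each term.
-- Arguments: query string, LCID (1033 = English US, assumed),
--            stoplist_id (0 = system stoplist), accent sensitivity (0 = off).
SELECT N'C#'  AS input, display_term, special_term
FROM sys.dm_fts_parser(N'"C#"', 1033, 0, 0)
UNION ALL
SELECT N'X#',  display_term, special_term
FROM sys.dm_fts_parser(N'"X#"', 1033, 0, 0)
UNION ALL
SELECT N'C++', display_term, special_term
FROM sys.dm_fts_parser(N'"C++"', 1033, 0, 0)
UNION ALL
SELECT N'X++', display_term, special_term
FROM sys.dm_fts_parser(N'"X++"', 1033, 0, 0);
```

If display_term keeps the symbol for one input and drops it for another, that shows which terms the word breaker treats as whole words on your instance. Results may differ between versions, so run it on the server you actually care about.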
Post #1539667
Posted Monday, February 10, 2014 3:40 PM
SSC Rookie

I just tried a query like this:

SELECT t.ID, t.string
FROM test AS t
INNER JOIN CONTAINSTABLE(test, string, 'C#') AS KEY_TBL
    ON t.ID = KEY_TBL.[KEY];

Interestingly, it returns the correct results on SQL Server 2012 but ignores the # on SQL Server 2008.
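
When comparing two servers like this, it may help to confirm which word breakers each one has registered and which language the indexed column uses. This is only a sketch of where to look, not an explanation of the difference. The first two statements are standard; the last one lists every full-text indexed column in the current database.

```sql
-- Sketch: inspect the full-text setup on each server being compared.
SELECT @@VERSION AS server_version;

-- Lists the registered word breakers, with their language and version.
EXEC sp_help_fulltext_system_components 'wordbreaker';

-- Shows which language (LCID) each full-text indexed column is using.
SELECT OBJECT_NAME(object_id)         AS table_name,
       COL_NAME(object_id, column_id) AS column_name,
       language_id
FROM sys.fulltext_index_columns;
```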


andyscott (2/10/2014) wrote:

However (and it's a BIG however):
If you change the script above and replace every C# with, say, X#, and every C++ with X++, it no longer works. This suggests that the full-text word breaker has a built-in list of "words" with symbols that it recognises. If your search terms, like those in the original post, are on that list, you're fine; if not, you're no further forward. Someone with more full-text indexing experience can probably say whether, and how, you can control that word list.


Maybe there is some internal English dictionary which contains all "real" words.

It looks like all words that I need are included in that list, so I am fine with it.

Thanks guys!
Post #1539971
Posted Monday, February 10, 2014 3:47 PM


Hall of Fame

Maybe this can help:
Creating Custom Dictionaries for special terms to be indexed 'as-is' in SQL Server 2008 Full-Text Indexes



Luis C.
Post #1539973
Posted Friday, March 28, 2014 6:04 AM
Forum Newbie

I suspect that on 2008 (R2) the # is treated as a wildcard for numeric characters, which might explain why C# is handled differently by 2008 and 2012. By default, SQL Server strips leading zeros from numbers when word breaking, so searching for 00123 would find documents containing 123. To get around this in 2008, you can use a custom dictionary that includes "0#" (without quotes); 00123 is then treated as a word rather than a number, and the word breaker does not strip the leading zeros. However, this behaviour has changed in 2012.
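
The leading-zero behaviour can be inspected the same way, by looking at what the word breaker emits for a number. This is a sketch that assumes the English (US) word breaker (LCID 1033) and no stoplist (NULL); sys.dm_fts_parser requires sysadmin membership. It shows only the default behaviour, not the effect of a custom dictionary.

```sql
-- Sketch: see which terms the word breaker produces for a zero-padded number.
-- Arguments: query string, LCID (1033 assumed), stoplist_id (NULL = none),
--            accent sensitivity (0 = off).
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'"00123"', 1033, NULL, 0);
```

Running it on both a 2008 and a 2012 instance should make any difference in number handling visible.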

Sorry, no real references to back this up; it comes mostly from trial and error. :-( It would be nice if this was all documented somewhere, but I haven't been able to find much at all.
Post #1555854