Using the 'LIKE' comparison

  • I have an interesting problem that I hope someone can help me with. I need to produce a query that will show the results of a name search on my database.

    For example if the user types in the word 'Lopez' I need to list all people who's name is

    like '%lopez%'.

    This works fine until I encounter special characters in a name such as 'López'.

    I need to be able to return both names as part of the same query.

    I have users in different countries entering this information using different language settings hence the problem when trying to do a like comparison.

    This apllies also for other language specific characters.

    Any help will be greatly appreciated.

  • One solution would be to perform a REPLACE of your search variable before performing the search. This would allow you to replace 'o' with '[oó]'.

    set @strName = replace(@strName, 'o', '[oó]')

    Obviously you could add in other acented characters etc.

  • Soudex is anytime better options !! than replace as replace restricts you to all the similar pattern which may be entered from anyone across globe.

    regards

    sanjeev

    What ever a Human can imagine , it can be achieved !!


    What ever a Human can imagine , it can be achieved !!

  • I agree that Soundex functionality will return the all the various characters used in this 'Lopez' example. However, due to the way in which soundex works it is possible for it to return names similar to 'Lopez'.

    So it comes down to how precise do you want your match to be? (Yes I know soundex has 4 levels of match......(ONLY 4!!))

  • Example: These are the results I get back from one of my databases using SOUNDEX.

    select distinct Surname

    from tblEmployee

    where difference(Surname, 'Smith') = 4

    Results:

    Sainty

    Samani

    Samina Seikh

    Sammimi

    Sammut

    Sanani

    Sandhi

    Sandhu

    Sandow

    Sandy

    Sant

    Saundh

    Seaman

    Seamen

    Seemann

    Seemon

    Shanahan

    Shand

    Shando

    Shanhan

    Shannon

    Shant

    Shimmin

    Siannot

    Simeon

    Simmnet

    Simmonds

    Simmonite

    Simon

    Simonds

    Simonite

    Sinnot

    Sinnott

    Smit

    Smith

    Smith - Left

    Smith-Taylor

    Smith?

    Smooth

    Smoothy

    Smoth

    Smout

    Smyth

    Smythe

    Snead

    Sneyd

    Snoad

    Somani

    Someone

    Sonnet

    Soundy

    Summon

    Symmonds

    Symon

    Symonds

    I'm not saying that SOUNDEX is the wrong choice, but you have to be aware that even at the highest level of soundex match (i.e. 4), you will get back names which bear no resemblance to the name being searched.

  • Thanks for all of the replies.

    Does anyone know of any other method, Soundex appears to be a little too vague for the level of accuracy that I require.

  • DoubleMetaphone is a kind of Soundex, but more powerful.

    The original article describing the method can be found at:

    http://www.cuj.com/articles/2000/0006/0006d/0006d.htm?topic=articles

    A translation of this function in T-SQL can be found at:

    http://www.planet-source-code.com/vb/scripts/ShowCode.asp?txtCodeId=519&lngWId=5

    It is quite an elaborate function, but the nice thing is you can improve it yourself.

    I myself use this method in a VB-variant and works fine.

    Erik

  • Thanks evorsten, these links are very helpful. Just a side note, I have been looking into the collation settings of my SQL installation and notice that there is an option setting for 'ACCENT-INSENSITIVE/SENSITIVE'. Before I start looking into rebuilding all of my databases in order to change the collation settings, would having this option set to 'ACCENT-INSENSITIVE' maybe solve my problem? i.e. would 'o' and 'ó' now get treated the same? Any thoughts would be greatly appreciated.

  • By the way, when I use LIKE I get worse performance than using charindex('searchstring', column) > 0 - Has anyone else noticed this?

  • Scout7

    quote:


    Before I start looking into rebuilding all of my databases in order to change the collation settings, would having this option set to 'ACCENT-INSENSITIVE' maybe solve my problem? i.e. would 'o' and 'ó' now get treated the same? Any thoughts would be greatly appreciated.


    It would solve your problem. è = e = é and so on in a ACCENT-INSENSITIVE collation.

  • quote:


    By the way, when I use LIKE I get worse performance than using charindex('searchstring', column) > 0 - Has anyone else noticed this?


    This is because of LIKE's flexibility, it has to run it's "rule-set" for every row

    When you use CHARINDEX you are specifying that you only want to find characters within the field

    LIKE is easier to read for general programming, but you should always check speed with alternatives, it's only good programming

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply