Full-Text Search – Thesaurus Languages

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 720952

  • demonfox

    SSCertifiable

    Points: 6289

    I got it right 🙂

    No reference required this time; I got the answer thanks to the last question Steve posted on thesaurus files 😀

    Thanks for the question!!!

    Edit :

    ~ demonfox
    ___________________________________________________________________
    Wondering what I would do next , when I am done with this one :ermm:

  • Vinay Kumar

    SSCertifiable

    Points: 6098

    +1 🙂

    Thanks
    Vinay Kumar
    -----------------------------------------------------------------
    Keep Learning - Keep Growing !!!

  • Lokesh Vij

    SSChampion

    Points: 10836

    demonfox (3/18/2013)


    I got it right 🙂

    No reference required this time; I got the answer thanks to the last question Steve posted on thesaurus files 😀

    Thanks for the question!!!

    Edit :

    +1

    I knew the right answer; not sure what made me mark the incorrect option :w00t:

    ~ Lokesh Vij


    Guidelines for quicker answers on T-SQL questions
    Guidelines for answers on Performance questions

    Link to my Blog Post --> www.SQLPathy.com

    Follow me @Twitter

  • Carlo Romagnano

    SSC-Insane

    Points: 22010

    Good QotD!

    🙂

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    Great question, Steve!!

    I had a strong hunch, but I wanted to know for sure so I tried to find a list of three-letter language codes. I couldn't - so I ended up just following my hunch to see if I was right and to see the documentation reference in the explanation.

    I turned out to be right, but the reference leaves me wanting more. It points to a list of language codes in the format xx-YY (four letters separated by a dash), not the three-letter format required for thesaurus files. It appears as if the three-letter format is always found by removing the dash and the last letter from the listed language code, but this is not described on that web page. And after following the link on that page to http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx (which is listed as being documentation for Windows Vista!), I see a table that suggests that this is not the case - but that also includes many languages that I believe not to be supported by SQL Server, so I'm not sure how relevant this is.

    Can anyone fill me in on the missing details?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/
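
The pattern guessed at in the post above (drop the dash and the last letter of the xx-YY code) can be written down as a small Python sketch. This is only that guess, not a documented rule: the function name `guess_thesaurus_file` is hypothetical, and the `ts` + code + `.xml` naming is an assumption generalized from the file names mentioned in this thread. The post itself notes a table that suggests exceptions, so any result should be checked against the files actually present in the thesaurus file location.

```python
def guess_thesaurus_file(culture_name: str) -> str:
    """Guess a thesaurus file name from an xx-YY culture name.

    Assumption (from the thread, not from documentation): the three-letter
    code is the culture name with the dash and the last letter removed,
    and the file is named "ts" + code + ".xml".
    """
    language, dash, region = culture_name.partition("-")
    if dash != "-" or len(language) != 2 or len(region) != 2:
        raise ValueError(f"expected a name like 'en-US', got {culture_name!r}")
    # Keep both language letters plus the first region letter, lowercased.
    code = (language + region[0]).lower()
    return f"ts{code}.xml"


print(guess_thesaurus_file("en-US"))  # tsenu.xml
print(guess_thesaurus_file("en-GB"))  # tseng.xml
```

The two calls reproduce the American and British English file names; other languages may not follow the pattern.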

  • Raghavendra Mudugal

    SSChampion

    Points: 10658

    Simply the best, thank you for posting, Steve...

    (I looked in the thesaurus file location on the hard drive and noticed two English files, eng and enu, but was not sure which one was US English. I thought this was probably what the question was getting at, with two English files and no obvious way to know which one to edit, and it was right. The previous QotD on the same subject was about using a stored procedure to load the file; I was thinking of that at first, but it was not among the answers.)

    ww; Raghu
    --
    The first and the hardest SQL statement I have wrote- "select * from customers" - and I was happy and felt smart.

  • (Bob Brown)

    SSCrazy

    Points: 2705

    Yay. Great question. Had to do a lot of research to get it right. Thanks.

  • TomThomson

    SSC Guru

    Points: 104773

    Good question, but the definite cultural bias is perhaps unfortunate. I suppose it's fair enough, as the default installation will use LCID 1033, not 2057. But there may be some Brits around for whom tseng.xml is the right file and they wouldn't stand much chance of spotting the right answer, would they?

    Tom

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    L' Eomot Inversé (3/19/2013)


    Good question, but the definite cultural bias is perhaps unfortunate. I suppose it's fair enough, as the default installation will use LCID 1033, not 2057. But there may be some Brits around for whom tseng.xml is the right file and they wouldn't stand much chance of spotting the right answer, would they?

    Even for Brits, the tseng.xml file is NOT the right choice when "working with an American English SQL Server instance" (quote from question text; emphasis added by me). I guess you overlooked that part of the question?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • demonfox

    SSCertifiable

    Points: 6289

    L' Eomot Inversé (3/19/2013)


    Good question, but the definite cultural bias is perhaps unfortunate. I suppose it's fair enough, as the default installation will use LCID 1033, not 2057. But there may be some Brits around for whom tseng.xml is the right file and they wouldn't stand much chance of spotting the right answer, would they?

    Totally out of context, but there is no British English; there is Australian English, there is American English... but only English when it comes to Britain :-P

    ~ demonfox
    ___________________________________________________________________
    Wondering what I would do next , when I am done with this one :ermm:

  • sestell1

    SSChampion

    Points: 10230

    Interesting question. In researching this, I was amazed how many posts I found stating that SQL Server needed to be restarted after changing a thesaurus file.

  • TomThomson

    SSC Guru

    Points: 104773

    Hugo Kornelis (3/19/2013)


    I turned out to be right, but the reference leaves me wanting more. It points to a list of language codes in the format xx-YY (four letters separated by a dash), not the three-letter format required for thesaurus files. It appears as if the three-letter format is always found by removing the dash and the last letter from the listed language code, but this is not described on that web page. And after following the link on that page to http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx (which is listed as being documentation for Windows Vista!), I see a table that suggests that this is not the case - but that also includes many languages that I believe not to be supported by SQL Server, so I'm not sure how relevant this is.

    Can anyone fill me in on the missing details?

    Well, the only way I know to get this is to start with the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSearch\Language or the equivalent on your machine (on my machine MSSQL10_50.MSSQLSERVER is the directory name for the instance of SQL Server within the MSSQL directory). Each subkey under that includes a mapping from template XML filename to locale ID - the subkey name is sometimes in the three-character format, sometimes in the 5-character format, but the values are always the filename using the 3-character format. This will provide all 48 (or 44, excluding duplicates) of the 33 (:w00t:) supported languages.

    Mappings from 33 locale IDs to language are given on the sys.languages BOL page. What the other 15 locale IDs are is probably documented somewhere else, but may be irrelevant.

    Tom

  • (Bob Brown)

    SSCrazy

    Points: 2705

    Hugo Kornelis (3/19/2013)


    Can anyone fill me in on the missing details?

    I don't know if this is any help but it is what I used to answer this question:

    http://msdn.microsoft.com/en-us/library/ms142491.aspx

    http://msdn.microsoft.com/en-us/library/ms142491(v=sql.100).aspx

  • Hugo Kornelis

    SSC Guru

    Points: 64685

    L' Eomot Inversé (3/19/2013)


    Hugo Kornelis (3/19/2013)


    I turned out to be right, but the reference leaves me wanting more. It points to a list of language codes in the format xx-YY (four letters separated by a dash), not the three-letter format required for thesaurus files. It appears as if the three-letter format is always found by removing the dash and the last letter from the listed language code, but this is not described on that web page. And after following the link on that page to http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx (which is listed as being documentation for Windows Vista!), I see a table that suggests that this is not the case - but that also includes many languages that I believe not to be supported by SQL Server, so I'm not sure how relevant this is.

    Can anyone fill me in on the missing details?

    Well, the only way I know to get this is to start with the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSearch\Language or the equivalent on your machine (on my machine MSSQL10_50.MSSQLSERVER is the directory name for the instance of SQL Server within the MSSQL directory). Each subkey under that includes a mapping from template XML filename to locale ID - the subkey name is sometimes in the three-character format, sometimes in the 5-character format, but the values are always the filename using the 3-character format. This will provide all 48 (or 44, excluding duplicates) of the 33 (:w00t:) supported languages.

    Mappings from 33 locale IDs to language are given on the sys.languages BOL page. What the other 15 locale IDs are is probably documented somewhere else, but may be irrelevant.

    Thanks, Tom!

    It's simply incredible that Microsoft makes it so hard to find the correct file to use for adding thesaurus entries for a language.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 1 through 15 (of 39 total)
