Database Design For Multiple Langauges - More Columns vs. More Tables

  • We are getting ready to create a new site for at least 3 langauges (English, French, Spanish), with an undefined amount to be determined later. We have tossed around creating a column for each of the languages in each table.

    For example:

    ID, Text_Eng, Text_Fr, Text_sp,

    We also looked at creating a sperate table to hold the translations. What has worked for everyone in the past.

    OT: We also are looking at using a combination of XML and XSLT (for the various height/width requirements in the more verbose langauges) to handle the static web pages as far as the translation is concerned. Any thoughts?

    Anton

  • Hi Anton,

    while the approach with additional fields looks more simple, it actually isn't: first, you have unstable table structure, that needs to be changed each time you add a language. Second, your SQL statements will have to be composed dynamically, since column name will depend on the language.

    Having in mind your explanation "with an indefined amount to be determined later", I vote for the approach with separate table with translations. The usual implementation is to assign a code for each text that is to be translated, create one base table with default message texts, one base table with languages, and one translation table (link between texts and languages). This works well enough if your site has the same structure in each language. If the structure depends on the language, than you may need additional customization.

    HTH,

    Goce.

  • Goce,

    Are you saying something like this:

    table 1

    1  textHere

    2  something

    table 2

    5  Spanish version of "textHere"

    6  Spanish version of "something"

    table 3 (connection table)

    1  5

    2  6

    If so, then where/how do you tell the app that you want Spanish this time?

    Also, what's the best strategy for globally changing the language of all the labels throughout the app?

    Sam

  • Hi Sam,

    I ment something like this (a standard approach):

    table Languages:
    ID Name
    1  English
    2  Spanish
    ...
    
    table Texts:
    ID Default_Text
    1  textHere
    2  something
    ...
    
    table Translations:
    Language Text_ID Translation
    1         1       English version of "textHere"
    1         2       English version of "something"
    2         1       Spanish version of "textHere"
    2         2       Spanish version of "something"
    ...
    

    We are discussung a database structure here, and it is used for storage only. The user, through the application, selects the language that will be used, and the application retrieves the correct portion of Translation table from the database. Each label that should be localized must reference a text code, instead of containing the actual text.

    Regards,

    Goce.

  • G'day all,

    Goce's approach has a few advantages not pointed out above.  disclaimer: I am prejudiced in favor of this approach as I have used it successfully in multiple products.

    Ease of loading translations: If you send your strings to an outside firm for translation, then loading the new strings into a single table is much less error prone than loading to multiple tables spread across you system.

    Locking during translation loads: Using a single table for text strings and translated strings greatly reduces the amount of locking during translation loads.  Only a single table need be locked rather than multiple tables across the system.  In addition, much of the system accesses the string table in read-only mode, thus enabling the administrator to do string loads during normal business hours, with due caution.

    Query and traversal performance: This may seem like a small matter, but with tables containing several hunder million rows small performance improvements add up fast.  Moving the default string, and translated strings to a separate table leaves mostly numeric data or single-character data in the remaining tables.  Selects from the non-string tables will be measureably faster than if the strings are spread across the DB.  NOTE: I did not say perceptibly - only measureably.

    There are more advantages that I will add as they occur to me.

    Disadvantages anyone?

    Have a nice day

    Wayne

  • I know this is an old thread but this is a subject which I will soon have to get into. 

    I actually don't like either solution.  The multi-column table is horrible as far as maintenance is concerned. Data reads are equally disastrous for the reasons mentioned in an earlie post (dynamic sql).  The only advantage I see to it is the ability to have single reference foreign keys on the table, which is the disadvantage of the second.  With the second solution, you have multiple rows which are actually the same answer.  Something in english and something in spanish both refer to the same (some?) thing.  This causes other problems relating to data health.

    I am thinking about using XML as the value/description column.  The xml would define the language "<en>something</en><pt>algo</pt>" and then using a parser to return the appropriate language.  If the language is passed in as a sproc parameter the appropriate text is very easily returned and all of the languages have a common key which means db integrity is maintained.  Another nice advantage to this, is that schema changes aren't to heavy.  Many of the columns could be change from nvarchar to text in order to handle larger data, and then simply running the columns through an update which executes: set value = '<en>'+value+'</en>' would do the trick since all we have is english currently.  A function could do the parsing and then each stored procedure could have references to the column replaced with a function call such as ' select pkey_id, dbo.parsexml('en', my_multilanguage_column) from my_multilanguage_table'

    The problem is that in sql2k, the xml parser is external to the engine and sucks up resources, however, this is not going to be the issue with 2K5 and since the engine is designed for xml, this solution works right into it.

    Any ideas on possible success with this approach?

     

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply