SQLServerCentral is supported by Red Gate Software Ltd.
Multiple Language Database Design - Additional Columns vs. Additional Tables...
Posted Thursday, December 6, 2007 3:36 AM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
Yes, I wouldn't have assumed that Asian character sets would be so easy. This may be another case for adding yet more "Value" columns, each with its own collation set to the appropriate Asian language. It is possible to have multiple collations in a table, but only one collation on the database. You have to modify your queries and procs to explicitly set the collation for those languages, because SQL won't do it for you in some cases.

For example:

ALTER TABLE dbo.ObjectGlobalization
ADD [Value_] NVARCHAR(MAX) COLLATE Chinese_PRC_CI_AS
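In queries against such a column you may also have to force the collation explicitly for comparisons and sorts. A sketch only - the key column ObjectID and the search parameter are assumptions, following the example above:

```sql
-- Force the Chinese collation for both the comparison and the sort
SELECT ObjectID, [Value_]
FROM dbo.ObjectGlobalization
WHERE [Value_] = @SearchText COLLATE Chinese_PRC_CI_AS
ORDER BY [Value_] COLLATE Chinese_PRC_CI_AS;
```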

I don't know, but as a developer and DBA I am not one to want to maintain 32 different sets of code for one application and 32 different databases. There should be an easier way and if you have experience in this, can you shed some more light?



Post #430111
Posted Thursday, December 6, 2007 5:33 AM
SSCrazy


Group: General Forum Members
Last Login: 2 days ago @ 4:18 AM
Points: 2,886, Visits: 3,256
I would strongly favour separate tables for each language. That way you do not have redundant columns for languages you do not use. You can also specify the relevant collation for each table, according to its language.
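As a minimal sketch of that layout (table, column, and collation names here are purely illustrative):

```sql
-- One text table per language, each with the collation that suits it
CREATE TABLE dbo.ProductText_en (
    ProductID   int NOT NULL PRIMARY KEY,
    Description nvarchar(4000) COLLATE Latin1_General_CI_AS
);
CREATE TABLE dbo.ProductText_zh (
    ProductID   int NOT NULL PRIMARY KEY,
    Description nvarchar(4000) COLLATE Chinese_PRC_CI_AS
);
```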

Original author: SQL Server FineBuild 1-click install and best practice configuration of SQL Server 2014, 2012, 2008 R2, 2008 and 2005. 18 October 2014: now over 31,000 downloads.
Disclaimer: All information provided is a personal opinion that may not match reality.
Concept: "Pizza Apartheid" - the discrimination that separates those who earn enough in one day to buy a pizza if they want one, from those who can not.
Post #430130
Posted Tuesday, July 29, 2008 5:54 PM
Forum Newbie


Group: General Forum Members
Last Login: Tuesday, April 7, 2009 4:20 PM
Points: 2, Visits: 12
SOME PROS AND CONS OF THE DIFFERENT OPTIONS

I can see 3 different approaches regarding the storage of content in different languages:

1) one 'master' table holding invariant content, and an auxiliary table holding in different rows the versions of data (structure: idItem, idLang, text);

2) one 'master' table for invariant content and auxiliary tables -one for each language- (structure: idItem, text);

3) only one table, with different columns for the different languages for each translated field (structure: idItem, idCategory, text-en_US, text-es_ES, ...)
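For concreteness, the three structures could be sketched like this (all names are illustrative; collations and sizes are simplified):

```sql
-- 1) one shared translations table for all languages
CREATE TABLE Translation (idItem int, idLang int, [text] nvarchar(4000),
                          PRIMARY KEY (idItem, idLang));

-- 2) one translation table per language (repeated L times)
CREATE TABLE Translation_en (idItem int PRIMARY KEY, [text] nvarchar(4000));

-- 3) one column per language on the table itself
CREATE TABLE Item (idItem int PRIMARY KEY, idCategory int,
                   text_en_US nvarchar(4000), text_es_ES nvarchar(4000));
```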

I don't consider the option of many 'master' tables with only ONE auxiliary table holding the different versions of ALL the tables (format: idItem, table_name, lang_id, text). I find it very similar to model 1 but with serious disadvantages: this will be a huge table with n*m*L rows, and it needs a mechanism to deal with tables using different quantities of columns as PKs.

The following analysis is for a system consisting of T tables with m translated fields and n rows, and L languages.

Fallback refers to a mechanism to present the text in an alternative language in case text in the original language is not available.

The pros (+) and cons (-) I can think of for each approach are:

1) master + translationS table
+ saves on empty translated values
+ language add: do nothing
- one charset, collation fits all
- slower (requires 1 JOIN on n*L rows, returns m+1 rows to be processed) - can this be optimised with stored procedures?
- adds T tables
- adds n*m*L rows in total
- fallback fair: all translations must be fetched (returns (m*L)+1 rows to be processed)

2) master + translation tableS
+ each table with its own appropriate charset and collation
+ saves on empty translated values
- slower (requires 1 JOIN on n*L rows, returns L+1 rows to be processed) - can this be optimised with stored procedures?
- adds T*L tables;
- language add: T new tables must be created
- adds n*m*L rows in total
- fallback very expensive: all translations must be fetched (requires L JOINs on n*L rows, returns (m*L)+1 rows to be processed)

3) one table, multiple columns for translated fields
+ each column with its own appropriate charset and collation (does the DB or table definition impose any limit?)
+ does not add any row
+ faster (requires 1 SELECT on n rows, returns 1 row to be processed)
+ fallback inexpensive: texts in every language are retrieved in the original query
- language add: m new columns in the T different tables
- adds T*m*L columns in total

It seems to me that each particular situation makes one of the alternatives the best (i.e. if most of the translated values are empty, saving that space is relevant; otherwise it's not). Anyway, it seems to me that option no. 3 - even if it looks like the most prosaic of them all - has several important advantages over the other ones. This option makes the administrator work harder (every time a language is added - hopefully not that often) but performs much faster on each user click (many times every hour). Or am I forgetting something?
Post #543138
Posted Wednesday, July 30, 2008 5:53 AM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
One thing I'd like to add to what [andarin2] said about option #3 is that there is all this talk of space savings. When you design a table, the sum of all the column widths is what the "potential" row size would be IF you filled in every column to its maximum value. However, with what we're talking about, having separate columns for each language would not waste more space than a 1-to-many translation table, because you're only ever filling in 1-3 columns (if you want fall-back translations, say from es_MX to es_ES). However, I do agree that this approach has more maintenance associated with it, and even though SQL supports large numbers of columns (how many cultures are there?), it's probably not a good idea.

Using the 1-to-many approach (one translation table) would work, but you can only set 1 collation on the "Value" column and it needs to be nvarchar.

Having many tables (one per culture) would help with collations; however, its maintenance is probably even more than the first option (multiple columns), and the JOINs would quickly become unmaintainable and complex as time goes on.
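To illustrate the point about JOINs: even a simple fallback chain (es-MX, then es-ES, then en-US) already needs three LEFT JOINs under the table-per-culture design. The table and column names below are hypothetical, just a sketch:

```sql
SELECT p.ProductID,
       COALESCE(mx.[Description], es.[Description], en.[Description]) AS [Description]
FROM dbo.Product p
LEFT JOIN dbo.ProductText_es_MX mx ON mx.ProductID = p.ProductID
LEFT JOIN dbo.ProductText_es_ES es ON es.ProductID = p.ProductID
LEFT JOIN dbo.ProductText_en_US en ON en.ProductID = p.ProductID;
```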



Post #543356
Posted Monday, November 10, 2008 6:42 AM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
One thing I'd like to add is that you can use the one "Value" column for all the Latin-based charsets. I'm pretty sure all English, Spanish, French, etc can use one Collation. Therefore you only ever need to add additional Culture-specific columns for the different Collations of say Asian or Russian characters.

Has anyone got anything better? I'd really like to know :)



Post #599823
Posted Monday, November 10, 2008 8:36 AM
Forum Newbie


Group: General Forum Members
Last Login: Tuesday, April 7, 2009 4:20 PM
Points: 2, Visits: 12
Unfortunately Spanish does not work with the same collation as English, for ñ is a letter by itself (appearing after n). Moreover, ch and ll were 'letters' by themselves until the 1994 reform, so it is also necessary to choose between traditional and modern sorting...
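SQL Server ships both variants, so the difference can be seen directly. A quick sketch: under Traditional_Spanish, 'ch' sorts as a single letter after all other c-words, so 'chico' lands after 'curso', which Modern_Spanish does not do.

```sql
SELECT w
FROM (VALUES (N'curso'), (N'chico'), (N'nido'), (N'ñame')) AS t(w)
ORDER BY w COLLATE Traditional_Spanish_CI_AS;
-- compare with: ORDER BY w COLLATE Modern_Spanish_CI_AS
```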
Post #599915
Posted Wednesday, December 3, 2008 4:21 PM
Forum Newbie


Group: General Forum Members
Last Login: Tuesday, February 1, 2011 3:06 PM
Points: 2, Visits: 25
I reckon the question should be: why can't we have SQL manage this for us?

Here would be my ideal scenario:

Example Schema. Assume we have a 'default culture' (let's say 'en-US') set at the Database Instance level.

CREATE TABLE Advertisement
(
AdvertisementID INT NOT NULL PRIMARY KEY,
Header NVARCHAR(500),
Body NVARCHAR(MAX)
)

When INSERTing into this table there are 2 ways to handle different cultures:
- insert to the default culture. No changes for this (i.e. INSERT INTO Advertisement (AdvertisementID, Header, Body) VALUES (1, 'American Header', 'American Body')).
- specify a culture string when inserting. i.e:
SET SESSIONCULTURE 'en-UK'
GO

INSERT INTO Advertisement (AdvertisementID, Header, Body) VALUES (1, 'British Header', 'British Body')

Then, when we select from this table we'd again, have two options
SELECT * FROM Advertisement WHERE AdvertisementID = 1 --returns default (en-US) value, or;
SELECT * FROM Advertisement WHERE AdvertisementID = 1 AND @@CultureString = 'en-UK' --returns UK. (or you could do this via a SET SESSION).

My thinking would be that you wouldn't be able to insert a culture-specific value (i.e. en-UK) until the default value has been inserted.

Anybody got any thoughts??
Cheers,
Post #613273
Posted Friday, December 5, 2008 6:09 AM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268

SET SESSIONCULTURE 'en-UK'
GO

SELECT * FROM Advertisement WHERE AdvertisementID = 1 AND @@CultureString = 'en-UK'


I'm not seeing these Functions in SQL anywhere...



Post #614441
Posted Friday, December 5, 2008 1:07 PM
Forum Newbie


Group: General Forum Members
Last Login: Tuesday, February 1, 2011 3:06 PM
Points: 2, Visits: 25
Yeah, I wrote them intending them as a wishlist - or, if similar functionality exists, someone might be able to point it out to me?
Regards,
Post #614922
Posted Sunday, December 7, 2008 12:19 AM
SSC Veteran


Group: General Forum Members
Last Login: Saturday, February 2, 2013 8:21 AM
Points: 283, Visits: 268
Well, yes, I also wish MS had just allowed us to specify a culture code. There are two functions, @@LANGID and @@LANGUAGE. These will show you what language is currently configured on your connection (or in Query Analyzer / Mgmt Studio).

Select @@LANGID, @@LANGUAGE

You can set the "Current Language" on your connection strings (see www.connectionstrings.com). However, this refers to the LangID defined in SQL Server's SysLanguages. So, you can't just specify a culture of "en-US", it has to be LangID "0". British for example is LangID "23".

So, let's assume you modify your connection strings based on the current culture of the client (keep in mind, SQL Server does NOT by any means have a complete list of cultures you can map to - there are currently only 33 on my server). Let's also assume you have a "mapping" table created that maps between SQL Server LANGIDs and culture codes.

CREATE TABLE dbo.Cultures (
ID [int] IDENTITY(1,1) Primary Key,
Name [varchar](50),
Code [varchar](5),
LanguageID [int]
)

ConnectionString = "Data Source=(local);Initial Catalog=MyDB;Integrated Security=SSPI;Current Language=23;"

Now, in your procs or queries, you can determine the culture from that language.

Declare @Culture VarChar(5)
Select @Culture = Code
FROM dbo.Cultures
Where LanguageID = @@LANGID

Then...

Select *
From dbo.SomeTable
Where ID = ?
And Culture = @Culture
------------------------------------------------------------

This is flaky at best, because there's no possible way to ensure that SQL has the LangID you want to use. A better approach would be to have a standard applied to all your PROCS that the last parameter is always a "@ClientContext xml" parameter which contains certain specifics about the client requesting data (Culture, TimeZoneOffset, anything else necessary). Then, use that parameter to determine the correct content.

Declare @ClientContext xml
SET @ClientContext = '<Context>
<Culture>en-US</Culture>
<TimeZoneOffset>240</TimeZoneOffset>
</Context>'

SELECT
T.ID,
T.Culture,
T.CreatedDate As CreatedDateUtc,
DateAdd(mi, @ClientContext.value('(//TimeZoneOffset)[1]', 'int'), T.CreatedDate) As CreatedDate
FROM dbo.SomeTable T
Where Culture = @ClientContext.value('(//Culture)[1]', 'varchar(5)')

(This is just off the top of my head, so it's just a general idea of how to query things)



Post #615237