• osmereqork (7/24/2013)


    Thanks.

    Yes, it would be nice if the source data contained codes rather than text, but unfortunately that's not the case. I've decided to do basically what you said. I will receive the files in their native language and translate any of these fields back into English via lookups at the staging phase.

    If I have 20 such fields in some source data, is it practical to have 20 individual lookups within a dataflow? Is there a better approach?

    You may find that translating by lookup is somewhat fraught. For example, you may have a field called "temps" in one sort of record that needs to be "times" in English and another field called "temps" in a different sort of record that needs to be called "weather" in English. That suggests that the lookup has to handle the case where the English word to use depends on which sort of record you are looking at - it's easy enough to do, but does add some overhead. Of course, if you have only one record type you don't have this problem.
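To make the "temps" case concrete, here is a minimal sketch of a lookup keyed on both the record type and the source text. The record type names ("schedule", "forecast"), the table contents and the function name are all made up for illustration; in practice the mapping would presumably live in a staging lookup table rather than in code.

```python
# Hypothetical lookup table: (record type, source-language text) -> English text.
# The record type names are assumptions made for this example.
TRANSLATIONS = {
    ("schedule", "temps"): "times",
    ("forecast", "temps"): "weather",
}


def translate(record_type, source_text):
    """Return the English text for a value, taking the record type into account."""
    key = (record_type, source_text.strip().lower())
    # Fall back to the untranslated text so unmatched rows can be reviewed later.
    return TRANSLATIONS.get(key, source_text)


if __name__ == "__main__":
    print(translate("schedule", "temps"))
    print(translate("forecast", "temps"))
```

The only change from a plain lookup is the wider key, which is where the extra overhead comes from.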

    An alternative, which might perform better or might not, would be to translate to a language-independent field identifier which incorporates a record type identifier. This probably saves storage compared with using strings throughout. It's flexible, adding a language is trivial, and if a field identifier can be an integer it probably speeds up translation on output. Again, if you have only one record type this can be simplified by not incorporating that in the field identifier.
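A rough sketch of the identifier idea follows, assuming integer field identifiers built as record type id times 1000 plus a field number. That encoding, the language codes ("fr", "en"), the record type names and the function names are all assumptions for the example, not anything taken from the actual data.

```python
# Hypothetical record type ids; a field id is record_type_id * 1000 + field number.
RECORD_TYPE_IDS = {"schedule": 1, "forecast": 2}

# One row per (field id, language). Adding a language means adding rows here.
LABELS = {
    (1001, "fr"): "temps",
    (1001, "en"): "times",
    (2001, "fr"): "temps",
    (2001, "en"): "weather",
}

# Inverse map used on input: (language, record type id, text) -> field id.
TO_FIELD_ID = {
    (language, field_id // 1000, text): field_id
    for (field_id, language), text in LABELS.items()
}


def to_field_id(language, record_type, text):
    """Translate source text to a language-independent integer field id."""
    key = (language, RECORD_TYPE_IDS[record_type], text.strip().lower())
    return TO_FIELD_ID[key]


def label(field_id, language):
    """Translate a field id back to text in the requested language on output."""
    return LABELS[(field_id, language)]


if __name__ == "__main__":
    field_id = to_field_id("fr", "forecast", "temps")
    print(field_id, label(field_id, "en"))
```

Only integers need to be stored after staging, and the output step is a single keyed lookup per field. With one record type, the record type part of the key and of the identifier can be dropped.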

    Tom