Free text fields in dimensional model

  • Hello all,

    I am after some advice on the best way to model some data with free text fields. The following is simplified but generally I have a FactIncident table and then a dimension for this called DimPropertyType. There are effectively 3 fields that define the propertytype called Type1, Type2 and Type3 and these each contain one of a possible 20 values. Initially what I wanted to do was to simply have DimPropertype with the following fields:

    PropertyTypeKey

    Type1

    Type2

    Type3

    However looking at the data for each set of property type options there is an option titled 'Other' and then also an additional set of fields called - Type1OtherText, Type2OtherText and Type3OtherText. I have looked through the data and about 80% of each of those fields have been set to 'Other' with the respective free text set. Speaking to the business analysts they do some searches that use these fields as constraints so they need to be in there somewhere.

    Does anyone have any advice on the best way of dealing with this situation? Looking through the data this issue occurs in a number of different dimensions so I am going to have to deal with this a number of times.

    Thanks.

  • free text should not be stored in the DW, it is not a member, measure or key 🙂

    If they are using some 'feature' of the free text then that should be turned into a member, measure or key as part of the ETL process.

    If they really need the text then it should be stored as attributes on the FACT record, (but you still need the member, measure or key extracted) If there is a 1..n relationship then store the data in XML in a single field (But I would still push back against this really hard!)

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply