NULL values in source table don't add up

  • In my source tables, one is the fact table (call it General) and also contains Foreign Keys to the other tables. E.G. The General table will contain a column currency_code, which will link it to the Currency table. However, some rows in the General table may contain NULL values for the currency_code. The problem is, when trying to create a Currency Dimension for my cube, it does not include the NULL row values into the aggregation of the cube, causing data to be invalid (for example, the net_profit values in that NULL value line is not included when summing up the data).

    If i create a Currency dimension, only using the currency code from the General table, everything obviously adds up, but if i create a Currency Dimension using the Currency Table (to get the currency name), the NULL's are not included and the total values of all the NULL rows are left out, causing the value to be out by the summ of all rows where currency_code = NULL.

    Any suggestions or solutions, please.

    Thanks,

    Ryan

  • Hi Ryan,

    Can you add a catch-all level in your dimension table with a value of something like "N/A"? Then you could default all nulls to this level's key value...just a thought...

    hth,

    Michael

    Michael Weiss


    Michael Weiss

  • Thanks for your time and help, Michael, but, excuse me if i sound a bit dumb here, how do you add a catch-all level? Never heard of or about it before.

    Thanks,

    Ryan

  • You could add a 'Not Defined' value to your currency table and update all 'NULL' values in the fact table to point to this one.

    I guess this is about the thing Michael was indicating.

  • Well let's say that your dimension table has the following fields:

    currency_key, currency_type

    With data such as 1, US Dollar

    2, Euro

    Could you not have a data member that would look like 3, N/A? If you default all of your currency dimension foreign key values to 3 where they are null in your fact table, your data should aggregate correctly and you can always filter out those records where currency_type = "N/A".

    hth,

    Michael

    Michael Weiss


    Michael Weiss

  • As a slight twist on Michaels suggestion, we always try to use negative numbers (e.g -1 = n/a , -2 = unknown etc) for these types of placeholder/catch-all values, this way you can see them straight away in the fact table when viewing records without joining to the dimension table. A bigger advantage is that any SQL/DTS you use to load the fact records can know the alternate value to insert instead of NULL *without* having to do a look-up on the dimensional table on the description (ie [some select sql here] ISNULL(mycol1, -1), this is really only an advantage if you use identity fields for your surrogate key.

    HTH,

    steve

    Steve.

  • Thanks for all the help. I originally suggested replacing the null with a key, but management said that it was not the ideal situation as they wanted those specific fields to actually contain nothing (as there are actually about 5 or 6 foreign key references with null values in different rows), but it looks like they will have to make do with it.

    Thanks again, I appreciate it.

    Ryan.

  • Ryan,

    In my opinion there is no reason at all to not replace NULLs in your fact table by references to a 'N/A' record.

    I can see your management's point if you're talking about a table that is used in an OLTP environment, but in an OLAP environment NULL values in the dimension fields just make no sense.

  • Maybe I'm being overly simplistic here or this has already been tried, but why not something like this:

    SELECT SUM(CurrencyColumn) AS DollarSum

    WHERE CurrencyColumn NOT NULL

    That way, the NULL columns don't throw off your math.

    What that what you were talking about?

  • The fact table has many columns, including currency_code as well as, for example gross_profit. One row may have a NULL value for currency_code, but there will be a value for the gross_profit. If I try sum(gross_profit) where currency_code NOT NULL AND date = 'JUNE', for example, as per your suggestion, eljeffo, the value returned for gross_profit in JUNE is not the true value because it is not including those gross_profit_values where curerncy_code is NULL.

    Anyway, I have just suggested that we make all NULLS a key value, but management still believes that there is a better way...

    Oh well.

    Thanks for all the help and suggestion's.

  • I've been looking into this a bit closer.

    You might be able to solve this using a view as FACT table and/or as the Dimension table for currencies.

    For the FACT table you replace the NULL values with a dummy FK to the Dimension table :

    ISNULL(Currency_code, -1)

    .

    The Dimension table you can use the UNION to add the -1 currency code.

    Hope you can solve it this way...

  • I'm not quite sure that I fully understand you, NPEETERS.

    Could you please expand a bit more on your suggestion.

    I appreciate your time and effort.

    Thanks.

  • First, construct a view to substitute all NULL values in the fact table by a value. I am using '-1', since I guess this will not be present in the Currency table yet

    
    
    CREATE VIEW FactView AS
    SELECT IsNull(currency_code, -1),
    <All other required fields>
    FROM General

    Now, create a second view on the Currency table to include the newly created '-1' currency.

    
    
    CREATE VIEW DimensionView AS
    SELECT currency_code, currency_name
    FROM Currency
    UNION
    SELECT -1, 'Null'

    Now, build your cube using the FactView as your 'fact table' and the DimensionView as your Currency Dimension source.

    You will have an additional member of the dimension that is called 'Null' and that points to all the records in the fact table that don't have a currency code.

  • Thank you NPeeters. I tried out your suggestion and it all seems to work exactly as i was wanting, i appreciate it.

    Thanks again to all of you for your help.

    Ryan.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply