Lookups SCD's Historical Data

  • Hi.

    I have an employee dimension with some recorded changes. That is, some employees have multiple entries regards name changes etc. I have an SCD that updates the Start/End date rather than just the TextStatus to 'CURRENT' or 'OLD' (By the way there is no option in SCD wizard to record both ways which is silly because I do want to do that). Any way, that is not the issue for this topic.

    Lets say that John smith got married on 3rd August 2006. Lets say that his John Smith entry has a surrogate key value of 26 which is populated with a EmployeeHistEndDate of 2nd August 2006. His new name is John Wickowski. He gets a new surrogate key which is 87 and starts on the 3rd August 2006. My implementation will record the change in SCD fine.

    However, when using a lookup to grab the Surrogate key of the employee when I feed the data into the downstream in order to load the fact table I am faced with an issue. I want the fact record that I am loading to refer to John Smith for transactions that occur before the 3rd August, so it needs to grab surr.key 26, not just 'Where EmployeeHistTextStatus = Current' which would grab key 87 instead.

    The issue is that I need some way of specifying in the lookup that the value that I am looking for is for that employee, but where the transaction date lies between the start and end historical date. Implementing this lookup that criteriorizes on a transaction date, seems to be awkward. I am unable to use the transaction date coming in downstream in order to grab the correct historical value.

    Can anyone point me in the right direction please???

    Regards,

    Nick.

  • If your database also contains a time dimension with a Date column, you can use a query like the following as the lookup table:

    SELECT

    E.ID, E.EMP_ID, P.DATE AS DATE

    FROM

    DIM_EMPLOYEE E

    JOIN

    DIM_TIME P ON P.DATE BETWEEN E.START_DATE AND COALESCE(E.END_DATE, GETDATE())

    where E.ID is the surrogate key en E.EMP_ID the business key. This only works when the START_DATE en END_DATE columns do not include a time fraction. You can use the EMP_ID and DATE column to lookup the surrogate key ID.

    Peter

Viewing 2 posts - 1 through 1 (of 1 total)

You must be logged in to reply to this topic. Login to reply