Hi Vincent
Nice overall concise article, however I don't agree with your Unknown record solution. The product dimension is not unknown, but rather only the atributes are unknown. But creating a force record of the product_id you don't lose the product_id information. Business can decide what they want as defaults for the product attributes So your data will now look like this:
Here is an example of a row in the source table with a product_id that does not exist in the dimension table:
order_id order_date product_id quantity price last_update
358 19/08/2006 BGCKZ 3 2.99 31/10/2006
"Force" the unknown product_id as an actual dimension record.
product_key product_id product_name description min_level valid_until load_time
231 BGCKZ Unknown Late Arriving 0 1/1/1900 2/10/2004
Now your lookups will still be accurate.
This is how fact_sales looks after that record is loaded:
fact_key date_key product_key order_id quantity price load_time
830937 2424 231 358 3 2.99 31/10/2006
This solves a problem for late arriving dimension records. Due to bad/unusual business process you may receive the product details for BGCKZ a few days after the fact record. It is just a matter of updating the existing dimension record with the correct details. This way business as well a foreign key integrity is preserved.
Regards,
Dudley