I have a client who will send me an Excel spreadsheet and I have to load that to a table.
1) This excel spreadsheet will be sent only 2 or 3 times an year, not daily.
2) It contains under 5k records.
3) This table will be used as a Reference Lookup table by those programs which need it
4) This table needs to maintain history of changes, should there be any.
Below is the model I am going to design, please share your thoughts if anything can be added to make it more efficient.
1) Design it as SCD-2, with current-ness identifier column
2) Nominate best possible Excel column as Natural Key, and rest of the columns are used for Change Detection
3) Any NK that is present in the Excel but not in my table will be used for INSERTs
4) Any row in Excel with its NK that is ALREADY present in my table, will be recognised as an UPDATE, if any of the non-NK columns has changed in comparison with what is in my table
PK-FK Referential Integrity
5) Since the scope of this table is a mere Reference Lookup table, it is NOT functionally related to any other tables (other than for lookup purposes). So, Foreign Key relations can be ignored.
6) Since this is a hand-written Excel spreadsheet, there is high chance for data quality issues, am getting a consensus from source about the possibility of errors and NULLs/Blanks. Am going to use CHECK constraints to emphasize Data Quality while loading to table.
Please share your thoughts adding-to or correcting the above list.