• I think you've missed the key architectural point - the 3 column xref tables dramatically reduce disk IO which is much more of a bottleneck than the cpu doing a hash function. The thing is to compare the width of the target table to the width of the xref table.

    Many data warehouse tables can be quite wide. Reading a 50 column wide target table just to get the hash value is many times slower than reading the 3 column wide xref table.

    I now understand your point in using xref table, thanks for the clarification.