I have an MPP PDW appliance with 4 Compute nodes, each having 8 distributions, for a total of 32 distributions.
I have a table that grows at a rate of 3 million rows per day.
My question: if I partition the table on Date, I believe REPLICATE would perform better than HASH distribution, because partitioning is applied at a higher level and distribution is applied within EACH partition. So it seems more advisable to replicate a 3-million-row daily "mini-table" than to hash-distribute it across the Compute nodes.
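To put rough numbers on the two options, here is a back-of-the-envelope sketch in Python for one day's load. It rests on two assumptions that are not measured on this appliance: HASH spreads rows perfectly evenly across all 32 distributions (no skew), and REPLICATE keeps one full copy of the data on each Compute node. The variable names are only illustrative.

```python
# Row-count arithmetic for one day's load (3 million rows).
# Assumptions: perfectly even hash spread (no skew) and
# one full replicated copy per Compute node.
COMPUTE_NODES = 4
DISTRIBUTIONS_PER_NODE = 8
ROWS_PER_DAY = 3_000_000

total_distributions = COMPUTE_NODES * DISTRIBUTIONS_PER_NODE

# HASH: each distribution holds an equal slice of the day's rows.
hash_rows_per_distribution = ROWS_PER_DAY // total_distributions
hash_rows_per_node = hash_rows_per_distribution * DISTRIBUTIONS_PER_NODE

# REPLICATE: every Compute node holds the whole day's rows.
replicate_rows_per_node = ROWS_PER_DAY
replicate_rows_stored = ROWS_PER_DAY * COMPUTE_NODES

print("total distributions:        ", total_distributions)
print("HASH rows per distribution: ", hash_rows_per_distribution)
print("HASH rows per node:         ", hash_rows_per_node)
print("REPLICATE rows per node:    ", replicate_rows_per_node)
print("REPLICATE rows stored total:", replicate_rows_stored)
```

Under those assumptions, a day's partition under HASH is about 93,750 rows per distribution, while under REPLICATE each node stores all 3 million rows, so 12 million rows are written per day across the appliance.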
Please share your thoughts.