Thanks Steve, this is an interesting read. One thing stands out: small data, strategically selected with the clutter and noise removed, may help analysts find patterns and draw conclusions. But the smaller sample might not be large enough to verify the assumptions and original thesis while the theoretical model is being identified. Only when the candidate theory built from the smaller sample is held up against the larger distribution can the theory be validated or dispelled. And only then have you proven, for the first time, that the smaller sample was in fact representative of the total population of the big data.
So we limit the test set to a collection of data that fits within the limitations of our tool set, which may or may not be a large enough representation of the data to reach the correct conclusions.
Or, in short: we bite off as much as we can chew and see what we can do with it. If it fits, we test it against the whole deal and see whether we guessed right. If we did, we move on to the next thing. If not, we reanalyze and build another theory.
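As a toy sketch of that loop (all names and numbers here are hypothetical, using only the standard library): draw a small sample from a large population, build a candidate "theory" from it (here just an estimate of the mean), then hold it up against the full population and either accept it or go back and reanalyze.

```python
import random
import statistics

random.seed(42)

# Hypothetical "big data": the full population, too big for our tool set
# in the real scenario (small enough here to compute the true answer).
population = [random.gauss(100, 15) for _ in range(100_000)]

# Step 1: bite off as much as we can chew -- a small, manageable sample.
sample = random.sample(population, 500)

# Step 2: build a candidate theory from the sample
# (here, an estimate of the population mean).
theory_mean = statistics.mean(sample)

# Step 3: test it against the whole deal.
actual_mean = statistics.mean(population)
error = abs(theory_mean - actual_mean)

# Step 4: accept, or reanalyze and build another theory.
TOLERANCE = 2.0  # arbitrary threshold chosen up front
if error < TOLERANCE:
    print(f"Sample held up: estimate {theory_mean:.2f} vs actual {actual_mean:.2f}")
else:
    print(f"Guessed wrong by {error:.2f} -- back to the drawing board")
```

In practice, of course, you rarely get to compute the full-population answer directly; the point of the sketch is just the shape of the loop, not the statistics.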
Not all gray hairs are Dinosaurs!