Proper name for a book about all the data scientist tools:
"Blunt tools for dummies".
If you think about it, more than 90% of the operations they perform are a waste of time and resources.
They keep running exactly the same analysis operations on the same sets of data, except for a small portion of newly added data, which usually amounts to no more than a few percent of the whole.
A smart solution would analyse only the newly arrived data and merge the aggregates from the new batch with those collected previously.
But that would make the big data look not so big, and that must be the problem.
With all the petabytes gone - what do you have to brag about?
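The "smart solution" above is just incremental aggregation. A minimal sketch in Python (the class name and fields are illustrative, not from any library): keep running totals, fold in each new batch, and never touch the old data again.

```python
class RunningStats:
    """Incrementally maintained aggregates: count, sum and mean."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, batch):
        # O(len(batch)) work per batch; previously seen data is never rescanned.
        self.count += len(batch)
        self.total += sum(batch)

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


stats = RunningStats()
stats.update([1.0, 2.0, 3.0])  # yesterday's petabyte, already summarised
stats.update([4.0])            # today's few-percent increment
print(stats.count, stats.total, stats.mean)  # 4 10.0 2.5
```

Each update costs only as much as the new batch, while a full re-analysis costs as much as the entire history every single run.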