Paradoxes of Big Data
Interesting paper: “Three Paradoxes of Big Data,” by Neil M. Richards and Jonathan H. King, Stanford Law Review Online, 2013.
Abstract: Big data is all the rage. Its proponents tout the use of sophisticated analytics to mine large data sets for insight as the solution to many of our society’s problems. These big data evangelists insist that data-driven decisionmaking can now give us better predictions in areas ranging from college admissions to dating to hiring to medicine to national security and crime prevention. But much of the rhetoric of big data contains no meaningful analysis of its potential perils, only the promise. We don’t deny that big data holds substantial potential for the future, and that large dataset analysis has important uses today. But we would like to sound a cautionary note and pause to consider big data’s potential more critically. In particular, we want to highlight three paradoxes in the current rhetoric about big data to help move us toward a more complete understanding of the big data picture. First, while big data pervasively collects all manner of private information, the operations of big data itself are almost entirely shrouded in legal and commercial secrecy. We call this the Transparency Paradox. Second, though big data evangelists talk in terms of miraculous outcomes, this rhetoric ignores the fact that big data seeks to identify at the expense of individual and collective identity. We call this the Identity Paradox. And third, the rhetoric of big data is characterized by its power to transform society, but big data has power effects of its own, which privilege large government and corporate entities at the expense of ordinary individuals. We call this the Power Paradox. Recognizing the paradoxes of big data, which show its perils alongside its potential, will help us to better understand this revolution. It may also allow us to craft solutions to produce a revolution that will be as good as its evangelists predict.
EDITED TO ADD (10/11): Here’s an HTML version of the paper.
Ben • September 26, 2013 7:58 AM
Dangers of big data. See IBM and the Holocaust:
Only after Jews were identified — a massive and complex task that Hitler wanted done immediately — could they be targeted for efficient asset confiscation, ghettoization, deportation, enslaved labor, and, ultimately, annihilation. It was a cross-tabulation and organizational challenge so monumental, it called for a computer. Of course, in the 1930s no computer existed.
But IBM’s Hollerith punch card technology did exist. Aided by the company’s custom-designed and constantly updated Hollerith systems, Hitler was able to automate his persecution of the Jews. Historians have always been amazed at the speed and accuracy with which the Nazis were able to identify and locate European Jewry. Until now, the pieces of this puzzle have never been fully assembled.
(…)
Edwin Black has now uncovered one of the last great mysteries of Germany’s war against the Jews — how did Hitler get the names?