Can Data Save the World?

  • Comments posted to this topic are about the item Can Data Save the World?

  • When people choose experience over data, is it possible that their experience is telling them that there are gaps in the data; that there is not enough data?
    Is there some subliminal acknowledgement of Simpson's Paradox?

  • It's hard, yes, but people also underestimate how hard it is to formulate the questions that drive the discovery.

  • Great Article. Reminds me of this bit from The Hitchhiker's Guide to the Galaxy.

    “... Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws. The ... people ... did try to build spaceships that were powered by bad news but they ... were so unwelcome whenever they arrived anywhere that there really wasn’t much point in being there.”

    Kidding aside though, I agree. The patterns, the ones we haven’t discovered yet, or at least haven’t verified, are where it starts getting exciting.

    Thanks for calling this out.

  • Thinking you can save the world with data is a young man's fallacy. Data won't save the world because people are fundamentally irrational. And the real problems are complex with multiple moral and ethical landmines. Might as well try to control the weather.

    If you jump down the rabbit hole and really try to figure out the fundamental issues of the world, you will get mad, frustrated, and depressed. You can't handle the truth without changing. Be prepared for a journey into the darker side of human nature and weakness.

  • I read the initial Freakonomics book, and found it interesting, but also felt there was a great deal of selective 'hindsight' in the evaluation of different scenarios.

    Data can be seductive, especially as you note, to young people with little experience. It's become almost an article of faith that they have been taught that data is all that is needed.

    But conclusions based on data are only accurate when 1) the limitations of the data are well understood and 2) external current and future factors modifying the significance of data items are well understood (this is often unobtainable). (Remember 'New Coke'.)

    Tiny social, political, or physical perturbations have a habit of rippling through the world and can rapidly invalidate large swaths of data. Oftentimes people with a sharply tuned sense of what's happening in the national or social climate see tiny shifts that will have a big effect later.

    ...

    -- FORTRAN manual for Xerox Computers --

  • cplommer - Tuesday, December 5, 2017 5:20 AM

    Great Article. Reminds me of this bit from The Hitchhiker's Guide to the Galaxy. “... Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws. The ... people ... did try to build spaceships that were powered by bad news but they ... were so unwelcome whenever they arrived anywhere that there really wasn’t much point in being there.” Kidding aside though, I agree. The patterns, the ones we haven’t discovered yet, or at least haven’t verified, are where it starts getting exciting. Thanks for calling this out.

    Here's another one that seems pertinent:
    "... The [Mid-Galactic] Census report, like most such surveys, had cost an awful lot of money and didn't tell anybody anything they didn't already know- except that every single person in the Galaxy had 2.4 legs and owned a hyena. Since this was clearly not true the whole thing had eventually to be scrapped."

  • IMHO, there is always a story with the data:  it's an inescapable function of data analysis.  And if you don't tell a story, the listener or reader provides their own which will reflect the preconceived notions they bring to the data.

    To the author's point about the difficulty in teaching someone to analyze data and tell a story, there is a perfectly good way to do it.  It's not necessarily harder than teaching someone how to program, but it can take a lot more time, usually four years or so....

  • Very interesting. I’m going to have to find this podcast. In the OP you said that younger people tend to be more open to the idea of trusting the data whereas old dogs sometimes prefer their experience, even if the data exposes flaws in their opinion. I’m reading Hacks by Donna Brazile where she explains some friction she had as the short-lived head of the DNC versus Robby Mook and company running the Clinton campaign. She describes this exact scenario. Robby and his cadre were the young data-driven mavens, whizzing through data, modeling voters, making calculated decisions about who they needed to win the election, who they could ignore, who they could offend. Donna, on the other hand, was this old dog who’d been around the block many times, and was trusting her instincts, her experience, her allies, etc. In the end it does seem that (whatever else is true about the 2016 election) Robby and company missed some vital information in the data. It got me wondering just how reliable “good data” is and how far we still have to go before we can truly rely confidently on data like this.

  • Scoan

    You make a very good point. Data is potentially useful, but it's not everything. 

    It's common for the young (I actually was there once!) to think they've got a whole new way of seeing things (as if they're the first generation who thought that way!), and the only 'problem' is the older guys are too set in their ways. That's how a lot of mistakes get made.

    Many years ago, I was reading an article about the involvement of scientists in early nuclear weapon policy, looking at Truman, Eisenhower etc. The article was written from the point of view of the scientists, and discussed how much input the scientists had in the different administrations. What struck me about the article, by the time I finished, was the assumption that since scientists understood the physics of the weaponry, they should be the ones who determined policy. What the author failed to fully understand was that policy was not about physics, but about international gamesmanship. Being a scientist does not qualify you for that field.

    Mathematicians can talk for hours about the statistics of poker, but it doesn't make them good poker players.

    ...

    -- FORTRAN manual for Xerox Computers --

  • I guess when we refer to the story we're talking about narrative. Experience is a form of data or empirical evidence collected first hand by the individual and shouldn't be discounted. For example, a mayor running for re-election can point to charts and graphs illustrating how the city's south-side has been steadily improving economically for the past decade, but if the audience actually live on the south-side in a bad neighborhood or happen to drive through such a neighborhood while commuting to work, they'll have a hard time being persuaded. Even if the data does make some valid points, the narrative itself is not necessarily coherent or true in a meaningful way to the audience. It's interesting how economists or political activists can spin widely divergent or even opposing narratives using the exact same underlying data. It's also interesting how two people having the same experience can arrive at completely different conclusions.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • It will be harder for data to save the world when people in power keep saying it's fake news!



    Alvin Ramard
    Memphis PASS Chapter

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help

  • I shouldn't even get started on the value of good data and the responsibility that we as practitioners have to make sure we produce and analyze it accurately.  It is easy to expect other sources of data to be accurate while letting our own slide.  For instance, we expect our financial system to provide us with accurate data, but we still (should) reconcile theirs to ours.  I guess it is easier to be concerned about others than it is to be concerned about our own. 

    Probably the most blatant misuse of information is the presentation of 'information' not necessarily based on fact in the world of advertising.  This has become an accepted practice in our society.  But there is little difference between that and allowing our systems to produce inaccuracies on which decisions can be made.  We may not always be directly responsible for the interpretation, but I think we do have a responsibility to make the basis true and accurate. 

    Another small example is my experience in a certain business where I maintained systems recording and reporting inventory.  It was impossible to maintain accurate records because owners made a practice of removing stock for personal use and consumption without producing the proper records.  Now obviously the inventory was theirs to use as they saw fit, but their practices made it impossible for me to fulfill my function of maintaining their data accuracy.

    Later on, in another business it was the express directive that we allow data to be entered into the systems regardless of validation errors that might have been detected and corrected.  The assumption was that it could be altered as needed later on in the process, when technical drawings were produced with obviously strange results.  This makes one concerned that if there are obvious flaws there most likely are those not so obvious.  Logically this can lead to concerns regarding legal responsibility for inaccuracies.

    Further, if you are consuming medications that supposedly have been thoroughly tested and documented as safe and effective, I'm pretty sure you expect that the test results have been accurately collected, stored, reported, and analyzed.

    In a nutshell, there are always legal, moral, and practical reasons for all of us making sure our data is valid, even if we have no control over the use and interpretation of same.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • I think it's much easier to teach someone a programming language (it's mostly specific rules) than critical thinking skills (analysis/insight). Data is pretty useless without good analysis. Data scientists are like economists in that they are only as good as the models they build. Perhaps a better comparison is real scientists at CERN poring over ludicrous amounts of data looking for patterns that describe new particles and trying to separate the noise from the music. I believe analytical skills are far more important to an IT professional than any language skill you will ever know. Languages are merely tools.

  • jay-h - Tuesday, December 5, 2017 7:37 AM

    Mathematicians can talk for hours about the statistics of poker, but it doesn't make them good poker players.

    Score.
