Believing the Data

  • Comments posted to this topic are about the item Believing the Data

  • There are so many things we have to ask ourselves with regard to autonomous systems.
    Which rules designed for human situations simply don't apply in autonomous ones?  Take vehicle stopping distance at a given speed: an element of it is the time it takes the human to react and begin to apply the brake.  A human will be thinking about the road, listening to the radio, wondering what's for tea tonight, and a thousand other things that extend that reaction time.  An autonomous vehicle will be thinking about the job of piloting the vehicle 100% of the time, and it is indefatigable.  It may not even have to detect and react to the vehicle in front; it may be told explicitly by the vehicle in front that braking is taking place.  It may have any number of sensors going far beyond those of a human.  (A rough worked example of the reaction-time difference appears at the end of this post.)
    The precision such technology makes possible might let road widths shrink to the point where cars travelling at 100mph in opposite directions need only 6" of clearance on either side, rather than the several feet human drivers would need, if we trusted humans to drive at 100mph on normal roads at all.

    In terms of the believability of data, I think there is a tendency to use data to prop up a preconceived view.  Hence the joke about using data the way a drunk uses a lamp post: for support rather than illumination.  As long as the data tells you what you expect, it goes unchallenged.  The instant it challenges what you need it to say, then "the data must be wrong".
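
    Here is the promised rough back-of-the-envelope sketch of the reaction-time point, in Python.  The 1.5 s distracted-human reaction time, 0.1 s machine reaction time, and 7 m/s² braking deceleration are illustrative assumptions, not measured figures:

        # Stopping distance = reaction distance + braking distance.
        # All figures are illustrative; real reaction times and decelerations vary widely.
        def stopping_distance_m(speed_mph, reaction_time_s, decel_ms2=7.0):
            speed_ms = speed_mph * 0.44704               # mph -> metres per second
            reaction = speed_ms * reaction_time_s        # distance covered before braking starts
            braking = speed_ms ** 2 / (2 * decel_ms2)    # v^2 / 2a under constant deceleration
            return reaction + braking

        # Assumed ~1.5 s for a distracted human, ~0.1 s for an always-attentive system.
        print(f"Human: {stopping_distance_m(70, 1.5):.0f} m")        # ~117 m
        print(f"Autonomous: {stopping_distance_m(70, 0.1):.0f} m")   # ~73 m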

  • Why would anyone believe log-file data from the industry that routinely faked emissions tests to get round anti-pollution laws?

  • Data does not lie, but it can be wrong. Machines, like humans, can come to the wrong conclusions based on the available data and variables.

  • In terms of the case mentioned, stop and think about the distances involved.  0.4 feet is about 4.8 inches.  4.8 inches...  Quite likely your cell phone's screen is longer than that, so it's no wonder the officer thought the vehicle was closer than the 10 ft limit.  Now, what I could see potentially coming from this would be several things:
    A)  A method for the owner/operator of an autonomous vehicle (AV) to pull up the "black box" logging from the time in question, showing an officer they are correct or incorrect (think of showing an officer dashcam video when you were involved in or witnessed an accident).
    B)  A way for an officer to remotely pull up the logs of a vehicle's sensors (and this one is scary from a privacy point of view).

    On the main thrust of the editorial, perhaps the biggest challenge I can see will be ensuring the security and integrity of both the data and the methods used to collect it; a minimal tamper-evidence sketch follows at the end of this post.  I can easily see several potential points along the data flow where changes could be made.  The other point to consider when believing or not believing the data is, as you pointed out, the context.  ML systems are only as good as the data that is fed into them.  How many times has a disagreement occurred between people because one side didn't bring up a point that "everyone knows"?  No matter how "smart" ML systems get, they'll only ever be as good as the data fed into them (Garbage In / Garbage Out) and how well they were initially "trained" by the developers (whose own biases *will* creep into the system).

    Perhaps the best that anyone can expect to achieve with such systems will be "trust but verify..."
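
    One way to make that kind of log tamper-evident is a simple hash chain, where each entry commits to everything before it.  This is only a minimal sketch of the mechanics; the field names and readings are hypothetical:

        import hashlib, json

        # Minimal tamper-evident log: each entry carries the hash of the previous
        # entry, so altering any record breaks every hash that follows it.
        def append_entry(log, record):
            prev_hash = log[-1]["hash"] if log else "0" * 64
            payload = json.dumps(record, sort_keys=True)
            entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
            log.append({"record": record, "prev": prev_hash, "hash": entry_hash})

        def verify_chain(log):
            prev_hash = "0" * 64
            for entry in log:
                payload = json.dumps(entry["record"], sort_keys=True)
                expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
                if entry["prev"] != prev_hash or entry["hash"] != expected:
                    return False
                prev_hash = entry["hash"]
            return True

        log = []
        append_entry(log, {"t": "2018-04-13T05:56:00", "pedestrian_dist_ft": 10.4})
        append_entry(log, {"t": "2018-04-13T05:56:01", "pedestrian_dist_ft": 10.6})
        print(verify_chain(log))                        # True
        log[0]["record"]["pedestrian_dist_ft"] = 9.0    # tamper with the first record
        print(verify_chain(log))                        # False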

  • jasona.work - Friday, April 13, 2018 5:56 AM

    In terms of the case mentioned, stop and think about the distances involved.  0.4 feet is about 4.8 inches.  4.8 inches...  Quite likely your cell phone's screen is longer than that, so it's no wonder the officer thought the vehicle was closer than the 10 ft limit.

    Also, 10 feet to where? Knee, elbow, big toe on the forward-stepping foot? If the law says a 10-foot minimum, why shouldn't the automated system use a 12 or 15 ft safety margin? Certainly most good human drivers would. There will certainly be plenty of times when the automated system misinterprets the surrounding situation. Is that a malfunction, if the software fails to operate as intended? That would essentially be a manufacturing failure (just like a malfunctioning airbag). I suspect a lot of the liability is going to go back to the manufacturer/coder.
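
    A trivial sketch of that padded-margin idea, with invented numbers: pick a threshold that covers the legal minimum, worst-case sensor error, and a deliberate buffer, so a reading like 10.4 ft is never argued down to the wire:

        # Sketch: enforce the legal minimum plus a margin that covers sensor error.
        LEGAL_MIN_FT = 10.0
        SENSOR_ERROR_FT = 0.5    # assumed worst-case ranging error
        POLICY_MARGIN_FT = 2.0   # extra buffer a cautious human driver might leave

        def required_clearance_ft():
            return LEGAL_MIN_FT + SENSOR_ERROR_FT + POLICY_MARGIN_FT

        def too_close(measured_ft):
            # Anything inside the padded threshold counts as a violation.
            return measured_ft < required_clearance_ft()

        print(required_clearance_ft())   # 12.5
        print(too_close(10.4))           # True: legal, but inside the padded threshold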

    ...

    -- FORTRAN manual for Xerox Computers --

  • Good evidence should be corroborated, whether it comes from individuals (notoriously unreliable) or from autonomous systems. If those systems are under the control of parties with different motives and they still agree, all the better. Set systems up to corroborate one another and to discourage manipulation; data collection systems set up by individuals are prone to the biases of the individuals who set them up.

    Hence double-blind testing.

    I'd still argue that data is a better evidential starting point than no data.
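
    The corroboration idea can be as simple as this sketch: compare readings from independently operated sources and flag disagreement instead of silently trusting either one. The sources and tolerance here are hypothetical:

        # Corroborate a reading across independently operated sources.
        def corroborate(reading_a, reading_b, tolerance):
            return {"agree": abs(reading_a - reading_b) <= tolerance,
                    "a": reading_a, "b": reading_b}

        # e.g. the vehicle's own log vs. a roadside camera's estimate (hypothetical)
        result = corroborate(10.4, 9.1, tolerance=0.5)
        if not result["agree"]:
            print("Sources disagree - investigate before believing either:", result)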

  • For good reason, I'm sure the police officer was watching that driver-less car closely, and I can picture him following it for some time, waiting for it to make a mistake. At this stage in development, I think all driver-less cars should be treated as if they were naive 16-year-olds with fresh licenses.

    Do driver-less cars see all pedestrians as an equal risk? For example, if a driver-less car were to pass a group of kids on bicycles, is it smart enough to anticipate the higher probability that this particular demographic of pedestrian will drift or suddenly dart into the street, and will it thus make a minor adjustment to speed and distance accordingly? A conscientious human driver performs this kind of subjective risk mitigation dozens of times on their daily commute.
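
    In code, that kind of subjective adjustment might reduce to weighting detected road users by how unpredictably they tend to move. The classes and weights below are invented purely for illustration; real perception stacks are far more involved:

        # Weight detected road users by assumed unpredictability, then slow down.
        RISK_WEIGHTS = {"adult_pedestrian": 1.0, "cyclist": 1.5, "child_on_bicycle": 2.5}

        def adjusted_speed_mph(base_speed_mph, detected_classes):
            worst = max((RISK_WEIGHTS.get(c, 1.0) for c in detected_classes), default=1.0)
            return base_speed_mph / worst

        print(adjusted_speed_mph(30, ["adult_pedestrian"]))    # 30.0
        print(adjusted_speed_mph(30, ["child_on_bicycle"]))    # 12.0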

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Friday, April 13, 2018 7:22 AM

    A conscientious human driver performs this kind of subjective risk mitigation dozens of times on their daily commute.

    Agreed. When passing close to pedestrians, there is a great deal of value in being able to make eye contact and confirm that the pedestrian isn't looking somewhere else. Not to mention cases where a traffic officer or road worker is attempting to direct traffic.

    ...

    -- FORTRAN manual for Xerox Computers --

  • I am a firm believer in using data, facts, and science to prove things, but if it's digital, it can be hacked.  So there definitely needs to be a lot of security to protect the data and to ensure that what comes from, say, a driver-less car has not been tampered with.

    It would not surprise me one bit if a company knowingly falsified data to win a court case, or if someone hacked a device and tampered with the data to make a company or individual look guilty.
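
    One standard building block against silent edits is to sign each record with a keyed hash (HMAC). A minimal sketch of the mechanics follows; the key and record format are placeholders, and key management and who gets to verify are the genuinely hard parts:

        import hmac, hashlib

        SECRET_KEY = b"not-a-real-key"   # placeholder; real keys need real management

        def sign(record: bytes) -> str:
            return hmac.new(SECRET_KEY, record, hashlib.sha256).hexdigest()

        def is_untampered(record: bytes, signature: str) -> bool:
            return hmac.compare_digest(sign(record), signature)

        record = b"2018-04-13T05:56:00 pedestrian_dist_ft=10.4"
        sig = sign(record)
        print(is_untampered(record, sig))                            # True
        print(is_untampered(record.replace(b"10.4", b"9.0"), sig))   # False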

  • I think there's a fundamental difference between believing the data and understanding what the data means and how to act on it.  Per the example in the article, the exact distance the car was from the person is irrelevant; what's relevant is that the officer thought it was dangerously close.

  • jasona.work - Friday, April 13, 2018 5:56 AM

    Perhaps the best that anyone can expect to achieve with such systems will be "trust but verify..."

    And be transparent with the verification.

  • I'll give it to you in two words: Captain Sullenberger.
    If only the computer log data had been used, and not eyes, instinct, and observation, not to mention the actual results (data) of those direct observations, his life (and the lives of many other pilots) would be very different today.

    Believe log data all you want, but bring a healthy skepticism to things that just don't seem right. We have become slaves to these machines and their outputs. I've worked with these things for over 30 years, and it's only gotten worse, and it will continue to do so.


  • "In any case, we need to learn to trust and believe in the data, even if our eyes and instincts lead us to different conclusion."

    Well, Steve, I have to come down entirely on the other side of this argument.  I've been retired for a number of years now, but my 42 years in IT as a developer, analyst, DBA, and department manager taught me differently.  Setting aside purposeful manipulation of data, there are always going to be things like bugs, incorrect use of programming language features, and the like, over and above differing interpretations of our data.

    As I've related here before, in my last position as a DBA and SQL developer, I had proved to my shop that a custom reporting system was producing invalid data, and had developed the needed fixes, but they refused to implement the corrections. 

    Even though we make every effort to accurately collect, manipulate, and present data, there is still always the process of creating the data in the first place.  And that will always involve judgment, error, and outright bias (see Facebook).  As President Reagan said, "Trust, but verify."  In other words, be a healthy skeptic.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • The old saying "If you make something idiot-proof, then only idiots will use it" comes to mind when I think of where we're going with data.  Another is what Granny used to say about certain professions: "Figures can be made to lie, and liars figure." 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
