SQLServerCentral is supported by Redgate
 

Badly Trained AI

By Steve Jones

Most of us know that data is being used to make more and more decisions inside all kinds of organizations, from retail giants to banks to sports teams. We are constantly asked about, or see reports of, data-driven decisions. We often need to show some data that supports and explains the rationale for making some choice. As our populace becomes more data savvy, I expect this trend to continue.

AI (Artificial Intelligence), and the related Machine Learning (ML), are becoming more and more widely used. From mobile phones to autos to trading systems, we regularly see new "AI capabilities" being added to products and services. No business or industry seems immune, and I'm sure many of you are seeing AI being incorporated or feeling pressure to start using some AI in your work. As you work with AI, or start to, you'll quickly realize the importance of data in your efforts.

This is true for the cleanliness of data, but it is perhaps even more important in the tagging of data sets. As Amazon learned, building an AI or ML system is hard. They scrapped one system that was being used to rate resumes and help their recruiters sort through the volume of applications they received. Why? Because of bias.

Apparently the system would downgrade women's resumes for various reasons. To me, this is a perfect example of a principle I've held throughout my career: garbage in, garbage out. In this case it wasn't necessarily bad data that was the problem, but bad tagging of what was a good or bad resume, probably from the internal prejudices of a few people.

There will be more dangers as we use ML and AI technologies in our work. It won't be enough to clean the raw data for training; we also need to clean and properly manage the tagging of which data sets represent the results we are looking for. As in much of our software, it's easy for us to consider only the happy path, to tag only those items we think are good results. That is useful, but we might also be unconsciously tagging other results as bad, which appears to have happened to Amazon.
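As a minimal sketch of what examining the tagging might look like, the Python below compares how often each group of training rows was tagged as a good result. Everything here is an assumption for illustration: the function name, the `group` and `label` field names, and the sample rows are hypothetical and made up, and none of it describes how Amazon's system worked.

```python
from collections import defaultdict


def positive_label_rate_by_group(rows, group_key="group", label_key="label", positive="good"):
    """Return {group: share of that group's rows tagged with the positive label}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[label_key] == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Made-up tagged rows, purely illustrative (not real data).
sample = [
    {"group": "A", "label": "good"},
    {"group": "A", "label": "good"},
    {"group": "A", "label": "bad"},
    {"group": "A", "label": "good"},
    {"group": "B", "label": "good"},
    {"group": "B", "label": "bad"},
    {"group": "B", "label": "bad"},
    {"group": "B", "label": "bad"},
]

rates = positive_label_rate_by_group(sample)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} tagged good")
```

A large gap between groups does not prove the tagging is biased, but it is a cheap signal that the labels deserve a closer look before they are used for training.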

We can build systems that do a better, more rational job than most humans, but we need extraordinary care to ensure our training data lacks bias. Unfortunately, most people both think they're not biased and are unwilling to spend extra resources to deeply examine the data. Those are two things that worry me about the future of the AI/ML systems that will inform us.

 