Video: How to make computers less biased
Michael > March 18th, 2022, 12:12 AM
The Economist uploaded this video to their YouTube channel on February 10 (2022).
Watch How to make computers less biased | The Economist from YouTube
This is a topic with which I have some familiarity for both professional and personal reasons. However, some of the story's revelations were new to me.
The video implies (near the beginning) that computer algorithms are "biased" - that is, that they tend to favor certain outcomes - in this case, outcomes skewed along racial lines.
The debate and discussion about algorithmic bias is quite old, but in my experience it was never cast in racial terms, and the video leaves that context ambiguous. As the story unfolds, it leads to the conclusion that the data used to train the machine learning algorithms was the source of the bias.
So, that's probably true in some cases. But machine learning bias can be much more subtle than that. The machine learning "community" consists of several million people across the world who have professional and aspirational interests in this sub-discipline of computer science. It's not a simple, easily defined topic by any means.
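To make the data-as-source-of-bias point concrete, here is a toy sketch (all data and group names invented by me, not taken from the video): the training code below is completely neutral, yet because the hypothetical historical decisions are skewed against group "B", the learned model simply reproduces that skew.

```python
# Toy sketch of skewed training data producing a biased model.
# The data, groups, and outcomes here are invented for illustration.
from collections import Counter

def train(examples):
    """Learn, per group, the most common historical outcome."""
    by_group = {}
    for group, outcome in examples:
        by_group.setdefault(group, Counter())[outcome] += 1
    return {g: c.most_common(1)[0][0] for g, c in by_group.items()}

# Hypothetical historical decisions: group B was mostly denied.
history = ([("A", "approve")] * 90 + [("A", "deny")] * 10
         + [("B", "approve")] * 20 + [("B", "deny")] * 80)

model = train(history)
print(model)  # {'A': 'approve', 'B': 'deny'}
```

Nothing in `train` mentions race or any group property at all - the bias arrives entirely through the examples it is fed, which is exactly the failure mode the video describes.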
Algorithmic bias is the subject of many research papers. Some researchers have claimed to find bias that exists in the code (not necessarily racial bias, but given the right data it could perform that way).
There are probably multiple types, perhaps multiple levels, of bias. For example, the people who write the algorithms (and they comprise a relatively small percentage of the several millions of people who are active in machine learning) think in certain terms. They approach problem solving in certain ways. As a computer scientist, I know that we are taught to write algorithms according to a consistent set of rules - that is, so that the algorithms can be rigidly evaluated according to the rules of logic.
Computer science requires a fair amount of math, especially at the more advanced levels. I had to take calculus, linear algebra, and set theory to earn my degree. You write a lot of proofs, and learning to prove an algorithm is logically correct (that it will perform as desired or expected) is a painful process. It looks so easy when the professor writes "Given A, and knowing that b and c are similar, and E is the set of ..." - you think, "All I have to do is follow these steps and I've got a proof." And so students' papers come back with a lot of red marks, or maybe just a big red "NO! It Fails Here!"
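For readers who never took that class, here is a toy illustration of mine (not from the video or any course) of the kind of obligation those proofs impose: to show a loop is correct, you state an invariant that holds on every iteration and then argue what it implies when the loop exits.

```python
# Toy example of a loop invariant, the workhorse of correctness proofs.
# The assert restates the invariant so the claim is checked at runtime.
def sum_upto(n):
    """Return 0 + 1 + ... + n."""
    total, i = 0, 0
    while i <= n:
        # Invariant: total == 0 + 1 + ... + (i-1) == i*(i-1)//2
        assert total == i * (i - 1) // 2
        total += i
        i += 1
    # On exit, i == n + 1, so the invariant gives total == n*(n+1)//2.
    return total

print(sum_upto(10))  # 55
```

Writing the assertion is easy; proving on paper that it holds after every iteration, for every possible `n`, is the part that comes back covered in red ink.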
Formal logic is a pain to learn.
But when you get out into the real world of computing, where you have deadlines and join teams with other people you've never worked with before, you lose track of that formal logic. People don't invest much time in writing those formal proofs. In fact, we tend to depend on tools written by other programmers to help us find the flaws in our logic. Or we just run the code and fix it until we get the results we desire.
The problem is that all this code tends to get very complicated, and it can produce results you don't desire simply because you never thought about those results. That's where "it's a feature, not a bug" becomes a classic programmer joke. The unforeseen consequences of running code on data you didn't know could exist can look bizarre - in that case, at least, there's no doubt it's a bug. But sometimes the output looks very familiar, so close to what you expect or want that it doesn't look like a bug right away.
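Here is an invented example of that "looks almost right" failure: suppose some upstream system uses -1 as a sentinel for "unknown age" (a convention I'm making up for illustration). A naive average silently includes the sentinels and produces a number that is wrong but entirely plausible.

```python
# A subtle bug: sentinel values (-1 meaning "unknown", an invented
# convention) silently drag the average down to a plausible wrong answer.
def mean_age(ages):
    return sum(ages) / len(ages)

def mean_age_fixed(ages):
    # Exclude sentinel/unknown values before averaging.
    known = [a for a in ages if a >= 0]
    return sum(known) / len(known)

ages = [34, 41, -1, 29, -1, 52]   # -1 means "age unknown"
print(mean_age(ages))             # ~25.67 - looks like a valid age
print(mean_age_fixed(ages))       # 39.0 - the actual mean of known ages
```

The buggy version never crashes and never prints anything bizarre, which is exactly why this class of defect can sit in production for years.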
Software could be in the field for days, weeks, months, or even years before these subtle bugs are discovered. And I know this because I've had to fix other people's code years after it was written. I've also reported bugs to software vendors that other people missed. This happens all the time. People are constantly finding bugs that slipped past development specs, quality control, alpha and beta testing, and even several versions of real-world releases.
Machine learning algorithms are no different from other types of software in that respect. And that brings me back to the Economist video. Yes, they sort of conclude that the data selection is most likely the problem. But what the video journalists missed (or maybe misunderstood) is that there is a very real chance these algorithms have certain biases that can't be fixed by fixing the data.
And that should give everyone pause regardless of how much they understand or know about computer science and machine learning.
So why did I put this into the Politics forum? Well, it just seemed like the right place for a video that discusses racial bias in computer algorithms. It may not be a perfect fit, but that's my bias revealing itself.