Mathematical models—algorithms—increasingly fuel the decisions and judgments that affect our lives: whether or not we get approved for a home loan, how our job performance is evaluated, where we go to school, and how our communities are policed. We like to think that math is neutral and these models unbiased—certainly fairer than humans, with their opinions and prejudices.
But as data scientist Cathy O’Neil shows in her revealing book, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, the reality is far more complicated. “The math-powered applications powering the data economy were based on choices made by fallible human beings,” O’Neil writes of what she observed in 2010, as big data was increasingly asserting itself in human affairs. “Some of these choices were no doubt made with the best intentions. Nevertheless, many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives.”
In her book, O’Neil breaks down these complex issues, helping us understand how algorithms rule our lives and what we, along with the data scientists responsible for building these models, can do to make them fairer. At a book launch event held at the Ford Foundation, O’Neil explored some of these ideas in conversation with MSNBC contributor Dorian Warren. Below are highlights from her presentation and their discussion.
Math isn’t pure.
Most of us tend to think of math and data as inherently honest—objective and true. Cathy O’Neil used to think that, too. But soon after she left academia for the world of finance, O’Neil recognized that “mathematical lies” were a reality. “I started realizing that mathematics was actually being used as a shield for corrupt practices,” she says. Take the triple-A ratings that credit rating agencies handed out to mortgage-backed securities in the lead-up to the 2008 financial crisis: People trusted them because they trust math. But it turned out those ratings didn’t reflect reality.
Machine-learning algorithms don’t know everything.
Algorithms look for patterns. When scanning job applications, for example, an algorithm evaluates them based on characteristics that have been successful in the past. If successful candidates have historically been white and male, the algorithm will show a preference for those applications and will be prone to filtering out women and people of color. The model doesn’t ask why; it just follows a pattern. And so when we use this kind of algorithm, O’Neil explains, “the best we can do is repeat the past.”
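To make the point concrete, here is a minimal sketch in Python of a screening rule that learns only from past outcomes. Everything in it is hypothetical: the toy records, the group labels "A" and "B", and the function names are invented for illustration and do not describe any real hiring system. Real models rarely use group membership directly and more often pick it up through correlated features, but the effect is similar.

```python
from collections import defaultdict

# Hypothetical historical records: (group, was_hired). Toy data, not real.
HISTORY = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

def learn_hire_rates(history):
    """Return the past hire rate for each group seen in the history."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for group, was_hired in history:
        seen[group] += 1
        hired[group] += int(was_hired)
    return {group: hired[group] / seen[group] for group in seen}

def screen(applicant_group, rates, threshold=0.5):
    """Pass an applicant only if their group's past hire rate meets the threshold."""
    return rates.get(applicant_group, 0.0) >= threshold

rates = learn_hire_rates(HISTORY)
for group in ("A", "B"):
    print(group, rates[group], screen(group, rates))
```

The rule never asks why group B was hired less often in the past. It simply carries that pattern forward, so every new applicant from group B is filtered out regardless of individual merit.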
What’s efficient for political campaigns is inefficient for democracy.
Political campaigns have built detailed profiles on every voter, which they use to decide what messages to put in front of us. This works well for the campaigns, but is ultimately bad for voters and democracy as a whole. In an ideal world, candidates would clearly lay out their entire platforms and voters would be able to use that information to make informed decisions. Micro-targeting cuts down on complexity, which can be an advantage for people short on time and attention. But it means campaigns decide what to show us—what they think, based on the data, we should see. It’s easy enough for that to be all we see.
It’s not data scientists’ job to make algorithms fair. But maybe it should be.
An algorithm used by the HR department at Xerox revealed that employees who lived farther away from work were more likely to “churn”—to join the company and leave quickly. An easy conclusion might be that the company should hire fewer of these workers. But at the same time, Xerox noticed that the people who lived farther away tended to live in poorer neighborhoods—clearly, the algorithm wasn’t taking into account the full picture of their lives or the many things that might be leading to high turnover.
Instead of leaning on the algorithm’s ready conclusions, Xerox adjusted the formula. O’Neil says that it’s great that the company was able and willing to adjust its approach. But it underscores that real people exist as more than data points, and their lives will always be more complicated than any algorithm can capture. And while Xerox noticed this issue, what about the problems that it and other companies did not notice, but that nonetheless shape the way they hire and do business?
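One way to spot this kind of hidden proxy is to compare how a seemingly neutral rule treats different groups. The sketch below, in Python, is an illustration only: the applicant records, the 10-mile cutoff, and the function names are all assumptions made up for this example, and it is not Xerox's actual model or data.

```python
# Hypothetical applicants. Toy data invented for illustration.
APPLICANTS = [
    {"commute_miles": 3, "neighborhood": "higher_income"},
    {"commute_miles": 5, "neighborhood": "higher_income"},
    {"commute_miles": 12, "neighborhood": "higher_income"},
    {"commute_miles": 8, "neighborhood": "higher_income"},
    {"commute_miles": 15, "neighborhood": "lower_income"},
    {"commute_miles": 20, "neighborhood": "lower_income"},
    {"commute_miles": 6, "neighborhood": "lower_income"},
    {"commute_miles": 18, "neighborhood": "lower_income"},
]

def passes_commute_rule(applicant, max_miles=10):
    """A seemingly neutral rule: keep applicants who live within max_miles."""
    return applicant["commute_miles"] <= max_miles

def selection_rates(applicants, rule):
    """Return the share of applicants from each neighborhood who pass the rule."""
    totals, passed = {}, {}
    for applicant in applicants:
        area = applicant["neighborhood"]
        totals[area] = totals.get(area, 0) + 1
        passed[area] = passed.get(area, 0) + int(rule(applicant))
    return {area: passed[area] / totals[area] for area in totals}

rates_by_area = selection_rates(APPLICANTS, passes_commute_rule)
print(rates_by_area)
```

In this toy data the rule never mentions income, yet it passes three in four applicants from the higher-income neighborhood and only one in four from the lower-income one. A check like this can flag the disparity, but deciding what to do about it remains a human judgment.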
Historically, data scientists’ chief responsibility has been to build algorithms that are accurate (which often means maximizing profits for companies). O’Neil argues that because these algorithms have such a profound impact on people’s lives, it’s time for the people who build them to recognize that they have a responsibility to make fairness a part of their models, too.