Susan Athey wants to help machine-learning applications look beyond correlation and into root causes
Data-fueled machine learning has spread to many corners of science and industry and is beginning to make waves in addressing public-policy questions as well. It’s relatively easy these days to automatically classify complex things like text, speech, and photos, or to predict tomorrow’s website traffic. It’s a whole different ballgame to ask a computer to explore how raising the minimum wage might affect employment or to design an algorithm to assign optimal treatments to every patient in a hospital.
The vast majority of machine-learning applications today are just highly functioning versions of simple tasks, says Susan Athey, professor of economics at Stanford Graduate School of Business. They rely in large part on something computers are especially good at: sifting through vast reams of data to identify connections and patterns and thus make accurate predictions. Prediction problems are simple, because, in a stable environment, it doesn’t really matter how or why the algorithm operates; it’s easy to measure performance just by seeing how well the program works on test data. All of which means that you don’t have to be an expert to deploy prediction algorithms with confidence.
Despite the proliferation of data collection and computing prowess, machine-learning algorithms aren’t so good at distinguishing between correlation and causation — determining whether the connection between statistically linked patterns is coincidental or the result of some cause-and-effect force. “Some problems simply aren’t solvable with more data or more complex algorithms,” Athey says.
If machine-learning techniques are going to help address public-policy problems, Athey says, we need to develop new ways of marrying them with causal-inference methods. Doing so would greatly expand the potential of big-data applications and transform our ability to design, evaluate, and improve public-policy work.
What Predictive Models Miss
As government agencies and other public sector groups embrace big data, Athey says it’s important to understand the realistic limitations of current machine-learning methods. In a recent article published in Science, she outlined a number of scenarios that highlight the distinction between prediction problems and causal-inference problems, and where common machine-learning applications would have trouble drawing useful conclusions about cause and effect.
One question that comes up in businesses is whether a firm should focus resources on retaining customers who have a high risk of attrition, or “churn.” Predicting churn can be accomplished with off-the-shelf machine-learning methods. However, the real problem is calculating the best allocation of resources, which requires identifying those customers for whom the causal effect of an intervention, such as offering discounts or sending out targeted emails, is the highest. That’s a harder thing to measure; it might require the firm to conduct a randomized experiment to learn where intervention has the biggest benefits. Athey points to a recent study of one firm that carried out a more rigorous analysis: the overlap between customers with high risk of churn and those for whom intervention works best was only 50%.
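The gap between churn risk and the effect of an intervention can be illustrated with a small simulation. This is only a sketch under invented assumptions: the three customer segments, their churn probabilities, and the size of the discount effect are hypothetical and do not come from the study Athey cites. Customers are randomly assigned a discount, and each segment's churn risk (from the untreated group) is compared with the estimated effect of the discount.

```python
import random

# Hypothetical segments: (baseline churn probability, reduction in churn from a discount).
# These numbers are invented for illustration only.
SEGMENTS = {
    "loyal": (0.05, 0.00),
    "persuadable": (0.40, 0.25),
    "lost_cause": (0.80, 0.02),
}

def simulate_experiment(n, seed=0):
    """Simulate n customers, each randomly assigned to receive a discount or not."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(list(SEGMENTS))
        base, effect = SEGMENTS[segment]
        treated = rng.random() < 0.5  # random assignment, as in an experiment
        churn_prob = base - effect if treated else base
        churned = rng.random() < churn_prob
        rows.append((segment, treated, churned))
    return rows

def churn_rate(rows):
    """Share of customers in rows who churned."""
    return sum(churned for _, _, churned in rows) / len(rows)

def summarize(rows):
    """Per segment: (churn risk without discount, estimated reduction from discount)."""
    summary = {}
    for segment in SEGMENTS:
        control = [r for r in rows if r[0] == segment and not r[1]]
        treated = [r for r in rows if r[0] == segment and r[1]]
        risk = churn_rate(control)
        summary[segment] = (risk, risk - churn_rate(treated))
    return summary

if __name__ == "__main__":
    for segment, (risk, effect) in summarize(simulate_experiment(30000)).items():
        print(f"{segment:12s} churn risk {risk:.2f}  effect of discount {effect:.2f}")
```

In this made-up setup, the segment most likely to churn is not the one that responds most to the discount, so ranking customers by predicted churn alone would point the firm's resources at the wrong group.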
In another case, predictive models already have been used to identify patients who, though eligible for hip replacement surgery, should not be given the operation due to the likelihood that they will soon die of other causes. What those methods fail to solve is the much harder problem of prioritizing patients who would most benefit from the procedure.
“If you’re just trying to crunch big data and not thinking about everything that can go wrong in confusing correlation and causality, you might think that putting a bigger machine on your problem is going to solve things,” Athey says. “But sometimes the answer’s just not in the data.”
That’s especially true in many of the real-world contexts where public policy takes shape, she says.
The gold standard for picking apart correlation and causality is the randomized controlled experiment, which allows for relatively straightforward inferences about cause and effect. Such experiments are commonly used to test the efficacy of new drugs: A randomly selected group of people with a particular illness is given the drug while a second group with the same illness is given a placebo. If a significantly larger share of the first group gets better than of the placebo group, the drug is probably the cause.
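The logic of such a trial can be sketched in a few lines. This is a toy simulation, not a description of any real study: the function name `run_trial`, the recovery probabilities, and the group sizes are all assumptions chosen for illustration. Patients are shuffled into two arms at random, and the estimated effect is simply the difference in recovery rates between the drug arm and the placebo arm.

```python
import random

def run_trial(n_per_group, base_recovery, drug_effect, seed=0):
    """Simulate a two-arm randomized trial.

    base_recovery and drug_effect are hypothetical probabilities: the chance of
    recovering on placebo, and the extra chance of recovering on the drug.
    Returns (drug recovery rate, placebo recovery rate, difference).
    """
    rng = random.Random(seed)
    patients = list(range(2 * n_per_group))
    rng.shuffle(patients)                      # random assignment to arms
    drug_group = set(patients[:n_per_group])

    recovered = {"drug": 0, "placebo": 0}
    for patient in patients:
        arm = "drug" if patient in drug_group else "placebo"
        p_recover = base_recovery + (drug_effect if arm == "drug" else 0.0)
        if rng.random() < p_recover:
            recovered[arm] += 1

    drug_rate = recovered["drug"] / n_per_group
    placebo_rate = recovered["placebo"] / n_per_group
    return drug_rate, placebo_rate, drug_rate - placebo_rate

if __name__ == "__main__":
    drug, placebo, diff = run_trial(5000, base_recovery=0.30, drug_effect=0.20)
    print(f"drug {drug:.3f}  placebo {placebo:.3f}  estimated effect {diff:.3f}")
```

Because assignment is random, the two arms are alike on average in everything except the drug, which is why the difference in recovery rates can be read as the drug's effect.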
But such experiments are not feasible in many real-world settings. For instance, it would be politically and practically impossible to conduct a massive controlled experiment examining what happens when the minimum wage is raised or lowered across a variety of locations. Instead, policy analysts have to rely on “observational data,” or data generated in ways other than through random assignment. And drawing useful conclusions from observational data — which are often muddied by uncontrolled and thus unreliable input — is a challenge beyond the reach of common predictive methods.
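One way to see why observational data are hard to interpret is a simulation in which a policy has no effect at all, yet a naive comparison suggests it does. Everything here is hypothetical: the "local economic strength" variable, the adoption rule, and the outcome formula are invented for illustration and say nothing about the actual effects of any minimum-wage policy. In the observational setting, places with stronger economies are more likely to adopt the policy; in the randomized setting, adoption is decided by a coin flip.

```python
import random

def mean(values):
    return sum(values) / len(values)

def outcome_gap(n, randomized, true_effect=0.0, seed=0):
    """Return the naive difference in mean outcomes, adopters minus non-adopters.

    'strength' is a hypothetical confounder (local economic strength) that raises
    the outcome and, in the observational case, also makes adoption more likely.
    """
    rng = random.Random(seed)
    adopters, others = [], []
    for _ in range(n):
        strength = rng.gauss(0, 1)
        if randomized:
            adopts = rng.random() < 0.5                 # coin-flip assignment
        else:
            adopts = strength + rng.gauss(0, 1) > 0     # stronger economies adopt more often
        outcome = 2.0 * strength + (true_effect if adopts else 0.0) + rng.gauss(0, 1)
        (adopters if adopts else others).append(outcome)
    return mean(adopters) - mean(others)

if __name__ == "__main__":
    print("observational gap:", round(outcome_gap(20000, randomized=False), 2))
    print("randomized gap:   ", round(outcome_gap(20000, randomized=True), 2))
```

With the true effect set to zero, the randomized comparison lands near zero, while the observational comparison shows a sizable gap that comes entirely from the confounder. Feeding more of the same observational data into this naive comparison would not make the gap go away.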
It’s here that Athey hopes her research will push the envelope of what machine learning can accomplish. Combining pure prediction with causal inference, she says, will get us closer to being able to address the really hard problems that involve sussing out all of the alternate outcomes that could result from implementing different policies.
“How can we continue to modify and build new techniques that really fully exploit big data?” Athey asks, noting that many public-policy problems have questions of causal inference at their core. “That’s the really hard stuff, and you have to proceed with caution to understand the effect of something. But that’s most of the world.”
Computational Power to the People
While those advances may still be around the corner, Athey says the momentum of big data and machine learning in academic research and practical applications is invigorating. “The gap between research and practice, which used to be insurmountable, is disappearing,” she says. “It’s so cool when our research gets adopted within months.”
She finds it especially gratifying to witness the widespread adoption of predictive methods that not too long ago were the exclusive province of a specialized cadre of data scientists. “It’s amazing, because you’re empowering people who in a previous generation wouldn’t have used a computer for anything other than word processing,” Athey says. “Now it’s not just the geeky engineers, but people at high levels of a company are interested in the most recent research. They recognize the power of being able to use data to optimize decisions and investments. They’re building big-data models and open-source software to make great predictions with cutting-edge techniques. It’s been completely democratized, and I think that’s a huge success story.”