Problem Spotlight: Algorithmic bias and health

Aspirin was prescribed as a pain reliever for nearly 100 years before researchers were able to explain “the precise chemical mechanism of how aspirin stops pain and inflammation.” Harvard’s Jonathan Zittrain says this “answers first, explanations later” approach to discovery results in “intellectual debt,” and it’s not limited to drug development. Across medicine — and now, artificial intelligence and machine learning at large — we’re advancing technologies and solutions without a full understanding of how and why they work.

For the most part, we enjoy modernity and benefit greatly from massive breakthroughs in health, science, and technology. Most would agree that the 20th century was better with aspirin than without it, and artificial intelligence offers irresistible advantages to many. However, intellectual debt has consequences, and algorithmic bias is one of the more problematic outcomes.

When it comes to health, algorithmic bias can be particularly dangerous. In theory, AI and machine learning can either fulfill a promise to democratize healthcare or exacerbate inequality; in reality, both things are happening. At best, algorithmic bias excludes people. At worst, people could die. How can organizations pursue wild innovation while mitigating risk?

What are algorithms?

“An algorithm is a set of instructions for how a computer should accomplish a particular task. … Algorithms are most often compared to recipes, which take a specific set of ingredients and transform them through a series of explainable steps into a predictable output. Combining calculation, processing, and reasoning, algorithms can be exceptionally complex, encoding for thousands of variables across millions of data points.” (Data & Society)

And what is bias?

“Bias” is a broad term used to describe “outcomes which are systematically less favorable to individuals within a particular group and where there is no relevant difference between groups that justifies such harms.” (Brookings)
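One rough way to make this definition concrete is to compare how often each group receives the favorable outcome. The sketch below is illustrative only: the group labels, the data, and the function names `favorable_rate_by_group` and `disparity_ratio` are all hypothetical. A gap in rates is a signal worth investigating, not proof of bias, because the numbers alone cannot say whether a relevant difference between groups explains it.

```python
from collections import defaultdict


def favorable_rate_by_group(outcomes):
    """Share of favorable outcomes per group.

    `outcomes` is an iterable of (group, favorable) pairs, where
    `favorable` is True when the person received the better outcome.
    """
    favorable = defaultdict(int)
    total = defaultdict(int)
    for group, is_favorable in outcomes:
        total[group] += 1
        if is_favorable:
            favorable[group] += 1
    return {group: favorable[group] / total[group] for group in total}


def disparity_ratio(rates):
    """Lowest group rate divided by the highest (1.0 means parity)."""
    if not rates:
        raise ValueError("no groups to compare")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # every group has a rate of zero, so there is no gap
    return min(rates.values()) / highest


# Hypothetical data: 10 people per group.
outcomes = (
    [("group_a", True)] * 8 + [("group_a", False)] * 2
    + [("group_b", True)] * 4 + [("group_b", False)] * 6
)
rates = favorable_rate_by_group(outcomes)
ratio = disparity_ratio(rates)
print(rates, ratio)
```

In this made-up example, group_b receives the favorable outcome half as often as group_a. Whether that counts as bias depends on the second half of the definition: is there a relevant difference that justifies it?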

How we got here

Charles Babbage began designing a computing machine programmed with punch cards in 1834. Less than a decade later, Ada Lovelace wrote what is often described as the first computer algorithm. But until the latter half of the 20th century, machine learning was more science fiction than reality. In recent decades, data storage and computing power have increased exponentially and costs have dropped, paving the way for “big data” and its applications across industries.

Historically, health data sources have included clinical trials, electronic health records, insurance claims, and more. With the rise of sensors and smartphones, all data becomes health data. Connected devices collect data everywhere we go, and the same digital breadcrumbs that reveal our shopping habits can be used to understand our health. One-quarter of Americans are using wearable devices — often for fitness tracking and health monitoring — and worldwide, DNA testing services have collected genetic data from nearly 30 million people.

All that data is too much for humans alone to analyze. But once “trained,” algorithms can use data to diagnose skin cancer or lung cancer, or to predict the risk of seizures, diabetic retinopathy, or C. diff infections. Beyond detection and diagnosis, algorithms can be used to optimize treatment plans and drug dosages. Organizations are also using machine learning and other technologies to accelerate drug development (at least 148 startups are using artificial intelligence for drug discovery) and to improve clinical trials.

Effective solutions require accurate data and thoughtful decision-making. But data and algorithms start with humans, and humans have biases. In medical research and treatment, bias has been a problem for far longer than algorithms have been in use. Though women and men are equally impacted by cardiovascular disease, the typical clinical trial population is 85% male; women also don’t receive the same level of treatment as men who suffer from heart disease. A 2016 study found that only 5% of the genetic traits linked to asthma in European Americans applied to African Americans; neglected by research, African American children have died from asthma at 10 times the rate of non-Hispanic white children. Racial and ethnic minorities also tend to be undertreated for pain when compared with white patients.

These long-standing biases can creep in at many stages of the deep-learning process — from framing the problem to collecting and preparing data — and standard practices in computer science aren’t designed to detect them.

STAT recently reported that nearly all of the largest manufacturers of wearable heart rate trackers rely on technology that could be less reliable for people who have darker skin, and a 2018 study found that step-counting apps aren’t generating accurate data — especially if the way you walk differs from “developers’ perceptions of usual behavior.” Many commercial apps aren’t designed for research, and the data collected isn’t consistent or accurate enough to inform health decisions.

What’s at stake

Incomplete or unrepresentative data isn’t a big deal if you’re tracking your physical activity for fun. But data and algorithms play an enormous role in all aspects of health — from prevention and detection to diagnosis and treatment — as well as healthcare coverage. Algorithms fed with big data can replicate existing biases at a speed and scale that can create irreparable harm. Health is already plagued by disparity; AI and machine learning “risk making dangerous biases automated and invisible.”

Deep-learning predictions can fail “if they encounter unusual data points, such as unique medical cases, for the first time, or when they learn peculiar patterns in specific data sets that do not generalize well to new medical cases.” IBM’s Watson for Oncology was trained by a couple dozen physicians at a single hospital in New York before it was deployed in more than 50 hospitals across five continents, with varying results. A Toronto-based startup published a study that claimed 90% accuracy of its Alzheimer’s test — before realizing it only worked on people who speak a particular Canadian English dialect. Even proven algorithms, trained on large and varied data sets, can’t safely be applied to new populations without careful testing and oversight.
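A headline accuracy figure can hide exactly this kind of failure. The sketch below uses made-up numbers and a hypothetical helper, `accuracy_by_group`, to show how a test can score 90% overall while performing poorly for a smaller group. It is not the method used in any of the studies mentioned above.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return (overall accuracy, accuracy per group).

    `records` is an iterable of (group, predicted, actual) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    if not total:
        raise ValueError("no records to evaluate")
    overall = sum(correct.values()) / sum(total.values())
    per_group = {group: correct[group] / total[group] for group in total}
    return overall, per_group


# Hypothetical evaluation set: 90 people from group_a, 10 from group_b.
records = (
    [("group_a", 1, 1)] * 87 + [("group_a", 0, 1)] * 3
    + [("group_b", 1, 1)] * 3 + [("group_b", 0, 1)] * 7
)
overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.2f}")
for group, accuracy in sorted(per_group.items()):
    print(f"{group}: {accuracy:.2f}")
```

Here the overall score is 90%, but only 3 of the 10 people in group_b get a correct result. Reporting results per group, and testing on the population where a tool will actually be used, is one way to surface that gap before deployment.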

In health and beyond, algorithmic bias doesn’t just impede innovation. It reinforces damaging stereotypes and bad decision-making that could lead to human harm. Inadequate data and historical biases are often to blame, but part of the problem is that we’ve come to trust machines, and we perceive computers as objective and faultless. Planes crash when pilots trust the system too much; bad things can happen when humans aren’t willing to override the machine.

Potential solutions

Many incidents of algorithmic bias begin with bad data or bad assumptions. A willingness to question and demystify algorithms — including the sources of data and how the data is used — can help improve outcomes. Leaders across industries should learn to ask smart questions that cut through AI hype. Physicians need to understand how algorithms work and how to guard against overreliance on them. Even kids can become responsible algorithm users: A researcher at MIT Media Lab has developed a curriculum for middle-school students. We’re keeping an eye on several different approaches to mitigating algorithmic risk.

Ethical frameworks and research

Artificial intelligence and algorithms aren’t unlike science and medicine; new ideas can validate hypotheses — or prove old ideas wrong. Overcoming intellectual debt will require rigorous research and an emphasis on ethics.

Guidelines and governance

Many ethical frameworks rely on self-regulation, but some organizations are calling for broader, formalized approaches to oversight.

Product improvement and design

A more transparent approach to data collection and software development — and a commitment to continuous monitoring and iteration — can make algorithms more equitable.

Diversity in health and tech

Diverse teams, inclusive workplaces, and empathetic collaboration can help organizations better anticipate unintended consequences of innovation.

  • Stanford’s Unconscious Bias in Medicine, a free online course, is designed to help physicians recognize and correct unconscious bias in their daily interactions.
  • A data analytics team at University of Chicago Medicine partnered with UCM’s Diversity and Equity Committee to correct biased algorithms and proactively use machine learning to improve equity.
  • Onboard Health is helping healthcare organizations recruit talent (including engineers, data scientists, product managers, designers, and statisticians) from underrepresented groups and develop teams that are more representative of diverse patient populations.

Are you studying this problem or working on a solution? We’d love to hear your thoughts — send a message to


Sara Holoubek
Founding Partner and CEO
Jessica Hibbard
Head of Content & Community