Problem Spotlight: Algorithmic bias and health

Aspirin was prescribed as a pain reliever for nearly 100 years before researchers were able to explain “the precise chemical mechanism of how aspirin stops pain and inflammation.” Harvard’s Jonathan Zittrain says this “answers first, explanations later” approach to discovery results in “intellectual debt,” and it’s not limited to drug development. Across medicine — and now, artificial intelligence and machine learning at large — we’re advancing technologies and solutions without a full understanding of how and why they work.

For the most part, we enjoy modernity and benefit greatly from massive breakthroughs in health, science, and technology. Most would agree that the 20th century was better with aspirin than without it, and artificial intelligence offers irresistible advantages to many. However, intellectual debt has consequences, and algorithmic bias is one of the more problematic outcomes.

When it comes to health, algorithmic bias can be particularly dangerous. In theory, AI and machine learning can either fulfill a promise to democratize healthcare or exacerbate inequality; in reality, both things are happening. At best, algorithmic bias excludes people. At worst, people could die. How can organizations pursue wild innovation while mitigating risk?

What are algorithms?

“An algorithm is a set of instructions for how a computer should accomplish a particular task. … Algorithms are most often compared to recipes, which take a specific set of ingredients and transform them through a series of explainable steps into a predictable output. Combining calculation, processing, and reasoning, algorithms can be exceptionally complex, encoding for thousands of variables across millions of data points.” (Data & Society)

And what is bias?

“Bias” is a broad term used to describe “outcomes which are systematically less favorable to individuals within a particular group and where there is no relevant difference between groups that justifies such harms.” (Brookings)
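
As a rough illustration of "systematically less favorable" outcomes, the minimal sketch below compares how often each group receives a favorable result. The group names and outcome values are made up for illustration, and the function names are hypothetical. A gap like this only flags a difference in outcomes; it cannot say whether a relevant difference between the groups justifies it, which is the second half of the definition above.

```python
def favorable_rate(outcomes):
    """Share of favorable outcomes (1 = favorable, 0 = unfavorable)."""
    return sum(outcomes) / len(outcomes)


def rate_gap(outcomes_by_group):
    """Return each group's favorable rate and the largest gap between groups."""
    rates = {group: favorable_rate(o) for group, o in outcomes_by_group.items()}
    gap = max(rates.values()) - min(rates.values())
    return rates, gap


# Hypothetical data: 1 means the algorithm recommended extra care.
outcomes_by_group = {
    "group_a": [1, 1, 1, 0],
    "group_b": [1, 0, 0, 0],
}

rates, gap = rate_gap(outcomes_by_group)
print(rates)  # favorable rate per group
print(gap)    # difference between the best- and worst-served groups
```

A large gap is a prompt for human review of the data and the decision being automated, not a verdict on its own.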

How we got here

Charles Babbage began designing the Analytical Engine, a mechanical computer programmed with punch cards, in 1834. Less than a decade later, Ada Lovelace published what is widely regarded as the first algorithm written for such a machine. But until the latter half of the 20th century, machine learning was more science fiction than reality. In recent decades, data storage and computing power have increased exponentially and costs have dropped, paving the way for “big data” and its applications across industries.

Historically, health data sources have included clinical trials, electronic health records, insurance claims, and more. With the rise of sensors and smartphones, all data becomes health data. Connected devices collect data everywhere we go, and the same digital breadcrumbs that reveal our shopping habits can be used to understand our health. One-quarter of Americans are using wearable devices — often for fitness tracking and health monitoring — and worldwide, DNA testing services have collected genetic data from nearly 30 million people.

All that data is too much for humans alone to analyze. But once “trained,” algorithms can use data to diagnose skin cancer or lung cancer, or predict the risk of seizures or diabetic retinopathy or C-diff infections. Beyond detection and diagnosis, algorithms can be used to optimize treatment plans and drug dosages. Organizations are also using machine learning and other technologies to accelerate drug development — at least 148 startups are using artificial intelligence for drug discovery — and improve clinical trials.

Effective solutions require accurate data and thoughtful decision-making. But data and algorithms start with humans, and humans have biases. In medical research and treatment, bias has been a problem for far longer than algorithms have been in use. Though women and men are equally impacted by cardiovascular disease, the typical clinical trial population is 85% male; women also don’t receive the same level of treatment as men who suffer from heart disease. A 2016 study found that only 5% of the genetic traits linked to asthma in European Americans applied to African Americans; neglected by research, African American children have died from asthma at 10 times the rate of non-Hispanic white children. Racial and ethnic minorities also tend to be undertreated for pain when compared with white patients.

These long-standing biases can creep in at many stages of the deep-learning process — from framing the problem to collecting and preparing data — and standard practices in computer science aren’t designed to detect them.

STAT recently reported that nearly all of the largest manufacturers of wearable heart rate trackers rely on technology that could be less reliable for people who have darker skin, and a 2018 study found that step-counting apps aren’t generating accurate data — especially if the way you walk differs from “developers’ perceptions of usual behavior.” Many commercial apps aren’t designed for research, and the data collected isn’t consistent or accurate enough to inform health decisions.

What’s at stake

Incomplete or unrepresentative data isn’t a big deal if you’re tracking your physical activity for fun. But data and algorithms play an enormous role in all aspects of health — from prevention and detection to diagnosis and treatment — as well as healthcare coverage. Algorithms fed with big data can replicate existing biases at a speed and scale that can create irreparable harm. Health is already plagued by disparity; AI and machine learning “risk making dangerous biases automated and invisible.”

Deep-learning predictions can fail “if they encounter unusual data points, such as unique medical cases, for the first time, or when they learn peculiar patterns in specific data sets that do not generalize well to new medical cases.” IBM’s Watson for Oncology was trained by a couple dozen physicians at a single hospital in New York before it was deployed in more than 50 hospitals across five continents, with varying results. A Toronto-based startup published a study that claimed 90% accuracy of its Alzheimer’s test — before realizing it only worked on people who speak a particular Canadian English dialect. Even proven algorithms, trained on large and varied data sets, can’t safely be applied to new populations without careful testing and oversight.
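
One simple form of the testing described above is to break a model's accuracy out by subgroup instead of reporting a single overall number. The sketch below is a minimal illustration, not a validated clinical evaluation method: the records, group labels, and function name are all hypothetical, and real evaluations would need far larger samples and clinically meaningful metrics.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} from (group, predicted, actual) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        if predicted == actual:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical test results: (subgroup, model prediction, true label).
records = [
    ("dialect_a", 1, 1), ("dialect_a", 0, 0),
    ("dialect_a", 1, 1), ("dialect_a", 0, 0),
    ("dialect_b", 1, 0), ("dialect_b", 0, 0),
    ("dialect_b", 0, 1), ("dialect_b", 1, 1),
]

# A single overall score can hide a subgroup the model serves poorly.
overall = sum(pred == actual for _, pred, actual in records) / len(records)
per_group = accuracy_by_group(records)

print(overall)
print(per_group)
```

In this made-up data the overall accuracy looks respectable while one subgroup fares no better than a coin flip, which is the kind of gap a headline accuracy figure can conceal.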

In health and beyond, algorithmic bias doesn’t just impede innovation. It reinforces damaging stereotypes and bad decision-making that could lead to human harm. Inadequate data and historical biases are often to blame, but part of the problem is that we’ve come to trust machines, and we perceive computers as objective and faultless. Planes crash when pilots trust the system too much; bad things can happen when humans aren’t willing to override the machine.

Potential solutions

Many incidents of algorithmic bias begin with bad data or bad assumptions. A willingness to question and demystify algorithms — including the sources of data and how the data is used — can help improve outcomes. Leaders across industries should learn to ask smart questions that cut through AI hype. Physicians need to understand how algorithms work and how to guard against overreliance on them. Even kids can become responsible algorithm users: A researcher at MIT Media Lab has developed a curriculum for middle-school students. We’re keeping an eye on several different approaches to mitigating algorithmic risk.

Ethical frameworks and research

Artificial intelligence and algorithms aren’t unlike science and medicine; new ideas can validate hypotheses — or prove old ideas wrong. Overcoming intellectual debt will require rigorous research and an emphasis on ethics.

Camille Nebeker and other leading academics are building the case for actionable ethics in digital health research.
MIT researcher Joy Buolamwini founded the Algorithmic Justice League to highlight bias and develop practices for accountability. Her research on racial and gender bias in AI services has challenged facial recognition software based on training data sets that are overwhelmingly white and male.
AI researchers are working to better define fairness, detect hidden biases within training data and models, and hold companies accountable.
AI experts have suggested an “Algorithmic Bill of Rights,” and a grassroots organization called I Am The Cavalry has proposed a “Hippocratic Oath for Connected Medical Devices.”

Guidelines and governance

Many ethical frameworks rely on self-regulation, but some organizations are calling for broader, formalized approaches to oversight.

In the United States, the FDA has already issued approvals for algorithms in medicine. Earlier this year, the agency requested feedback on its Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD).
Data & Society has called for policies to protect U.S. consumer and civil rights. The Algorithmic Accountability Act would require companies to audit systems for bias, secure sensitive data, and take corrective action when issues are identified.
Beyond the United States, the Organization for Economic Cooperation and Development (OECD) and European Union have released ethics guidelines for artificial intelligence.
The AI Now Institute at New York University published recommendations for algorithmic impact assessments, a framework similar to an environmental impact assessment, for agencies bringing oversight to automated decision systems.

Product improvement and design

A more transparent approach to data collection and software development — and a commitment to continuous monitoring and iteration — can make algorithms more equitable.

A new set of companies want to improve labor conditions for the humans enlisted to clean, categorize, and label data before it’s used in AI systems.
IBM has proposed that developers publish a Supplier’s Declaration of Conformity (SDoC) to disclose information about algorithms before they’re used.
#WeHeartHackers and Biohacking Village have initiated collaborations between the medical device and security researcher communities to proactively address vulnerabilities of devices and data.

Diversity in health and tech

Diverse teams, inclusive workplaces, and empathetic collaboration can help organizations better anticipate unintended consequences of innovation.

Stanford’s Unconscious Bias in Medicine, a free online course, is designed to help physicians recognize and correct unconscious bias in their daily interactions.
A data analytics team at University of Chicago Medicine partnered with UCM’s Diversity and Equity Committee to correct biased algorithms and proactively use machine learning to improve equity.
Onboard Health is helping healthcare organizations recruit talent from underrepresented groups, including engineers, data scientists, product managers, designers, and statisticians, and develop teams that are more representative of diverse patient populations.

Are you studying this problem or working on a solution? We’d love to hear your thoughts — send a message to editor@luminary-labs.com.
