AI for Cancer Diagnosis Reveals Unforeseen Demographic Bias

The intersection of artificial intelligence and medicine promises a future of unprecedented diagnostic precision. Yet, as these powerful tools are integrated into clinical practice, they bring with them the hidden risk of inheriting and amplifying human biases. We sat down with Ivan Kairatov, a biopharma expert at the forefront of this challenge, to discuss his team’s groundbreaking research. Their work uncovered how AI models developed for pathology can unexpectedly infer patient demographics from tissue slides, leading to diagnostic biases. More importantly, they’ve pioneered a solution: a framework called FAIR-Path that significantly reduces these disparities. In our conversation, we explore the subtle signals these algorithms detect, the profound clinical implications of their inaccuracies, and the path toward building a truly equitable future for AI in healthcare.

Your research highlights the surprise that AI can infer patient demographics from pathology slides, a task considered a “mission impossible” for humans. Beyond the data, what subtle biological signals might these models be detecting, and could you share an anecdote from the research that illustrates this discovery?

It truly was a moment of shock for the entire team. As pathologists, we are trained to see the disease, not the person. We look at a slide—that beautiful, chaotic swirl of pink and purple—and our focus is entirely on cellular structure, nuclear atypia, and mitotic rates. The patient’s race or age is irrelevant and, we believed, invisible. So, when our models started correctly predicting demographics from these “anonymous” slides with alarming accuracy, it felt like a magic trick at first. I remember one specific instance when we were testing a model on slides from 20 different cancer types. It kept flagging demographic markers, and we initially thought it was a bug, some kind of data leakage. But after running extensive checks, we had to face the incredible reality: the AI was seeing something we couldn’t. It’s likely picking up on incredibly subtle, “obscure biological signals”—perhaps minute variations in tissue architecture or molecular expressions linked to ancestry or age-related cellular changes—that are completely imperceptible to the human eye. This discovery shifted our perspective entirely; we realized that because AI is so powerful, it can learn patterns related more to a person’s background than their disease, which was a sobering and critical insight for our work.

The study identifies three sources of bias, noting that the problem is deeper than just unequal sample sizes. Could you walk us through a specific example of how a model struggled with a cancer diagnosis in one group, even when its sample size was comparable to another?

This was another crucial realization for us. The easy explanation for bias is always unequal data—if you train a model on fewer samples from one group, it will naturally perform worse on that group. But we found the problem was far more insidious. For instance, our models struggled to differentiate lung cancer subtypes in African American patients, even when the number of slides from that group was statistically comparable to others in the training set. This is where “differential disease incidence” comes into play. It’s not just about the number of samples, but the biological nature of the disease within those samples. Certain cancers, or even specific mutations in cancer driver genes, are more common in some demographic groups than others. The AI, in its quest for efficiency, learns the most common patterns as the “standard” for a diagnosis. So, when it encounters a lung cancer sample from an African American patient that presents with less common molecular features, it falters. The model has become an expert on the majority presentation and is essentially a novice when faced with a variant, even if it has seen plenty of slides from that demographic group.
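The dynamic Kairatov describes can be illustrated with a toy simulation: even with identical sample counts per group, a single decision rule fitted to the pooled data tracks the majority presentation of the disease and misses a variant one. Every distribution, feature value, and group label below is invented for illustration and is not drawn from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Equal sample sizes per group, but the tumor "presents" differently in
# group B: its signal sits closer to healthy tissue (a less common
# molecular presentation). All numbers are illustrative, not real data.
n = 200
tumor_a = rng.normal(2.0, 0.3, n)    # majority presentation
tumor_b = rng.normal(0.5, 0.3, n)    # variant presentation, same sample count
healthy = rng.normal(0.0, 0.3, 2 * n)

# One pooled decision threshold learned from all tumors vs. healthy tissue:
# the midpoint between the mean tumor signal and the mean healthy signal.
threshold = (np.mean(np.concatenate([tumor_a, tumor_b])) + healthy.mean()) / 2

acc_a = np.mean(tumor_a > threshold)  # sensitivity in group A
acc_b = np.mean(tumor_b > threshold)  # sensitivity in group B
print(f"group A sensitivity: {acc_a:.2f}, group B sensitivity: {acc_b:.2f}")
```

Despite the two groups contributing exactly 200 tumor slides each, the pooled threshold lands near the majority presentation, so sensitivity in group B collapses. This mirrors, in miniature, why balancing sample counts alone does not remove the bias.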

FAIR-Path is an impressive solution, reducing diagnostic disparities by about 88% using contrastive learning. Can you explain the step-by-step process of how this framework teaches a model to prioritize disease features over demographic ones and what this adjustment looks like in practice for your team?

FAIR-Path was born from asking a simple question: can we force the model to learn the “right” things? We used a concept called contrastive learning, which is a wonderfully intuitive approach. In practice, during the training phase, we essentially show the model pairs of slides. We tell it, “These two slides are both from patients with breast cancer. Focus on the features that make them both breast cancer, and ignore the fact that one patient is 25 and the other is 65.” Simultaneously, we show it another slide and say, “This one is renal cancer. Focus on what makes it fundamentally different from the breast cancer slides.” By doing this repeatedly, we teach the model to minimize the “distance” between samples of the same disease type and maximize the “distance” between samples of different disease types, regardless of the demographic group they come from. It learns to find the core, robust biological signals of the cancer itself. For our team, implementing this was a small adjustment to the training code, but the results were dramatic. Seeing that diagnostic disparity number plummet by about 88% was a huge victory and proved that we can actively build fairness into these systems without needing perfectly balanced datasets, which are often impossible to acquire.
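The pairing scheme described above resembles a supervised contrastive objective. Below is a minimal sketch of such a loss, assuming cosine-similarity embeddings and a temperature parameter; these details are standard in contrastive learning but are not confirmed as FAIR-Path’s actual design.

```python
import numpy as np

def contrastive_fairness_loss(embeddings, disease_labels, temperature=0.1):
    """Toy supervised-contrastive loss: pull together slide embeddings that
    share a disease label (regardless of patient demographics) and push
    apart embeddings of different diseases. A hypothetical simplification,
    not the published FAIR-Path objective."""
    # L2-normalize so the dot product is cosine similarity
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature  # pairwise similarities, temperature-scaled
    n = len(disease_labels)
    total = 0.0
    for i in range(n):
        # positives: other slides with the same disease label
        positives = [j for j in range(n)
                     if j != i and disease_labels[j] == disease_labels[i]]
        if not positives:
            continue
        others = [j for j in range(n) if j != i]
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        # minimizing this raises same-disease similarity and lowers the rest
        total += -np.mean([sim[i, j] - log_denom for j in positives])
    return total / n
```

Training with a loss of this shape rewards the model for placing two breast-cancer slides near each other even when the patients differ in age or ancestry, which is the intuition behind “minimizing the distance” within a disease and “maximizing the distance” across diseases.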

The study found performance gaps in nearly 29% of diagnostic tasks, including differentiating lung cancer in African American men and breast cancer in younger patients. Moving beyond the statistics, what are the potential real-world clinical consequences of these specific diagnostic inaccuracies for patient care and outcomes?

That 29% figure is what keeps us motivated because it represents real patients whose lives are on the line. These aren’t just statistical errors; they are potential clinical disasters. Take the example of differentiating breast cancer subtypes in younger patients. Certain subtypes are more aggressive and require immediate, targeted therapies. If an AI model, biased by training data dominated by older patients, misclassifies the cancer in a younger woman, her treatment could be delayed or incorrect. This could give a highly aggressive cancer a chance to progress, directly impacting her prognosis and survival. Similarly, for an African American man whose lung cancer subtype is misidentified, the consequence could be receiving a standard chemotherapy regimen that is less effective for his specific cancer, when a more targeted therapy could have been an option. It creates a devastating ripple effect: a flawed diagnosis leads to a suboptimal treatment plan, which contributes to poorer patient outcomes and tragically widens the healthcare disparities we are all working so hard to close.

What is your forecast for the future of AI in pathology?

My forecast is one of cautious optimism. The potential for AI to revolutionize pathology is immense. I see a future where AI acts as an indispensable partner for every pathologist, a tireless assistant that can screen thousands of slides, highlight areas of concern with superhuman accuracy, and help manage the crushing workload that leads to burnout. This will free up human experts to focus on the most complex and ambiguous cases, ultimately improving the speed and quality of cancer care. However, this future is not guaranteed. It is entirely conditional on our community’s commitment to building these tools responsibly and equitably. Our work with FAIR-Path shows that we can bake fairness into the architecture of these models from the very beginning. The future isn’t just about creating more powerful algorithms; it’s about creating wiser, more conscientious ones. I believe that if we remain vigilant and intentional about how we design and deploy these systems, we can build a new generation of AI that enhances care for every single patient, regardless of their background, and truly fulfills the promise of precision medicine.
