Foundational Multi-Task Model Tackles Data Scarcity in Biomedical Imaging

August 12, 2024

Deep learning has revolutionized various domains, and biomedical imaging is no exception. These advancements are due in no small part to the ability to learn and extract useful representations from complex datasets. Traditional approaches involve pretraining models on extensive natural image datasets like ImageNet-1K or LAION and subsequently fine-tuning them for specific target tasks. However, a significant barrier to this approach in biomedical imaging is the scarcity of large annotated datasets: smaller, specialized datasets are the norm in this domain. To tackle this challenge, researchers have proposed multi-task learning strategies that optimize deep learning models across diverse imaging tasks and label types. Here, we delve into a foundational multi-task model trained on various biomedical imaging tasks to address data scarcity effectively.

Preparing Effective Deep Learning Models for Biomedical Imaging

The deep learning landscape, especially in medical imaging, has been shaped significantly by foundational models pretrained on large and diverse datasets. In the biomedical domain, however, datasets are typically smaller and more specialized, presenting an inherent challenge. Multi-task learning frameworks address this challenge by enabling a single model to generalize across many tasks while keeping memory requirements bounded. The paper introduces a universal biomedical pretrained model (UMedPT) trained on a multi-task database that includes tomographic, microscopic, and X-ray images, employing labeling strategies such as classification, segmentation, and object detection. The UMedPT foundational model outperformed previous models by extracting more value from limited data.

While foundational models like UMedPT can be fine-tuned for specific tasks, a significant advantage lies in their ability to maintain performance even when trained with limited datasets. In scenarios where only 1% of the original training data is available, UMedPT still performs robustly, demonstrating that less training data need not compromise performance. Additionally, for tasks outside the pretraining domain, UMedPT requires as little as 50% of the original training data to perform adequately. This efficiency opens new avenues for advancing medical research and applications, particularly in areas with inherently scarce data, such as rare diseases and pediatric imaging.

Leveraging Multi-Task Learning for Enhanced Performance

Multi-task learning (MTL) offers a promising solution to data scarcity by training a single model to generalize across multiple tasks. This method maximizes the utility of diverse small- to medium-sized biomedical datasets by integrating various label types and data sources. MTL can be applied to distinct tasks such as classification and segmentation, effectively utilizing different types of labels for individual images. A significant benefit of MTL is its ability to share features across various tasks, thereby improving performance.

The strategy behind UMedPT involves a multi-task training loop that efficiently handles memory constraints, irrespective of the number of training tasks. This model was trained on 17 distinct tasks with their original annotations, using a gradient accumulation-based training approach to manage memory usage. The UMedPT model demonstrated its efficacy in cross-center transferability, setting new standards for cross-domain medical imaging applications.
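One way to keep memory flat regardless of the number of tasks is to draw one batch from one task at a time. The sketch below is illustrative only (the names `task_sampler` and `loaders` are assumptions, not the paper's code); it shows a simple round-robin sampler over per-task batch iterators:

```python
import itertools

def task_sampler(loaders):
    """Yield (task_name, batch) pairs in round-robin order.

    loaders: dict mapping a task name to an iterable of batches.
    Each task's batches are cycled, so every task keeps contributing
    gradients each accumulation cycle even if its dataset is small.
    """
    cycled = {name: itertools.cycle(batches) for name, batches in loaders.items()}
    for name in itertools.cycle(cycled):
        yield name, next(cycled[name])
```

Because only one batch is materialized at a time, peak memory does not grow with the number of training tasks, which is what makes training on 17 tasks at once tractable.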

Step-by-Step Implementation of the Multi-Task Training Procedure

  1. Prepare the Common Modules: Initialize shared blocks, including normalization and task-specific modules.
  2. Initialize Gradient Buffer: Clear all gradients in the optimizer.
  3. Iterate Through Steps: Loop through the number of steps per epoch.
  4. Fetch Task and Data: Obtain the next batch and associated task from the task sampler.
  5. Calculate Loss: Compute the loss for the current task and batch using the shared blocks.
  6. Backpropagate Gradients: Perform backpropagation to accumulate gradients.
  7. Check for Optimization Step: If it’s time for an optimization step (e.g., after every task has contributed a batch):
  8. Update Parameters: Use the optimizer to update the model’s parameters.
  9. Clear Gradients: Reset the gradient buffer in the optimizer.
  10. End Loop: Repeat until all steps per epoch are processed.

This procedure keeps the learning process efficient while bounding memory usage, a critical factor when training across many diverse datasets.
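The steps above can be sketched in a few lines. This is a toy stand-in, not the paper's implementation: the "model" is a single shared weight vector, each "task" is a small least-squares problem, and gradients from all tasks accumulate into one buffer before a single parameter update:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(3)                          # 1. shared parameters (toy encoder)
tasks = {                                # two toy tasks: (inputs, targets)
    "classify": (rng.normal(size=(8, 3)), rng.normal(size=8)),
    "segment":  (rng.normal(size=(8, 3)), rng.normal(size=8)),
}
lr, accum_steps = 0.1, len(tasks)

grad_buffer = np.zeros_like(w)           # 2. initialize gradient buffer
for step in range(20):                   # 3. iterate through steps
    for name, (X, y) in tasks.items():   # 4. fetch task and data
        err = X @ w - y                  # 5. loss: mean squared error
        grad_buffer += 2 * X.T @ err / len(y)   # 6. accumulate gradients
    # 7. optimization step once every task has contributed a batch
    w -= lr * grad_buffer / accum_steps  # 8. update shared parameters
    grad_buffer[:] = 0.0                 # 9. clear gradients
```

Only one task's batch and gradient are computed at a time, so memory usage stays constant no matter how many tasks are in the pool.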

Practical Implementation and Data Processing

Effective data processing is crucial when dealing with multiple types of biomedical imaging data, ranging from tomographic images and standard 2D images to gigapixel microscopic slides. To maintain a standardized input for the model, all data types are converted into a 2D image format and normalized, ensuring compatibility across different imaging modalities. This preprocessing involves specific augmentation techniques tailored to the characteristics of each dataset, enhancing the robustness and generalizability of the model.
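A minimal sketch of that standardization step might look as follows. The function name, the fixed 224-pixel size, and the crop/pad strategy are assumptions for illustration; the real pipeline adds dataset-specific augmentation and tiling of gigapixel slides:

```python
import numpy as np

def to_model_input(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Convert an image or volume to a fixed-size, normalized 2D array."""
    if img.ndim == 3:                    # tomographic volume -> a 2D slice
        img = img[img.shape[0] // 2]     # (here: the middle slice)
    img = img.astype(np.float32)
    # crop or zero-pad to a fixed spatial size for batching
    out = np.zeros((size, size), dtype=np.float32)
    h, w = min(img.shape[0], size), min(img.shape[1], size)
    out[:h, :w] = img[:h, :w]
    # per-image z-score normalization for cross-modality compatibility
    return (out - out.mean()) / (out.std() + 1e-8)
```

Normalizing per image rather than with fixed dataset statistics is one simple way to make modalities with very different intensity ranges (CT, microscopy, X-ray) comparable at the model's input.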

Three primary categories of tasks—classification, segmentation, and object detection—are integrated into the UMedPT model. For classification tasks, a fully connected layer processes the latent representation computed by the encoder to obtain classification scores. Segmentation tasks utilize a U-Net-like architecture with a shared encoder and pixel-dense decoder, ensuring pixel-level class assignments. For object detection, the FCOS method, an anchor-free approach, is employed, predicting bounding boxes for detected objects. Each of these tasks is normalized to prevent the dominance of any single task during training, ensuring balanced learning across all tasks.
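The shared-encoder, multi-head layout can be sketched as below. Everything here is a toy stand-in (linear layers in place of the CNN backbone, U-Net decoder, and FCOS head; the detection head is collapsed to a single box for brevity), meant only to show how three heads read the same latent representation:

```python
import numpy as np

rng = np.random.default_rng(1)
LATENT, N_CLASSES, SIDE = 16, 4, 32

# Shared encoder (stand-in: one linear projection plus ReLU).
W_enc = rng.standard_normal((SIDE * SIDE, LATENT)) * 0.05
def encode(img):                         # (SIDE, SIDE) -> (LATENT,)
    return np.maximum(img.reshape(-1) @ W_enc, 0.0)

# Classification head: fully connected layer over the latent representation.
W_cls = rng.standard_normal((LATENT, N_CLASSES)) * 0.05
def classify(img):
    z = encode(img) @ W_cls
    e = np.exp(z - z.max())
    return e / e.sum()                   # softmax class scores

# Segmentation head: decoder mapping the latent back to per-pixel scores.
W_seg = rng.standard_normal((LATENT, SIDE * SIDE)) * 0.05
def segment(img):
    return (encode(img) @ W_seg).reshape(SIDE, SIDE)   # pixel-dense logits

# Detection head: anchor-free box regression in the spirit of FCOS,
# collapsed here to one (x1, y1, x2, y2) prediction.
W_det = rng.standard_normal((LATENT, 4)) * 0.05
def detect(img):
    return encode(img) @ W_det
```

Because all three heads backpropagate through the same `encode`, features learned for one task (say, segmentation) are available to the others, which is the mechanism behind the feature sharing described above.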

Clinical Validation and Benchmarking

The clinical applicability of UMedPT was validated using a diverse set of highly relevant tasks. These evaluations included both in-domain and out-of-domain benchmarks to measure the model’s ability to generalize and adapt to new tasks. The in-domain benchmark tested the model’s capability to recall and adapt previously learned skills to new yet related tasks. The tasks included colorectal cancer tissue classification, pneumonia diagnosis in pediatric chest X-rays, and nuclei detection in whole slide images, among others. The results demonstrated that UMedPT could match or exceed the performance of models pretrained on traditional datasets like ImageNet, using significantly fewer data.

The out-of-domain benchmark evaluated UMedPT’s performance on tasks distinctly different from the pretraining database, such as tuberculosis diagnosis from chest X-rays and central nervous system neoplasia diagnosis from MRI scans. The results highlighted the model’s versatility and robustness, affirming that UMedPT could establish new performance standards using limited training data. In external validation, the UMedPT model achieved outstanding results in classifying colorectal cancer histopathology images based on data from multiple clinical centers, demonstrating strong cross-center transferability.

Implications and Future Directions

The introduction of UMedPT underscores the potential of multi-task foundational models in overcoming data scarcity in biomedical imaging. By consolidating knowledge across various tasks and datasets, UMedPT exemplifies how foundational models can transform medical research and applications. Moreover, the ability to perform well with limited data makes such models particularly valuable in areas with inherent data constraints, such as rare disease research. Future directions could involve expanding the scale of pretraining, integrating self-supervised learning tasks, and refining fine-tuning strategies to enhance model generalizability further. Integrating these improvements could extend the applicability of foundational models like UMedPT, pushing the boundaries of what is possible in medical imaging and diagnostics.
