Volume 22 No 5 (2024)
Enhanced Brain Tumor Detection Using Transformer Model with Self-Supervised Learning for Multimodal Approach with Contrastive Techniques
Akkipalli Sowjanya, Amjan Shaik
Abstract
This paper provides an in-depth analysis of brain tumor detection using self-supervised learning (SSL) with contrastive loss, transformers, and a multimodal learning approach. To this end, we introduce a self-supervised Vision Transformer (ViT) pre-training architecture based on BYOL (Bootstrap Your Own Latent), which is then fine-tuned for brain tumor segmentation and classification on MRI datasets, including the popular benchmarks BraTS 2021, BraTS 2020, and TCGA-GBM. Our method achieves higher accuracy (91.4%) and Dice coefficient (0.91) than segmentation with a traditional U-Net model, as in [32] (88.5% and 0.85). Other models are included in the comparison as well, such as the 3D U-Net + Transformer of Zhou et al. (2022), relative to which accuracy increased from 90.3% to 93.7% and the Dice coefficient from 0.88 to 0.91. Additionally, multimodal integration (combining MRI with CT scans) resulted in an accuracy of 91.5% and a Dice coefficient of 0.90, demonstrating the benefit of using multiple imaging modalities for tumor detection. Our model also demonstrates superior generalization and segmentation compared with the studies of Isensee et al. and with earlier work from Long et al. (2021) that employed nnU-Net and reported an accuracy of 89.7% with a Dice coefficient of 0.88. Through this study, we propose a new transformer-based architecture combined with self-supervised learning for medical imaging, which achieves state-of-the-art performance and generalizes strongly to brain tumor detection across various multimodal datasets.
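For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal sketch in plain Python of the Dice coefficient used to score segmentations and of the regression loss and target-network update described in the original BYOL formulation. It is not the authors' implementation: the abstract does not specify their exact loss, mask format, or momentum value, so the function names, the flat 0/1 mask representation, the empty-mask convention, and the default `tau=0.99` are all assumptions made for illustration.

```python
import math


def dice_coefficient(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


def byol_loss(online_prediction, target_projection):
    """BYOL regression loss between L2-normalised vectors: 2 - 2 * cosine similarity."""
    dot = sum(q * z for q, z in zip(online_prediction, target_projection))
    norm_q = math.sqrt(sum(q * q for q in online_prediction))
    norm_z = math.sqrt(sum(z * z for z in target_projection))
    if norm_q == 0.0 or norm_z == 0.0:
        raise ValueError("vectors must be non-zero")
    return 2.0 - 2.0 * dot / (norm_q * norm_z)


def ema_update(target_params, online_params, tau=0.99):
    """Momentum update of the target network: xi <- tau * xi + (1 - tau) * theta."""
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # One of two predicted foreground pixels overlaps the single true pixel.
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    # Identical directions give the minimum loss; orthogonal ones give 2.
    print(byol_loss([1.0, 0.0], [2.0, 0.0]), byol_loss([1.0, 0.0], [0.0, 1.0]))
```

The loss ranges from 0 (aligned vectors) to 4 (opposite vectors) and, unlike contrastive objectives that use negative pairs, depends only on the two views of the same image.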
Keywords
Transformer (ViT), U-Net, Self-Supervised Learning (SSL), Multimodal Imaging (MRI + CT), Medical Image Segmentation, Dice Coefficient, Deep Learning in Healthcare, Contrastive Learning, BYOL (Bootstrap Your Own Latent), Brain Tumor Detection
Copyright
Copyright © Neuroquantology
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Articles published in Neuroquantology are available under the Creative Commons Attribution Non-Commercial No Derivatives Licence (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE the right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.