Dissertation and Thesis

Permanent URI for this community: http://164.52.219.250:4000/handle/10263/2146

Search Results

  • Item
    Enhancing Medical Image Analysis through Deep Learning:
    (Indian Statistical Institute, Kolkata, 2025-05) Ghosh, Susmita
    Medical image analysis has become indispensable for accurate diagnosis and treatment planning. However, despite advances in deep learning, several critical challenges persist, ranging from the need for more efficient models to the integration of multiple tasks within a unified framework. This thesis addresses these challenges by proposing innovative deep learning architectures that enhance medical image classification, segmentation, and multitask learning. At the heart of this research is the goal of developing models that deliver high performance and tackle the nuanced complexities of medical data. Existing classification models often overlook valuable information hidden in the spectral domain of images. I address this by integrating spatial and spectral features, demonstrating their complementary power to detect diseases such as COVID-19 from chest radiographs. This approach facilitates a more holistic understanding of medical images, improving the accuracy and reliability of diagnostic systems. To further enhance image classification, I explore hybrid architectures that combine convolutional and transformer-based models. These models leverage the strengths of both architectures, capturing fine-grained visual details and long-range dependencies. This significantly improves performance on various medical imaging datasets, offering deeper interpretability and superior classification accuracy, particularly in complex diagnostic scenarios. Moving beyond classification, I tackle the fundamental challenge of segmenting complex and irregular regions within medical images, where traditional deep learning models often struggle. To overcome this, I introduce a novel segmentation framework that combines the power of deep neural networks with trainable morphological operations. This leads to more precise delineation of regions of interest, even in challenging clinical scenarios, setting a new benchmark for medical image segmentation.
One of the most pressing issues in medical imaging is the inefficiency of current multitask learning models, which often require vast computational resources and struggle to generalize across different tasks. I present a lightweight multitask learning framework that excels at both segmentation and classification, particularly in breast tumor analysis. Using novel morphological attention mechanisms and the sharing of task-specific knowledge, the proposed model significantly reduces computational complexity while improving performance. Importantly, this framework demonstrates versatility across various medical imaging domains, from gland segmentation and malignancy detection in histology images to skin lesion analysis, confirming its robustness and applicability in real-world settings. Altogether, this thesis offers solutions to some of the most pressing problems in medical image analysis, providing models that are not only more accurate but also computationally efficient, making them suitable for deployment in clinical practice.
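The spatial–spectral fusion described in the abstract above can be sketched in a few lines. This is a minimal, hypothetical illustration using hand-crafted descriptors in place of the thesis's learned deep features; all function names and parameters here are invented for the example.

```python
import numpy as np

def spectral_features(img: np.ndarray, k: int = 8) -> np.ndarray:
    """Log-magnitude of the low-frequency 2-D FFT coefficients."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    h, w = spec.shape
    centre = spec[h // 2 - k : h // 2 + k, w // 2 - k : w // 2 + k]
    return np.log1p(np.abs(centre)).ravel()

def spatial_features(img: np.ndarray, grid: int = 4) -> np.ndarray:
    """Mean intensity over a coarse grid -- a stand-in for CNN features."""
    h, w = img.shape
    cells = img[: h - h % grid, : w - w % grid]
    cells = cells.reshape(grid, h // grid, grid, w // grid)
    return cells.mean(axis=(1, 3)).ravel()

def fused_descriptor(img: np.ndarray) -> np.ndarray:
    # Concatenate the two complementary views before classification.
    return np.concatenate([spatial_features(img), spectral_features(img)])

x = np.random.default_rng(0).random((64, 64))
print(fused_descriptor(x).shape)  # → (272,)  (16 spatial + 256 spectral)
```

A downstream classifier would then consume the concatenated descriptor, letting spatial and frequency-domain cues complement each other.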
  • Item
    2.5D Dual-Encoder U-Net for Lesion Segmentation in Chest CT Scans
    (Indian Statistical Institute, Kolkata, 2025-06) Mukkara, Jagannath
    Accurate segmentation of lesions in chest CT scans plays a vital role in diagnosing and monitoring pulmonary diseases such as COVID-19. In this work, we introduce a novel 2.5D dual-encoder U-Net model that utilizes both the central slice and its neighboring slices to improve segmentation accuracy while keeping computational demands manageable. Our model incorporates residual connections and feature fusion to effectively merge multi-slice contextual information, overcoming the limitations found in traditional 2D and 3D methods. To ensure a reliable evaluation and avoid data leakage, we used patient-level data splitting. We validate our approach on a carefully curated chest CT dataset, showing enhanced segmentation performance and better generalization compared to standard U-Net models. Through extensive experiments, including ablation studies and visualizations, we demonstrate the advantages of combining 2.5D learning with a dual-encoder architecture for medical image segmentation tasks.
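The 2.5D input construction the abstract describes (a central slice fed together with its neighbours) can be sketched as follows. This is an illustrative assumption about how such inputs are commonly assembled — neighbouring slices stacked along the channel axis, with edge slices clamped — not the authors' exact pipeline.

```python
import numpy as np

def make_25d_input(volume: np.ndarray, idx: int, context: int = 1) -> np.ndarray:
    """Stack a CT slice with its neighbours along the channel axis.

    volume: (depth, H, W) array; edge slices are padded by repetition.
    Returns an array of shape (2*context + 1, H, W).
    """
    depth = volume.shape[0]
    # Clamp out-of-range neighbour indices to the volume boundary.
    picks = [min(max(idx + d, 0), depth - 1) for d in range(-context, context + 1)]
    return volume[picks]

vol = np.arange(5 * 4 * 4, dtype=float).reshape(5, 4, 4)
x = make_25d_input(vol, idx=0)   # edge slice: missing neighbour is clamped
print(x.shape)  # → (3, 4, 4)
```

Each such stack would then be one training sample; keeping all slices of a patient on the same side of the train/test split is what the abstract's patient-level splitting refers to.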
  • Item
    On Automated Analysis of Lung Images with Deep Learning for Healthcare
    (Indian Statistical Institute, Kolkata, 2025-06) Pal, Surochita
    Automated detection and diagnosis of lung diseases through medical image analysis offers a noninvasive alternative to invasive procedures, especially considering the challenges and potential risks associated with repeat lung operations. Noninvasive image-guided diagnostic techniques, such as lung imaging, have become essential in clinical practice. This thesis focuses on the development of a computer-aided system aimed at enhancing the classification, detection, and segmentation of lung diseases, specifically those caused by COVID-19 and lung tumors, leveraging advanced computational methods. Novel segmentation algorithms, such as EFMC and WDU-Net, are devised based on encoder-decoder architectures within deep convolution networks. These algorithms undergo rigorous validation against ground truth or manual segmentation by radiologists, ensuring their accuracy and reliability. The EFMC algorithm employs a selective focus mechanism with multi-resolution blocks, allowing for precise delineation of COVID-19-affected regions in lung CT scans. Its performance is validated through extensive comparison with expert annotations, demonstrating its effectiveness in capturing subtle abnormalities while accurately segmenting lung anomalies. Similarly, WDU-Net integrates weighted deformable convolution; the deformable convolution modules enhance its ability to capture irregular shapes and features in COVID-19 lesions and lung tumors. Validation against manual segmentation reveals its robustness and accuracy in segmenting COVID-19 lesions and lung tumors from CT images, thereby showcasing its potential for aiding clinical diagnosis and treatment planning. Next, automated classification of lung tumors is devised in the multi-modal PET-CT framework using the innovative DEMF model. The network leverages deep convolution networks, in conjunction with dimensionality reduction, to efficiently detect and classify lung abnormalities.
This demonstrates superior performance in lung cancer classification across multimodal images. Finally, the DGMC is developed to enhance the diagnosis and classification of diseases by co-learning from multimodal images. Utilizing a novel multihead classifier, the DGMC can efficiently distinguish between COVID-19, tumors, and healthy slices of the lung. The input signal encompasses CT scans along with EIT-processed CT scans, providing a multimodal flavour. It captures granular details of the infection while visualizing the activation regions. Together, these advancements represent significant progress in the automated analysis of lung diseases, providing valuable tools for early detection and diagnosis in clinical settings.
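The multihead, multimodal classification idea can be illustrated with a toy sketch: one linear head per modality (CT and EIT-processed CT) whose class probabilities are averaged into a three-way decision. The fusion rule and every name below are hypothetical stand-ins for the DGMC's actual learned architecture.

```python
import numpy as np

def softmax(z: np.ndarray) -> np.ndarray:
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
feat_dim, n_classes = 32, 3                 # COVID-19 / tumor / healthy
W_ct = rng.normal(size=(feat_dim, n_classes))    # head for CT features
W_eit = rng.normal(size=(feat_dim, n_classes))   # head for EIT-processed CT

def multihead_predict(f_ct: np.ndarray, f_eit: np.ndarray) -> np.ndarray:
    """Average the per-modality head probabilities (hypothetical fusion)."""
    p = 0.5 * (softmax(f_ct @ W_ct) + softmax(f_eit @ W_eit))
    return p.argmax(axis=-1)

x_ct = rng.normal(size=(4, feat_dim))       # 4 slices, CT features
x_eit = rng.normal(size=(4, feat_dim))      # same slices, EIT features
print(multihead_predict(x_ct, x_eit).shape)  # → (4,)
```

In the real system the per-modality features would come from a shared deep backbone rather than random projections, but the head-per-modality structure is the point being illustrated.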
  • Item
    Self-Supervised Learning and its Applications in Medical Image Analysis
    (Indian Statistical Institute, Kolkata, 2025-04) Manna, Siladittya
    Self-supervised learning (SSL) enables learning robust representations from unlabeled data; it consists of two stages: pretext and downstream. The representations learnt in the pretext task are transferred to the downstream task. Self-supervised learning has applications in various domains, such as computer vision, natural language processing, and speech and audio processing. In transfer learning scenarios, due to differences in the data distributions of the source and target data, the hierarchical co-adaptation of the representations is destroyed, and hence proper fine-tuning is required to achieve satisfactory performance. With self-supervised pre-training, it is possible to learn representations aligned with the target data distribution, thereby making it easier to fine-tune the parameters in the downstream task in the data-scarce medical image analysis domain. The primary objective of this thesis is to propose self-supervised learning frameworks that deal with specific challenges. Initially, jigsaw puzzle-solving strategy-based frameworks are devised, where a semi-parallel architecture is used to decouple the representations of patches of a slice from a magnetic resonance scan, to prevent learning of low-level signals and to learn context-invariant representations. The literature shows that contrastive learning tasks are better than context-based tasks at learning representations. Thus, we propose a novel binary contrastive learning framework based on classifying a pair as positive or negative. We also investigate the ability of self-supervised pre-training to boost the quality of transferable representations. To effectively control the uniformity-alignment trade-off, we re-formulate the binary contrastive framework from a variational perspective. We further improve this vanilla formulation by eliminating positive-positive repulsion and amplifying negative-negative repulsion.
The reformulated binary contrastive learning framework outperforms state-of-the-art contrastive and non-contrastive frameworks on benchmark datasets. Empirically, we observe that the temperature hyper-parameter plays a significant role in controlling the uniformity-alignment trade-off, consequently determining the downstream performance. Hence, we derive a form of the temperature function by solving a first-order differential equation obtained from the gradient of the InfoNCE loss with respect to the cosine similarity of a negative pair. This enables controlling the uniformity-alignment trade-off by computing an optimal temperature for each sample pair. From experimental evidence, we observe that the proposed temperature function improves the performance of a weak baseline framework to outperform state-of-the-art contrastive and non-contrastive frameworks. Finally, to maximise the transferability of representations, we propose a self-supervised few-shot segmentation pretext task to minimise the disparity between the pretext and downstream tasks. Using the Felzenszwalb-based segmentation method to generate pseudo-masks, we train a segmentation network that learns representations aligned with the downstream task of one-shot segmentation. We propose a correlation-weighted prototype aggregation step to incorporate contextual information efficiently. In the downstream task, we conduct inference without fine-tuning, and the proposed self-supervised one-shot framework performs better than or on par with contemporary self-supervised segmentation frameworks. In conclusion, the proposed self-supervised learning frameworks offer significant improvements in representation learning and enhance performance on downstream medical image analysis tasks, as observed from the various experimental results of the thesis.
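The temperature's role in the InfoNCE loss, which the derivation above builds on, can be seen in a small numeric sketch. This shows the standard InfoNCE loss with a fixed temperature, not the thesis's per-pair temperature function; the toy vectors are invented for illustration.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau):
    """InfoNCE loss for one anchor; smaller tau sharpens the softmax
    over cosine similarities, changing the uniformity-alignment balance."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits = sims / tau                       # temperature-scaled similarities
    # Cross-entropy with the positive pair at index 0.
    return -logits[0] + np.log(np.exp(logits).sum())

a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])                      # well aligned with the anchor
negs = [np.array([-1.0, 0.2]), np.array([0.0, 1.0])]
# With a well-separated positive, a sharper (smaller) temperature yields
# a lower loss; the gradient w.r.t. each negative's similarity also changes
# with tau, which is the quantity the thesis's differential equation uses.
print(info_nce(a, p, negs, tau=0.5) > info_nce(a, p, negs, tau=0.1))  # → True
```

Making tau a function of the sample pair, as the thesis proposes, amounts to replacing the fixed scalar above with a per-pair value computed from these same similarities.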