Exploring Resource-Efficient Deep Learning for Medical Image Segmentation

Dutta, Pallabi

dc.contributor.author	Dutta, Pallabi
dc.date.accessioned	2026-05-20T05:40:21Z
dc.date.issued	2026-05-19
dc.description	This thesis has been completed under the supervision of Prof. Sushmita Mitra
dc.description.abstract	Automated medical image segmentation improves diagnostic accuracy by automating the precise delineation of target anatomical structures in the input images. Artificial Intelligence (AI), and specifically Deep Learning (DL), has emerged as a state-of-the-art approach for this task. However, the significant computational demands of DL approaches often hinder their deployment. Advanced models, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), require substantial processing power and a large memory footprint, limiting their use in resource-constrained settings. This thesis aims to address this challenge by developing a series of novel, resource-efficient DL models that achieve high segmentation accuracy with reduced computational costs. The research follows a logical progression of architectural novelty. First, global context-aware attention frameworks, FuDSA-Net and VoCANet, are introduced; they leverage multi-scalar features and global context-aware attention for efficient 2D/3D segmentation. The spatial and spectral domains are then integrated using a novel hybrid CNN-ViT framework, WaveCoformer, to learn robust representations of the target structure. The developed model achieves high segmentation accuracy with a lower parameter count. Subsequently, the research investigates a computationally efficient alternative to ViTs for segmentation, called Vision-xLSTM, by developing the U-VixLSTM model. This is extended to the Rot-UViL architecture, which is capable of modeling cross-dimensional dependencies in volumetric inputs with its novel rotational attention. Finally, the thesis presents a prompt-driven pruning framework for ViT-based segmentation models, called PrATo, which dynamically prunes irrelevant ViT tokens with a parameter-free, prompt-driven scoring mechanism. The framework achieves an approximately 35-55% reduction in processed tokens.
The frameworks developed in this thesis are validated across multiple publicly available datasets, demonstrating their high segmentation accuracy along with computational efficiency.
dc.identifier.citation	165p.
dc.identifier.uri	http://hdl.handle.net/10263/7685
dc.language.iso	en
dc.relation.ispartofseries	ISI Ph.D Thesis; TH685
dc.subject	Medical Image Segmentation
dc.subject	Efficient Deep Learning
dc.subject	Vision Transformers
dc.subject	CNN
dc.subject	Vision-xLSTM
dc.subject	Model compression
dc.title	Exploring Resource-Efficient Deep Learning for Medical Image Segmentation
dc.type	Thesis

Files

Original bundle

Name: Form-17 Pallabi Dutta.pdf
Size: 334.62 KB
Format: Adobe Portable Document Format

Name: Thesis-Pallabi Dutta.pdf
Size: 4.07 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

Theses