Dissertation and Thesis
Permanent URI for this community: http://164.52.219.250:4000/handle/10263/2146
4 results
Search Results
Item Enhancing Medical Image Analysis through Deep Learning: (Indian Statistical Institute, Kolkata, 2025-05) Ghosh, Susmita
Medical image analysis has become indispensable for accurate diagnosis and treatment planning. However, despite advances in deep learning, several critical challenges persist, ranging from the need for more efficient models to the integration of multiple tasks within a unified framework. This thesis addresses these challenges by proposing innovative deep learning architectures that enhance medical image classification, segmentation, and multitask learning. At the heart of this research is the goal of developing models that deliver high performance and tackle the nuanced complexities of medical data. Existing classification models often overlook valuable information hidden in the spectral domain of images. I address this by integrating spatial and spectral features, demonstrating their complementary power to detect diseases such as COVID-19 from chest radiographs. This approach facilitates a more holistic understanding of medical images, improving the accuracy and reliability of diagnostic systems. To further enhance image classification, I explore hybrid architectures that combine convolutional and transformer-based models. These models leverage the strengths of both architectures, capturing fine-grained visual details and long-range dependencies. This significantly improves results on various medical imaging datasets, offering deeper interpretability and superior classification accuracy, particularly in complex diagnostic scenarios. Moving beyond classification, I tackle the fundamental challenge of segmenting complex and irregular regions within medical images, where traditional deep learning models often struggle. To overcome this, I introduce a novel segmentation framework that combines the power of deep neural networks with trainable morphological operations.
This leads to a more precise delineation of regions of interest, even in challenging clinical scenarios, setting a new benchmark for medical image segmentation. One of the most pressing issues in medical imaging is the inefficiency of current multitask learning models, which often require vast computational resources and struggle to generalize across different tasks. I present a lightweight multitask learning framework that excels at both segmentation and classification, particularly in breast tumor analysis. Using novel morphological attention mechanisms and the sharing of task-specific knowledge, the proposed model significantly reduces computational complexity while improving performance. Importantly, this framework demonstrates versatility across various medical imaging domains, from gland segmentation and malignancy detection in histology images to skin lesion analysis, demonstrating its robustness and applicability in real-world settings. Altogether, this thesis offers solutions to some of the most pressing problems in medical image analysis, providing models that are not only more accurate but also computationally efficient, making them suitable for deployment in clinical practice.

Item Understanding Batch-Normalization in Deep Neural Networks (Indian Statistical Institute, Kolkata, 2025-06) Srujan, Pendyala Sai
Batch Normalization (BN) is a commonly used technique in various deep learning architectures for tasks such as image classification and object detection. It stabilizes and accelerates training by normalizing the activations of intermediate layers using the mean and variance of the batch, allowing the use of higher learning rates and often improving generalization through implicit regularization. During inference, BN uses running estimates of batch statistics accumulated during training.
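The running-estimate mechanism described above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the toy batches, and the weighting convention (new estimate = (1 - momentum) × old estimate + momentum × batch statistic, with the biased batch variance) are assumptions chosen for illustration, and real frameworks may differ in detail.

```python
from statistics import fmean, pvariance


def update_running_stats(running_mean, running_var, batch, momentum=0.1):
    """One exponential-moving-average update for a single feature.

    Assumed convention: `momentum` weights the current batch statistic.
    The biased (population) variance is used here for simplicity.
    """
    batch_mean = fmean(batch)
    batch_var = pvariance(batch, mu=batch_mean)
    new_mean = (1.0 - momentum) * running_mean + momentum * batch_mean
    new_var = (1.0 - momentum) * running_var + momentum * batch_var
    return new_mean, new_var


def normalize(x, mean, var, eps=1e-5):
    """Normalize one activation with the given statistics (no scale or shift)."""
    return (x - mean) / (var + eps) ** 0.5


# Hypothetical stream of training batches for one feature.
batches = [[1.0, 3.0], [2.0, 4.0], [0.0, 2.0]]

running_mean, running_var = 0.0, 1.0  # common initial values
for batch in batches:
    running_mean, running_var = update_running_stats(
        running_mean, running_var, batch, momentum=0.1
    )

# At inference time the accumulated running statistics replace batch statistics.
print(running_mean, running_var, normalize(2.5, running_mean, running_var))
```

With a small momentum the running estimates move slowly away from their initial values, so after only a few batches they can still differ noticeably from the statistics of the data actually seen.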
However, if individual batches are not representative of the overall data distribution, these accumulated statistics may not accurately approximate the population statistics. This discrepancy can lead to a phenomenon known as **estimation shift**, which impairs the model's generalization performance. In this project, we study the behavior of estimation shift in deep learning models using BN and explore techniques to mitigate its effects. Specifically, we introduce **dynamicity** in the momentum parameter of the BN layer (DMBN) while computing exponential moving averages and evaluate its impact under various architectural configurations. We use the MNIST, FashionMNIST, and CIFAR-10/100 datasets to train and test both simple Deep Neural Networks (DNNs) and deeper Convolutional Neural Networks (CNNs) such as ResNet-50. Our experiments are conducted in two phases: first, by varying the static momentum parameter across different values, and second, by introducing layer-wise dynamic momentum where each layer is assigned the momentum (or equivalently, β) that minimizes estimation shift. The performance of the proposed method, DMBN, is evaluated using metrics such as sensitivity, specificity, accuracy, and F-score. DMBN is compared with the existing BN-BFN method and is observed to perform better in most cases. For example, on the FashionMNIST data, the accuracy values achieved by DMBN and BN-BFN are 0.889 and 0.853, respectively.

Item Addressing class imbalance problems to improve animal detection through aerial image data (Indian Statistical Institute, Kolkata, 2025-06) Koushal, Suryang
Monitoring animal populations in wildlife reserves is essential for conservation, especially for endangered species, but manual censuses are costly, risky, and logistically challenging due to vast, inaccessible terrains.
Unmanned Aerial Vehicles (UAVs) with digital cameras provide a safer, scalable solution for collecting aerial imagery to estimate animal populations. However, semi-automated processing of these images faces significant challenges due to class imbalance in datasets. These include foreground-background disparities, where background terrain dominates over sparse animal instances, and inter-class imbalances from uneven species representation and varied visual appearances (e.g., species, sizes, fur patterns) against diverse backgrounds like deserts or forests. These imbalances hinder Convolutional Neural Networks (CNNs) used for object detection, leading to inaccurate population estimates. This project addresses these issues using a dataset of 561 aerial images from Tsavo National Parks (March 2014) and Laikipia-Samburu Ecosystem (May 2015), collected by the Kenya Wildlife Service. We propose a clustering-based approach to categorize background terrain into distinct classes (e.g., desert, grassland), aiming to mitigate imbalances and improve animal detection accuracy in UAV imagery, supporting reliable, data-driven conservation strategies.

Item An Attempt to Design a Neural Network Exploiting Biological Neurons (Indian Statistical Institute, Kolkata, 2021-07) Singh, Neeraj Kumar
A multilayer perceptron network is a very effective tool for both classification and regression problems, and it has been used successfully in many areas. In this era of artificial intelligence, deep neural networks such as Convolutional Neural Networks (CNNs) have been found to be extremely successful in solving many difficult problems, often surpassing human performance. Deep networks are often viewed as "all-cure" solutions. Unfortunately, most of these networks are of a "black-box" nature, and their functioning is usually not related to the way biological neural networks work. Some of these networks have millions of free parameters!
Moreover, training deep networks often demands a huge volume of training data. In this study we intend to incorporate knowledge of biological neurons in some of the layers of convolutional neural networks. In particular, we study the computational models of cells such as Lateral Geniculate Nucleus (LGN) cells and Retinal Ganglion Cells and use these models to extract features from images to be used as input. The intention is that, if such features help in improving performance, the computational models will be built into the deep neural network. We also hope that this will enable us to reduce the size of the network because, instead of blindly extracting features, it will try to mimic, to some extent, the way the brain extracts features. In this context, we first use the Combination of Receptive Fields (CORF) model, but our experiments do not exhibit the expected results. We then propose another CNN model that uses Difference of Gaussians (DoG) filters in some of the layers of the network, because the CORF model bases all of its computations on DoG. This has resulted in a noticeable improvement in performance with fewer trainable parameters than ResNet-18 (we use ResNet-18 as the base network due to limited computing resources).
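
As a rough illustration of the Difference of Gaussians (DoG) filter mentioned in the last abstract, the sketch below builds a fixed center-surround kernel and applies it to a small image. It is not the thesis's implementation: the kernel size, the two sigma values, the helper names, and the plain sliding-window filter are all assumptions chosen to keep the example self-contained.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a normalized size x size 2-D Gaussian kernel (size should be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def dog_kernel(size, sigma_center, sigma_surround):
    """Center-surround kernel: narrow Gaussian minus wide Gaussian."""
    center = gaussian_kernel(size, sigma_center)
    surround = gaussian_kernel(size, sigma_surround)
    return [[c - s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]


def filter_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' region only)."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [sum(image[r + i][c + j] * kernel[i][j]
             for i in range(k) for j in range(k))
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical parameters; the thesis does not state the values it uses.
dog = dog_kernel(size=5, sigma_center=1.0, sigma_surround=2.0)

# A vertical step edge: dark left half, bright right half.
step_image = [[0.0] * 4 + [1.0] * 4 for _ in range(5)]
print(filter_valid(step_image, dog))
```

Because both Gaussians are normalized to sum to one, the DoG kernel sums to approximately zero, so it gives no response on flat regions and responds only where intensity changes. A filter of this kind has no trainable weights, which is one way fixed, biologically motivated layers could reduce a network's parameter count.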
