Dissertation and Thesis

Permanent URI for this community: http://164.52.219.250:4000/handle/10263/2146

Search Results

Now showing 1 - 5 of 5
  • Item
    Domain Obedient Deep Learning
    (Indian Statistical Institute, Kolkata, 2025-07) Saha, Soumadeep
    Deep learning, a family of data-driven artificial intelligence techniques, has shown immense promise in a plethora of applications, and it has even outpaced experts in several domains. However, unlike symbolic approaches to learning, these methods fall short when it comes to abiding by and learning from pre-existing established principles. This is a significant deficit for deployment in critical applications such as robotics, medicine, industrial automation, etc. For a decision system to be considered for adoption in such fields, it must demonstrate the ability to adhere to specified constraints, an ability missing in deep learning-based approaches. Exploring this problem serves as the core tenet of this dissertation. This dissertation starts with an exploration of the abilities of conventional deep learning-based systems vis-à-vis domain coherence. A large-scale rule-annotated dataset is introduced to mitigate the pronounced lack of suitable constraint adherence evaluation benchmarks, and with its aid, the rule adherence abilities of vision systems are analyzed. Additionally, this study probes language models to elicit their performance characteristics with regard to domain consistency. Examination of these language models with interventions illustrates their ineptitude at obeying domain principles, and a mitigation strategy is proposed. This is followed by an exploration of techniques for imbuing deep learning systems with domain constraint information. Also, a comprehensive study of standard evaluation metrics and their blind spots pertaining to domain-aware performance estimation is undertaken. Finally, a novel technique to enforce constraint compliance in models without training is introduced, which pairs a search strategy with large language models to achieve cutting-edge performance. A key highlight of this dissertation is the emphasis on addressing pertinent real-world problems with scalable and practicable solutions. 
We hope the results presented here pave the way for wider adoption of deep learning-based systems in pivotal situations with enhanced confidence in their trustworthiness.
  • Item
    Texture Classification through Deep Residual Networks and Feature Interpretability
    (Indian Statistical Institute, Kolkata, 2025-06) Kumar, Ankit
    Texture classification plays a critical role in various real-world and industrial applications such as material recognition in manufacturing, medical image diagnostics, surface defect detection, and agricultural monitoring. The ability to distinguish textures reliably enables automation and enhances the precision of intelligent systems. Traditional methods like Local Binary Patterns (LBP), Gabor filters, and wavelet-based descriptors have been used extensively for texture analysis. While these techniques are effective under controlled conditions, they suffer from limited robustness to changes in illumination, scale, and viewpoint. Moreover, handcrafted features often fail to capture the intricate texture structures present in real-world surfaces. The KTH-TIPS2a dataset introduces several challenges, notably large intra-class variations due to changes in scale, illumination, and pose. Additionally, the dataset includes materials with complex and fine-grained textures, making it difficult to extract discriminative features using shallow or traditional models. Addressing these challenges requires models capable of learning invariant and hierarchical representations. Deep convolutional neural networks (CNNs), such as ResNet, provide a promising solution by automatically learning multi-scale, texture-rich features that are resilient to visual variability, thereby improving classification performance on such complex datasets.
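As an illustration of the handcrafted descriptors mentioned in the abstract above, the following is a minimal sketch of the basic 3x3 Local Binary Pattern operator. It is not taken from the thesis: the function names `lbp_code` and `lbp_histogram`, the clockwise neighbour ordering, and the nested-list image format are assumptions made for this example.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at (r, c).

    `img` is a grayscale image given as a nested list of numbers
    (an assumed format chosen to keep the sketch dependency-free).
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

The histogram is what a classical pipeline would typically pass to a classifier as the texture feature vector; a learned model such as ResNet replaces this fixed descriptor with features learned from data.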
  • Item
    Development of some Neural Network Models for Non-negative Matrix Factorization: Dimensionality Reduction
    (Indian Statistical Institute, Kolkata, 2025-01) Dutta, Prasun
    Recent research has been driven by the abundance of data, leading to the development of systems that enhance understanding across various fields. Effective machine learning algorithms are crucial for managing high-dimensional data, with dimension reduction being a key strategy to improve algorithm efficiency and decision-making. Non-negative Matrix Factorization (NMF) stands out as a method that transforms large datasets into interpretable, lower-dimensional forms by decomposing a matrix with non-negative elements into a pair of non-negative factors. This approach addresses the curse of dimensionality by reducing the dimension of the data while preserving meaningful information. Dimension reduction techniques rely on extracting high-quality features from large datasets. Machine learning algorithms offer a solution by learning and optimizing feature representations, which often outperform manually crafted ones. Artificial Neural Networks (ANNs) emulate human brain processing and excel in handling complex and nonlinear data relationships. Deep neural network models learn hierarchical patterns from data without explicit human intervention, making them ideal for large datasets. The traditional NMF technique employs block coordinate descent to update the factors of the input matrix, whereas we aim for simultaneous updates. Our research attempts to combine the strengths of NMF and neural networks to develop novel architectures that optimize low-dimensional data representation. In this thesis, we introduce five novel neural network architectures for NMF, accompanied by tailored objective functions and learning strategies to enhance the low-rank approximation of input matrices. First, n2MFn2, a model based on a shallow neural network architecture, has been developed. An approximation of the input matrix has been ensured by the formulation of an appropriate objective function and adaptive learning scheme. 
Activation functions and weight initialization strategies have also been adapted accordingly. On top of this shallow model, two deep neural network models, named DN3MF and MDSR-NMF, have been designed. To make the deep neural network framework robust, the models have been designed as a two-stage architecture, viz., pre-training and stacking. To find the closest realization of the conventional NMF technique as well as the closest approximation of the input, a novel neural network architecture has been proposed in MDSR-NMF. Finally, two deep learning models, named IG-MDSR-NMF and IG-MDSR-RNMF, have been developed to imitate the human-centric learning strategy while guaranteeing a distinct pair of factor matrices that yields a better approximation of the input matrix. In IG-MDSR-NMF and IG-MDSR-RNMF, the layers not only receive the hierarchically processed input from the previous layer but also refer to the original data whenever needed to ensure that the learning path is correct. A novel kind of non-negative matrix factorization technique known as Relaxed NMF has been developed for IG-MDSR-RNMF, in which only one factor matrix meets the non-negativity requirement while the other does not. This novel NMF technique allows the model to generate the best possible low-dimensional representation of the input matrix while removing the difficulty of maintaining a pair of non-negative factors.
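For readers unfamiliar with the conventional baseline that the abstract above contrasts with, the following is a minimal sketch of alternating NMF using the well-known multiplicative update rules, where one factor is updated while the other is held fixed. It does not reproduce any of the neural architectures proposed in the thesis. The helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`), the random initialization, and the iteration count are assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Frobenius norm of the residual X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2
               for rx, ry in zip(X, WH)
               for x, y in zip(rx, ry)) ** 0.5


def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    # Strictly positive random initialization (an assumed choice).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update H with W fixed: H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Update W with H fixed: W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H
```

Because every update multiplies non-negative quantities, both factors stay non-negative throughout. The two-step loop body is the alternating scheme; an approach that updates both factors simultaneously would replace it with a single joint step.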
  • Item
    A Lightweight Multi-Attention Deep Architecture for Liver Tumor Segmentation with Limited Samples
    (Indian Statistical Institute, Kolkata, 2024-06) Nag, Chandradipa
    Liver tumor segmentation from CT images is of paramount importance in medical image analysis. Accurate segmentation of liver tumors is crucial for effective diagnosis and treatment planning in hepatocellular carcinoma and other liver malignancies. Manual as well as traditional segmentation approaches often struggle with the complex and heterogeneous nature of liver tumors, necessitating advanced deep learning techniques. In this regard, the thesis introduces a supervised lightweight multi-attention deep architecture, termed LiMAU, for liver tumor segmentation. It judiciously integrates the merits of an enhanced U-Net architecture, known as U-Net3+, traditional attention gates, and the convolutional block attention module (CBAM). U-Net3+ is a refined version of the traditional U-Net design, enriching it with full-scale skip connections and deep supervision. The full-scale skip connections merge low-level details with high-level semantics from feature maps at different scales, while deep supervision learns hierarchical representations from the fully aggregated feature maps. This structure is particularly beneficial for organs appearing at varying scales. The incorporation of U-Net3+ in the proposed LiMAU reduces the number of network parameters, thereby enhancing computational efficiency. The integration of traditional attention gates allows the proposed supervised model to selectively focus on relevant regions, enhancing feature learning by suppressing irrelevant background noise. On the other hand, the CBAM, which sequentially applies channel and spatial attention, further refines this focus by enhancing the model’s ability to capture contextual and fine-grained details essential for precise tumor delineation. The proposed LiMAU features batch normalization layers in each double convolution block, which leads to higher segmentation accuracy. 
Next, the thesis introduces a deep framework for semi-supervised learning as a promising solution for liver tumor segmentation with limited labeled samples. The proposed LiMAU serves as the cornerstone of the proposed semi-supervised approach. It integrates a novel adversarial consistency learning architecture, which makes effective use of limited labeled data while providing high segmentation accuracy. The proposed semi-supervised framework harnesses both labeled and unlabeled data to mitigate the requirement for extensive annotated data. The proposed framework judiciously integrates deep adversarial networks and the Π model. The Π model is based on the concept of consistency learning, which maintains the consistency of segmentation output during training across various random perturbations of both labeled and unlabeled data. The deep adversarial network consists of a segmentation network (SN) and two evaluation networks (ENs). While the SN is used for the segmentation task, the ENs are used to assess segmentation quality. The proposed LiMAU is used as the SN, while a variant of VGG16 is used for both ENs. During training, the first EN is incentivized to differentiate between segmentations of annotated and unannotated images, the second one is encouraged to distinguish between perturbed and unperturbed data, while the SN is encouraged to produce segmentations for unlabeled images similar to those for annotated ones. The performance of the proposed supervised and semi-supervised models is evaluated on two benchmark data sets, namely, MICCAI 2017 Liver Tumor Segmentation Challenge (LiTS17) data and MICCAI-SLiver07 data, and compared with that of several state-of-the-art approaches. Experimental results demonstrate a significant improvement in segmentation accuracy over baseline models, with higher Dice similarity coefficients. 
This indicates that the combined use of traditional attention mechanisms and CBAM in the U-Net3+ architecture in supervised implementation as well as the semi-supervised adversarial network implementation significantly enhances the model’s ability to manage the variability and complexity of liver tumor morphology. These findings suggest that the proposed models hold great potential for clinical applications, offering improved precision in liver tumor segmentation.
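The Dice similarity coefficient that the abstract above reports can be computed as in the following minimal sketch, which uses the standard definition for binary masks. The function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are assumptions made for this example, not details from the thesis.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks.

    `pred` and `target` are same-shaped nested lists of 0/1 values
    (an assumed format chosen to keep the sketch dependency-free).
    """
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    intersection = sum(a * b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A value of 1.0 means the predicted tumor mask matches the annotation exactly, while 0.0 means the two masks do not overlap at all.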
  • Item
    Deep Learning for Classification of COVID-19 Chest CT Scan
    (Indian Statistical Institute, Kolkata, 2021-07) Ghosh, Subhadip
    The latest threat to global health is COVID-19. It spreads at a tremendous rate, and combating this pandemic requires large-scale testing and diagnosis. RT-PCR is the most accurate screening test for validating COVID-19 infection, but it is highly dependent on swab technique and needs time and resources. Thus, we need to find an alternative way to predict COVID-19. Many researchers have already concluded that COVID-19 is closely related to pneumonia and that the lung features of COVID-19 are related to those of pneumonia. There is ongoing research on detecting pneumonia [13] from chest CT scans. Lung segmentation can help us detect pulmonary abnormalities [10]. In this article, we first try to segment the lungs from chest CT scans and investigate the problems that COVID-19 cases pose for deep learning architectures for lung segmentation. We propose a classical image processing algorithm to detect the lungs in chest CT scans. Since CNNs are well suited to image classification, we use a deep CNN model for lung classification. COVID-19 is a new disease, and it must be detected quickly. Hence, we use a transfer learning approach and apply knowledge of pneumonia detection to classify COVID-19. In deep learning, weight initialization for deep neural networks is a major factor and can lead to very different performance. In this article, we propose a weight initialization technique for transfer learning that uses not only information about the architecture but also information about the new class with respect to other known classes.