Theses
Permanent URI for this collection: http://164.52.219.250:4000/handle/10263/2744
Item: Domain Obedient Deep Learning (Indian Statistical Institute, Kolkata, 2025-07). Saha, Soumadeep.

Deep learning, a family of data-driven artificial intelligence techniques, has shown immense promise in a plethora of applications and has even outpaced experts in several domains. However, unlike symbolic approaches to learning, these methods fall short when it comes to abiding by and learning from pre-existing established principles. This is a significant deficit for deployment in critical applications such as robotics, medicine, and industrial automation. For a decision system to be considered for adoption in such fields, it must demonstrate the ability to adhere to specified constraints, an ability missing in deep learning-based approaches. Exploring this problem is the core tenet of this dissertation.

This dissertation starts with an exploration of the abilities of conventional deep learning-based systems vis-à-vis domain coherence. A large-scale rule-annotated dataset is introduced to mitigate the pronounced lack of suitable constraint-adherence evaluation benchmarks, and with its aid, the rule-adherence abilities of vision systems are analyzed. Additionally, this study probes language models to elicit their performance characteristics with regard to domain consistency. Examination of these language models with interventions illustrates their ineptitude at obeying domain principles, and a mitigation strategy is proposed. This is followed by an exploration of techniques for imbuing deep learning systems with domain constraint information, together with a comprehensive study of standard evaluation metrics and their blind spots pertaining to domain-aware performance estimation. Finally, a novel technique to enforce constraint compliance in models without training is introduced, which pairs a search strategy with large language models to achieve cutting-edge performance.
A key highlight of this dissertation is its emphasis on addressing pertinent real-world problems with scalable and practicable solutions. We hope the results presented here pave the way for wider adoption of deep learning-based systems in pivotal situations, with enhanced confidence in their trustworthiness.

Item: Development of some Neural Network Models for Non-negative Matrix Factorization: Dimensionality Reduction (Indian Statistical Institute, Kolkata, 2025-01). Dutta, Prasun.

Recent research has been driven by the abundance of data, leading to the development of systems that enhance understanding across various fields. Effective machine learning algorithms are crucial for managing high-dimensional data, with dimension reduction being a key strategy for improving algorithm efficiency and decision-making. Non-negative Matrix Factorization (NMF) stands out as a method that transforms large datasets into interpretable, lower-dimensional forms by decomposing a matrix with non-negative elements into a pair of non-negative factors. This approach addresses the curse of dimensionality by reducing the dimensionality of the data while preserving meaningful information.

Dimension reduction techniques rely on extracting high-quality features from large datasets. Machine learning algorithms offer a solution by learning and optimizing feature representations, which often outperform manually crafted ones. Artificial Neural Networks (ANNs) emulate human brain processing and excel at handling complex and nonlinear data relationships. Deep neural network models learn hierarchical patterns from data without explicit human intervention, making them ideal for large datasets.

The traditional NMF technique employs block coordinate descent to update the input matrix factors, whereas we aim for simultaneous updates. Our research attempts to combine the strengths of NMF and neural networks to develop novel architectures that optimize low-dimensional data representation.
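For context, the factorization described in this abstract decomposes a non-negative matrix X into non-negative factors W and H with X ≈ WH. The classical block-coordinate approach the abstract contrasts against is typified by the Lee-Seung multiplicative updates; a minimal NumPy sketch of that baseline (not the thesis's neural formulation) might look like:

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=500, eps=1e-9):
    """Baseline NMF via Lee-Seung multiplicative updates (Frobenius loss).

    Each factor is updated in turn (block coordinate descent); the
    multiplicative form keeps both factors non-negative throughout.
    """
    rng = np.random.default_rng(0)
    W = rng.random((X.shape[0], rank))
    H = rng.random((rank, X.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # update H, preserves H >= 0
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # update W, preserves W >= 0
    return W, H

# Exactly low-rank non-negative data is recovered to high accuracy.
rng = np.random.default_rng(1)
X = rng.random((20, 3)) @ rng.random((3, 10))  # rank-3, non-negative
W, H = nmf_multiplicative(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

The alternating structure here is exactly what the thesis seeks to replace with neural architectures that update both factors simultaneously.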
In this thesis, we introduce five novel neural network architectures for NMF, accompanied by tailored objective functions and learning strategies that enhance the low-rank approximation of the input matrices.

First, n2MFn2, a model based on a shallow neural network architecture, has been developed. An approximation of the input matrix is ensured by the formulation of an appropriate objective function and an adaptive learning scheme. Activation functions and weight initialization strategies have also been adjusted to suit the circumstances. On top of this shallow model, two deep neural network models, named DN3MF and MDSR-NMF, have been designed. To achieve the robustness of the deep neural network framework, the models have been designed as a two-stage architecture, viz., pre-training and stacking. To find the closest realization of the conventional NMF technique as well as the closest approximation of the input, a novel neural network architecture has been proposed in MDSR-NMF. Finally, two deep learning models, named IG-MDSR-NMF and IG-MDSR-RNMF, have been developed to imitate a human-centric learning strategy while guaranteeing a distinct pair of factor matrices that yields a better approximation of the input matrix. In IG-MDSR-NMF and IG-MDSR-RNMF, the layers not only receive the hierarchically processed input from the previous layer but also refer to the original data whenever needed, ensuring that the learning path is correct. A novel kind of non-negative matrix factorization technique, known as Relaxed NMF, has been developed for IG-MDSR-RNMF, in which only one factor matrix meets the non-negativity requirement while the other does not. This novel NMF technique allows the model to generate the best possible low-dimensional representation of the input matrix, as the burden of maintaining a pair of non-negative factors is removed.
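The Relaxed NMF idea, in which only one factor is constrained to be non-negative, is in the same spirit as what the wider literature calls semi-NMF. A minimal illustrative sketch of that relaxed setting, using alternating least squares for the unconstrained factor and a projected gradient step for the non-negative one (a simple baseline, not the IG-MDSR-RNMF architecture):

```python
import numpy as np

def relaxed_nmf(X, rank, n_iter=300):
    """Sketch of a relaxed factorization: W is unconstrained (may contain
    negative entries), while H is kept non-negative.
    """
    rng = np.random.default_rng(0)
    H = rng.random((rank, X.shape[1]))
    for _ in range(n_iter):
        W = X @ np.linalg.pinv(H)          # exact least-squares solve for W
        G = W.T @ (W @ H - X)              # gradient of 0.5*||X - WH||^2 w.r.t. H
        step = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-9)  # 1/Lipschitz step
        H = np.maximum(H - step * G, 0.0)  # projected gradient: keep H >= 0
    return W, H

# Because W is unconstrained, X itself may contain negative entries.
rng = np.random.default_rng(1)
X = rng.standard_normal((15, 4)) @ rng.random((4, 12))  # mixed-sign, rank-4
W, H = relaxed_nmf(X, rank=4)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Dropping non-negativity on one factor enlarges the feasible set, which is why, as the abstract notes, the relaxed model can reach a better approximation of the input than a fully non-negative pair.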
