Dissertations - M Tech (CS)

Permanent URI for this collection: https://dspace.isical.ac.in/handle/10263/2147

These Dissertations were submitted in partial fulfilment of the requirements for the award of M TECH (Computer Science) Degree of Indian Statistical Institute

Texture Classification through Deep Residual Networks and Feature Interpretability
(Indian Statistical Institute, Kolkata, 2025-06) Kumar, Ankit
Texture classification plays a critical role in various real-world and industrial applications such as material recognition in manufacturing, medical image diagnostics, surface defect detection, and agricultural monitoring. The ability to distinguish textures reliably enables automation and enhances the precision of intelligent systems. Traditional methods like Local Binary Patterns (LBP), Gabor filters, and wavelet-based descriptors have been used extensively for texture analysis. While these techniques are effective under controlled conditions, they suffer from limited robustness to changes in illumination, scale, and viewpoint. Moreover, handcrafted features often fail to capture the intricate texture structures present in real-world surfaces. The KTH-TIPS2a dataset introduces several challenges, notably large intra-class variations due to changes in scale, illumination, and pose. Additionally, the dataset includes materials with complex and fine-grained textures, making it difficult to extract discriminative features using shallow or traditional models. Addressing these challenges requires models capable of learning invariant and hierarchical representations. Deep convolutional neural networks (CNNs), such as ResNet, provide a promising solution by automatically learning multi-scale, texture-rich features that are resilient to visual variability, thereby improving classification performance on such complex datasets.
A Lightweight Multi-Attention Deep Architecture for Liver Tumor Segmentation with Limited Samples
(Indian Statistical Institute, Kolkata, 2024-06) Nag, Chandradipa
Liver tumor segmentation from CT images is of paramount importance in medical image analysis. Accurate segmentation of liver tumors is crucial for effective diagnosis and treatment planning in hepatocellular carcinoma and other liver malignancies. Manual and traditional segmentation approaches often struggle with the complex and heterogeneous nature of liver tumors, necessitating advanced deep learning techniques. In this regard, the thesis introduces a supervised lightweight multi-attention deep architecture, termed LiMAU, for liver tumor segmentation. It integrates the merits of an enhanced U-Net architecture, known as U-Net3+, traditional attention gates, and the convolutional block attention module (CBAM). U-Net3+ refines the traditional U-Net design by adding full-scale skip connections and deep supervision. The full-scale skip connections merge low-level details with high-level semantics from feature maps at different scales, while deep supervision learns hierarchical representations from the fully aggregated feature maps. This structure is particularly beneficial for organs appearing at varying scales. The incorporation of U-Net3+ in the proposed LiMAU reduces the number of network parameters, thereby enhancing computational efficiency. The traditional attention gates allow the supervised model to focus selectively on relevant regions, enhancing feature learning by suppressing irrelevant background noise. The CBAM, which sequentially applies channel and spatial attention, further refines this focus by enhancing the model’s ability to capture the contextual and fine-grained details essential for precise tumor delineation. The proposed LiMAU also includes batch normalization layers in each double convolution block, which leads to higher segmentation accuracy.
Next, the thesis introduces a deep framework for semi-supervised learning as a promising solution for liver tumor segmentation with limited labeled samples. LiMAU serves as the cornerstone of this semi-supervised approach. The framework integrates a novel adversarial consistency learning architecture, which needs fewer labeled samples while still providing high segmentation accuracy. It harnesses both labeled and unlabeled data to reduce the need for extensive annotated data, and combines deep adversarial networks with the Π model. The Π model is based on the concept of consistency learning, which keeps the segmentation output consistent during training across various random perturbations of both labeled and unlabeled data. The deep adversarial network consists of a segmentation network (SN) and two evaluation networks (ENs). While the SN is used for the segmentation task, the ENs are used to assess segmentation quality. The proposed LiMAU is used as the SN, while a variant of VGG16 is used for both ENs. During training, the first EN is incentivized to differentiate between segmentations of annotated and unannotated images, the second EN is encouraged to distinguish between perturbed and unperturbed data, and the SN is encouraged to produce segmentations for unlabeled images similar to those for annotated ones. The performance of the proposed supervised and semi-supervised models is evaluated on two benchmark data sets, namely, MICCAI 2017 Liver Tumor Segmentation Challenge (LiTS17) data and MICCAI-SLiver07 data, and compared with that of several state-of-the-art approaches. Experimental results demonstrate a significant improvement in segmentation accuracy over baseline models, with higher Dice similarity coefficients.
This indicates that the combined use of traditional attention mechanisms and CBAM in the U-Net3+ architecture in supervised implementation as well as the semi-supervised adversarial network implementation significantly enhances the model’s ability to manage the variability and complexity of liver tumor morphology. These findings suggest that the proposed models hold great potential for clinical applications, offering improved precision in liver tumor segmentation.
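Two quantities named in this abstract, the Dice similarity coefficient and the consistency term behind the Π model, can be illustrated with a minimal sketch. It is not taken from the thesis: the function names `dice_coefficient` and `consistency_loss` are hypothetical, masks are assumed to be plain nested lists, the consistency term is assumed to be a mean squared difference between two predictions of the same image under different perturbations, and the exact loss used in the thesis may differ.

```python
def _flatten(grid):
    """Flatten a 2-D nested list into a single list of values."""
    return [value for row in grid for value in row]


def dice_coefficient(pred_mask, true_mask):
    """Dice similarity coefficient between two binary masks (values 0 or 1).

    Dice = 2 * |A intersect B| / (|A| + |B|), ranging from 0 (no overlap)
    to 1 (identical masks).
    """
    pred = _flatten(pred_mask)
    true = _flatten(true_mask)
    intersection = sum(p * t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def consistency_loss(probs_a, probs_b):
    """Mean squared difference between two per-pixel probability maps.

    Assumption: probs_a and probs_b are predictions for the same image
    under two different random perturbations; a small value means the
    model's output is stable under perturbation.
    """
    a = _flatten(probs_a)
    b = _flatten(probs_b)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_coefficient(predicted, reference))
    print(consistency_loss([[0.9, 0.1]], [[0.8, 0.2]]))
```

Because the consistency term needs no ground-truth mask, it can be computed on unlabeled images as well, which is what lets a semi-supervised framework draw on unannotated scans.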
Deep Learning for Classification of COVID-19 Chest CT Scan
(Indian Statistical Institute, Kolkata, 2021-07) Ghosh, Subhadip
COVID-19 is the latest threat to global health. It spreads extremely quickly, and combating the pandemic requires large-scale testing and diagnosis. RT-PCR is the most accurate screening method for confirming COVID-19 infection, but it depends heavily on swab technique and requires time and resources. An alternative way to predict COVID-19 is therefore needed. Many researchers have concluded that COVID-19 is closely related to pneumonia and that the lung features of COVID-19 resemble those of pneumonia. There is ongoing research on detecting pneumonia [13] from chest CT scans, and lung segmentation can help detect pulmonary abnormalities [10]. In this work, we first segment the lungs from chest CT scans and investigate the problems that COVID-19 cases pose for deep learning architectures for lung segmentation. We propose a classical image processing algorithm to detect the lungs in chest CT. Because CNNs are well suited to image classification, we use a deep CNN model for lung classification. COVID-19 is a new disease that must be detected quickly, so we adopt a transfer learning approach and use knowledge from pneumonia detection to classify COVID-19. Weight initialization is a major factor in deep neural networks and can lead to very different performance. We propose a weight initialization technique for transfer learning that uses not only information about the architecture but also information about the new class relative to the other known classes.
Recognition of Strokes in Tennis Videos Using Deep Learning
(Indian Statistical Institute, Kolkata, 2019-07) Singanporia, Kushal
Before neural networks were introduced to computer vision, action recognition required specific domain knowledge. Domain knowledge is still useful in action recognition, but with the availability of large amounts of data and neural networks, data-driven feature learning methods have emerged as an alternative. Recent work in action recognition uses LSTMs and their variants, since LSTMs can retain memory, a capability that other architectures lack. In this work we perform action recognition on different tennis strokes. Our work relies on the architecture proposed by Husain, Dellen, and Torras (2016), which comprises several modified VGG-nets connected in parallel. Because it does not include an LSTM, it differs from other works.
Sports Video Action Recognition
(Indian Statistical Institute, Kolkata, 2019-06-30) Datta, Santanu
From playing games to driving cars, deep learning has achieved great success in the recent past. In this dissertation, we apply deep learning to recognize actions in sports videos. We have implemented the state-of-the-art VGG3D model on several challenging state-of-the-art video datasets and report our findings.
