Dissertation and Thesis

Permanent URI for this community: http://164.52.219.250:4000/handle/10263/2146

Search Results

Now showing 1 - 2 of 2
  • Item
    Scene Text Detection and Recognition
    (Indian Statistical Institute, Kolkata, 2024-06) Dhara, Saurav
    Deep learning methods have significantly eased multi-oriented text detection in recent scene text detection work. However, conventional text representations such as horizontal boxes, rotated rectangles, and quadrangles make it difficult to handle curved text. To tackle this problem, we propose a novel approach that uses instance-aware segmentation to identify irregular scene text. Our method introduces an attention-guided semantic segmentation model designed to accurately label the weighted borders of text regions. Tests on multiple popular benchmarks show that, compared with state-of-the-art techniques, our method delivers better performance on curved text datasets and maintains comparable results on multi-oriented text datasets. At the same time, despite encouraging results in scene text detection, the complexity of the multi-stage pipelines used by current approaches sometimes causes them to fail in difficult settings. We offer a robust, simplified pipeline that uses a single neural network to predict words or text lines of varying quadrilateral shapes and orientations in full images, removing unnecessary intermediate steps. This simplicity makes it possible to concentrate on designing neural network architectures and loss functions. Our experiments on benchmark datasets show that the proposed approach substantially outperforms most state-of-the-art methods in both precision and efficiency.
  • Item
    A Novel Approach to Medical Image Segmentation with Convformer-Based Attention Mechanism and UNet
    (Indian Statistical Institute, Kolkata, 2024-06) Nandi, Swastik
    Accurate segmentation of medical images is a critical task in healthcare, aiding precise diagnosis and effective treatment planning. This project explores how integrating advanced attention mechanisms can enhance image segmentation models. Our primary objective is to compare various attention techniques in order to develop a lightweight yet highly accurate model suitable for real-time applications. Given the significant body of work in medical image segmentation, our approach seeks to balance accuracy with computational efficiency. By incorporating different attention mechanisms and rigorously evaluating their performance, we aim to identify the optimal strategy for improving segmentation outcomes. The results demonstrate the potential for improved segmentation accuracy and efficiency, highlighting the effectiveness of attention-based models in capturing intricate patterns and dependencies within medical imaging data. We found that the CNN-based attention mechanism, Convformer, effectively overcomes the issues related to the training conflict between CNNs and transformers. This project lays the groundwork for future advances in semi-supervised and weakly-supervised learning, and we plan to extend our model’s applicability to a broader range of medical imaging scenarios. Our ultimate objective is to contribute to the development of robust, efficient, and adaptable segmentation models that can enhance diagnostic accuracy and patient care in various medical fields.