Dissertations - M Tech (CS)

Permanent URI for this collection: http://164.52.219.250:4000/handle/10263/2147

These dissertations were submitted in partial fulfilment of the requirements for the award of the M Tech (Computer Science) degree of the Indian Statistical Institute.

Search Results

Now showing 1 - 3 of 3
  • Item
    Efficient Blending of Large Language Models
    (Indian Statistical Institute, Kolkata, 2025-06) Chatterjee, Sandeep
    Due to the limited capabilities of single Large Language Models (LLMs), multiple LLMs can be employed in tandem for better reliability of answers. Blending refers to combining the strengths of various LLMs to make use of their complementary capabilities for generating high-quality responses. It is a non-trivial problem, and the task becomes even more difficult when aiming for minimal latency and supervising the blending components. The standard framework, LLM-Blender, approaches this in three stages: response generation, candidate selection via ranking, and response fusion through summarization. However, this pipeline faces two critical limitations: high latency due to repeated ranking steps, and heavy reliance on external, supervised components, including a learned encoder for ranking and a separate sequence-to-sequence summarizer for fusion.
    In this thesis, we propose novel, efficient alternatives to overcome these challenges. This thesis comprises two works. First, we show that reducing the frequency of ranking within multi-turn conversations significantly improves latency with minimal degradation in output quality. Second, we introduce a peer-review-based response fusion mechanism, where LLMs collectively evaluate and revise each other's responses, removing the need for any externally trained rankers or summarizers. This collaborative method enables fully self-contained LLM blending without additional training or supervision.
    We assess our proposed methods on the task of Conversational Question Answering across five multi-turn conversational benchmarks (ConvQuestions, Atlas-Converse, CoQA, QuAC, and DoQA) using ten diverse, publicly available open-weight LLMs. Experimental results demonstrate that our peer-review-driven framework with reduced ranking achieves quality on par with existing approaches while being substantially more efficient. Our work presents a step toward scalable, modular LLM ensembling for real-world open-domain dialogue systems.
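The peer-review fusion stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: `peer_review_fuse` and the `models` callables (standing in for calls to open-weight LLMs) are hypothetical names, and the final-answer selection here is simply the first model's revised response.

```python
def peer_review_fuse(question, models, rounds=1):
    """Blend LLM responses via peer review: every model drafts an
    answer, then each model revises its own draft after reading its
    peers' drafts. No external ranker or summarizer is involved.

    `models` maps a model name to a callable prompt -> response,
    a stand-in for an actual LLM call."""
    # Stage 1: independent drafts from every model.
    answers = {name: model(question) for name, model in models.items()}
    # Stage 2: peer review and revision, repeated for `rounds` rounds.
    for _ in range(rounds):
        revised = {}
        for name, model in models.items():
            peers = "\n".join(a for n, a in answers.items() if n != name)
            prompt = (
                f"Question: {question}\n"
                f"Peer answers:\n{peers}\n"
                f"Your previous answer: {answers[name]}\n"
                "Revise your answer in light of the peer answers."
            )
            revised[name] = model(prompt)
        answers = revised
    # Final answer: for this sketch, the first model's revised response.
    return next(iter(answers.values()))
```

Because every stage is an LLM call on plain prompts, the loop needs no trained encoder or sequence-to-sequence summarizer, which is the self-contained property the abstract claims.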
  • Item
    Explanation and Judgement of IR Ranking using LLM
    (Indian Statistical Institute, Kolkata, 2024-06) Mondal, Santanu
    Pretrained transformer models such as BERT and T5 have significantly advanced the performance of information retrieval (IR) systems when fine-tuned with large-scale labeled datasets. However, their effectiveness diminishes notably in low-resource scenarios where annotated query-passage pairs are limited. This thesis explores an alternative supervision strategy by leveraging natural language explanations to enhance training signals during fine-tuning. We propose a novel methodology that augments traditional relevance labels with textual explanations generated by a large language model (LLM) using few-shot prompting. To achieve this, we generate explanations for 30,000 query-passage-label triples from the MS MARCO dataset using the open-source model google/gemma-2b, allowing for cost-free and scalable inference. These augmented samples are then used to fine-tune a T5-base sequence-to-sequence model, with the objective of producing both the relevance label and an accompanying explanation. During inference, the model predicts the label token, and the probability of that token is used as a soft relevance score, enabling efficient ranking. Empirical results demonstrate that our explanation-augmented retriever outperforms strong baselines, including BM25, a BERT reranker, and a T5 model trained with labels only. We further analyze the effectiveness of explanation order, training data size, and the quality of generated rationales. Our findings suggest that natural language explanations offer a powerful form of supervision, particularly valuable in data-scarce IR settings, and present a compelling direction for improving neural retrievers with minimal annotation overhead.
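The soft-scoring step described above (using the probability of the predicted label token as a relevance score) can be sketched as below. This is an assumption-laden illustration: `soft_relevance_rank` and `label_logprob` are hypothetical names, with `label_logprob` standing in for the fine-tuned T5 model's log-probability of emitting the positive relevance label for a (query, passage) pair.

```python
import math

def soft_relevance_rank(query, passages, label_logprob):
    """Rank passages by the probability the model assigns to the
    positive label token: higher probability means more relevant.

    `label_logprob(query, passage)` is a stub for querying the
    fine-tuned seq2seq model for the label token's log-probability."""
    scored = [(math.exp(label_logprob(query, p)), p) for p in passages]
    # Sort descending by soft relevance score.
    scored.sort(key=lambda t: t[0], reverse=True)
    return [p for _, p in scored]
```

Since only a single label token's probability is read off per passage, ranking costs one short forward pass per candidate, which is what makes the scheme efficient at inference time.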
  • Item
    Binary Document Filtering for Retrieval-Augmented Generation
    (Indian Statistical Institute, Kolkata, 2025-06) Saha, Sreyan
    Retrieval-Augmented Generation (RAG) has become a popular technique to enhance Large Language Models (LLMs) with access to external information sources. However, the success of RAG systems critically depends on the relevance and quality of the retrieved documents. In particular, supplying irrelevant or noisy context can lead to degraded downstream generation quality. To address this, our project focuses on improving the document filtering stage in a RAG pipeline through binary relevance classification: deciding whether a retrieved document is suitable to include in the final context window based on its usefulness in directly answering the user query. We explore a wide range of approaches to this task, including rule-based retrieval methods (TF-IDF, BM25), classical machine learning classifiers (logistic regression, SVM), deep neural networks, and LLM-based methods, both in zero-shot and few-shot settings. Our final pipeline leverages instruction-tuned LLMs to act as strict binary classifiers, with a focus on maximizing precision over recall, thereby ensuring that only the most relevant and high-quality documents are passed to the generation module. Experiments are conducted on a Reddit-based query-document dataset tailored to subjective and opinion-heavy queries. Our evaluations suggest that LLMs, even without fine-tuning, can outperform traditional methods in this setting, offering a strong foundation for further enhancement through supervised fine-tuning.
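The precision-over-recall filtering stage described above can be sketched as a thresholded binary filter. This is a hypothetical illustration, not the project's code: `filter_context` and `relevance_prob` are invented names, with `relevance_prob` standing in for an instruction-tuned LLM prompted to judge whether a document helps answer the query.

```python
def filter_context(query, documents, relevance_prob, threshold=0.9, max_docs=5):
    """Binary document filtering for RAG with a high confidence
    threshold, favouring precision over recall: a document enters the
    context window only when the classifier is highly confident it is
    relevant. Retrieval order is preserved.

    `relevance_prob(query, doc)` is a stub returning the classifier's
    probability that `doc` is relevant to `query`."""
    kept = [d for d in documents if relevance_prob(query, d) >= threshold]
    # Cap the context window size.
    return kept[:max_docs]
```

Raising `threshold` trades recall for precision, which matches the stated goal of keeping noisy context out of the generator even at the cost of occasionally dropping a useful document.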