Efficient Blending of Large Language Models
| dc.contributor.author | Chatterjee, Sandeep | |
| dc.date.accessioned | 2025-07-22T09:31:42Z | |
| dc.date.available | 2025-07-22T09:31:42Z | |
| dc.date.issued | 2025-06 | |
| dc.description | Dissertation under the supervision of Dr. Debapriyo Majumdar and Dr. Amit Chintamani Awekar | en_US |
| dc.description.abstract | Due to the limited capabilities of single Large Language Models (LLMs), multiple LLMs can be employed in tandem for better reliability of answers. Blending refers to combining the strengths of various LLMs to make use of their complementary capabilities for generating high-quality responses. It is a non-trivial problem, and the task becomes even more difficult when aiming for minimal latency and supervising the blending components. The standard framework, LLM-Blender, approaches this in three stages: response generation, candidate selection via ranking, and response fusion through summarization. However, this pipeline faces two critical limitations: high latency due to repeated ranking steps, and heavy reliance on external, supervised components, including a learned encoder for ranking and a separate sequence-to-sequence summarizer for fusion.
In this thesis, we propose novel, efficient alternatives to overcome these challenges. This thesis comprises two works. First, we show that reducing the frequency of ranking within multi-turn conversations significantly improves latency with minimal degradation in output quality. Second, we introduce a peer-review-based response fusion mechanism, where LLMs collectively evaluate and revise each other's responses, removing the need for any externally trained rankers or summarizers. This collaborative method enables fully self-contained LLM blending without additional training or supervision.
We assess our proposed methods on the task of Conversational Question Answering across five multi-turn conversational benchmarks (ConvQuestions, Atlas-Converse, CoQA, QuAC, and DoQA) using ten diverse, publicly available open-weight LLMs. Experimental results demonstrate that our peer-review-driven framework with reduced ranking achieves quality on par with existing approaches while being substantially more efficient. Our work presents a step toward scalable, modular LLM ensembling for real-world open-domain dialogue systems. | en_US |
| dc.identifier.citation | 52p. | en_US |
| dc.identifier.uri | http://hdl.handle.net/10263/7592 | |
| dc.language.iso | en | en_US |
| dc.publisher | Indian Statistical Institute, Kolkata | en_US |
| dc.relation.ispartofseries | MTech(CS) Dissertation;23-18 | |
| dc.subject | Large Language Models | en_US |
| dc.title | Efficient Blending of Large Language Models | en_US |
| dc.type | Other | en_US |
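The abstract describes a peer-review-based fusion mechanism in which the pooled LLMs score each other's candidate answers, replacing the trained ranker and summarizer. The sketch below is a minimal illustration of that idea only, not the thesis's actual implementation: the `models` and `reviewers` callables are hypothetical stand-ins for real LLM calls, and the length-based placeholder scorer exists solely to make the example runnable.

```python
# Sketch of peer-review-based LLM blending: each model produces a candidate
# answer, every other model scores it, and the candidate with the highest
# total peer score is returned. All model interfaces here are assumptions.

def peer_review_blend(question, models, reviewers=None):
    """Return the candidate answer preferred by the pool of reviewer models.

    models    -- list of callables: question -> answer string
    reviewers -- list of callables: (question, answer) -> numeric score;
                 defaults to a trivial length-based placeholder scorer.
    """
    candidates = [m(question) for m in models]
    if reviewers is None:
        # Placeholder: prefer shorter answers. A real system would instead
        # prompt each LLM to rate the quality of its peers' answers.
        reviewers = [lambda q, a: -len(a)] * len(models)
    totals = []
    for i, answer in enumerate(candidates):
        # Each answer is scored by every reviewer except its own author,
        # so a model cannot inflate its own candidate (self-preference bias).
        score = sum(r(question, answer)
                    for j, r in enumerate(reviewers) if j != i)
        totals.append(score)
    best = max(range(len(candidates)), key=lambda i: totals[i])
    return candidates[best]

# Toy usage with stub "models" standing in for real LLMs.
models = [
    lambda q: "Paris is the capital of France.",
    lambda q: "The capital city of the country France is Paris, in Europe.",
    lambda q: "Paris.",
]
answer = peer_review_blend("What is the capital of France?", models)
print(answer)  # shortest candidate wins under the placeholder scorer
```

Because the reviewers are just callables, the trained encoder-based ranker of LLM-Blender and the peer-review scorer proposed here plug into the same interface, which is what makes the blend self-contained: no component outside the model pool needs to be trained.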
Files
Original bundle (1 - 2 of 2)
- Name: Sandeep_Chatterjee_MTech_Thesis_ISI.pdf
  Size: 2.99 MB
  Format: Adobe Portable Document Format
  Description: Dissertations - M Tech (CS)
- Name: Sandeep_Chatterjee_MTech_Thesis-Plagiarism.pdf
  Size: 1.01 MB
  Format: Adobe Portable Document Format
  Description: Plagiarism_report
License bundle (1 - 1 of 1)
- Name: license.txt
  Size: 1.71 KB
  Description: Item-specific license agreed upon submission