BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression
Yuankai Li, Jia-Chen Gu, Di Wu, Kai-Wei Chang, and Nanyun Peng, in Findings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings), 2025.
Download the full text
Abstract
Retrieval-augmented generation (RAG) can supplement large language models (LLMs) by integrating external knowledge. However, as the number of retrieved documents increases, the input length to LLMs grows linearly, causing a dramatic increase in latency and a degradation in long-context understanding, especially for multi-hop questions that require reasoning across documents. We introduce BRIEF (Bridging Retrieval and Inference through Evidence Fusion), a lightweight approach that first compresses retrieved documents into dense, query-aware summaries and then feeds these summaries into in-context RAG. We create synthetic training data by extracting atomic propositions from source documents, enabling the compressor to learn multi-hop reasoning entirely with open-source tools. BRIEF produces far more concise summaries than prior methods and boosts open-domain QA: on HotpotQA, it doubles the compression rate over the state-of-the-art baseline while improving accuracy by 3.0% EM and 4.2% F1 with Flan-UL2 as the reader LM, and it even generates more concise summaries than the much larger proprietary GPT-3.5 while achieving nearly identical QA performance.
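For intuition, here is a minimal sketch (in Python) of the compress-then-read pipeline the abstract describes: retrieved documents are fused into one query-aware summary before the reader LM sees them. All names below (retriever, compressor, reader_lm and their methods) are hypothetical placeholders for illustration, not the authors' released interface.

# Hypothetical compress-then-read RAG loop in the spirit of BRIEF.
# Object interfaces are assumed, not taken from the paper's code release.
def answer_with_compression(question, retriever, compressor, reader_lm, k=10):
    """Retrieve k documents, fuse them into a dense query-aware summary,
    and pass only that summary (not the raw documents) to the reader LM."""
    documents = retriever.retrieve(question, top_k=k)

    # A lightweight compressor produces a short, proposition-level summary
    # conditioned on the question, so multi-hop evidence is fused up front.
    summary = compressor.compress(question=question, documents=documents)

    # In-context RAG: the reader sees a short summary instead of k full
    # documents, so prompt length (and latency) stays small as k grows.
    prompt = f"Context: {summary}\n\nQuestion: {question}\nAnswer:"
    return reader_lm.generate(prompt)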
Bib Entry
@inproceedings{li2025brief,
  author    = {Li, Yuankai and Gu, Jia-Chen and Wu, Di and Chang, Kai-Wei and Peng, Nanyun},
  title     = {BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression},
  booktitle = {Findings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings)},
  year      = {2025}
}