BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression
Yuankai Li, Jia-Chen Gu, Di Wu, Kai-Wei Chang, and Nanyun Peng, in Findings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings), 2025.
Download the full text
Abstract
Retrieval-augmented generation (RAG) can supplement large language models (LLMs) by integrating external knowledge. However, as the number of retrieved documents increases, the input length to LLMs grows linearly, causing a dramatic increase in latency and a degradation in long-context understanding, especially for multi-hop questions that require reasoning across documents. We introduce BRIEF (Bridging Retrieval and Inference through Evidence Fusion), a lightweight approach that first compresses retrieved documents into dense, query-aware summaries and then feeds these summaries into in-context RAG. We create synthetic training data by extracting atomic propositions from source documents, enabling the compressor to learn multi-hop reasoning entirely with open-source tools. BRIEF produces far more concise summaries than prior methods and boosts open-domain QA: on HotpotQA, it doubles the compression rate over the state-of-the-art baseline while improving accuracy by 3.0% EM and 4.2% F1 with Flan-UL2 as the reader LM, and it even generates more concise summaries than the much larger proprietary GPT-3.5 while achieving nearly identical QA performance.
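For intuition, here is a minimal sketch (in Python) of the compress-then-read pipeline the abstract describes: retrieved documents are fused into one query-aware summary before the reader LM sees them. All names below (retriever, compressor, reader_lm and their methods) are hypothetical placeholders for illustration, not the authors' released interface.

# Hypothetical compress-then-read RAG loop in the spirit of BRIEF.
# Object interfaces are assumed, not taken from the paper's code release.
def answer_with_compression(question, retriever, compressor, reader_lm, k=10):
    """Retrieve k documents, fuse them into a dense query-aware summary,
    and pass only that summary (not the raw documents) to the reader LM."""
    documents = retriever.retrieve(question, top_k=k)

    # A lightweight compressor produces a short, proposition-level summary
    # conditioned on the question, so multi-hop evidence is fused up front.
    summary = compressor.compress(question=question, documents=documents)

    # In-context RAG: the reader sees a short summary instead of k full
    # documents, so prompt length (and latency) stays small as k grows.
    prompt = f"Context: {summary}\n\nQuestion: {question}\nAnswer:"
    return reader_lm.generate(prompt)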
Bib Entry
@inproceedings{li2025brief,
  author    = {Li, Yuankai and Gu, Jia-Chen and Wu, Di and Chang, Kai-Wei and Peng, Nanyun},
  title     = {BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression},
  booktitle = {Findings of the 2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings)},
  year      = {2025}
}