Retrieval-augmented language models (RALMs) improve performance by accessing long-tail and up-to-date knowledge from external data stores, but are challenging to build. Existing approaches require either expensive retrieval-specific modifications to LM pre-training or post-hoc integration of the data store that leads to suboptimal performance. We introduce Retrieval-Augmented Dual Instruction Tuning (RA-DIT), a lightweight fine-tuning methodology that provides a third option by retrofitting any large language model (LLM) with retrieval capabilities. Our approach operates in two distinct fine-tuning steps: (1) one updates a pretrained LM to better use retrieved information, while (2) the other updates the retriever to return more relevant results, as preferred by the LM. By fine-tuning over tasks that require both knowledge utilization and contextual awareness, we demonstrate that each stage yields significant performance improvements, and using both leads to additional gains. Our best model, RA-DIT 65B, achieves state-of-the-art performance across a range of knowledge-intensive zero- and few-shot learning benchmarks, significantly outperforming existing in-context RALM approaches by up to +8.9% in the 0-shot setting and +1.4% in the 5-shot setting on average.
A few key points about RA-DIT:
- RA-DIT is a method for retrofitting pre-trained language models with retrieval capabilities using lightweight instruction tuning. It has two main steps:
1. Language model fine-tuning (LM-ft): The language model is fine-tuned on NLP tasks with retrieved passages prepended to the input prompt. This teaches the model to utilize relevant retrieved knowledge and to ignore distracting or irrelevant text (see the data-construction sketch after this list).
2. Retriever fine-tuning (R-ft): The neural retriever is fine-tuned using a generalized form of LM-Supervised Retrieval (LSR), so that it returns passages that improve the language model's likelihood of generating correct predictions (a minimal loss sketch follows the list).
- The proposed RA-DIT framework significantly outperforms prior work such as REPLUG in zero-shot and few-shot evaluations on knowledge-intensive benchmarks, including MMLU, Natural Questions (NQ), and TriviaQA (TQA).
- RA-DIT is also competitive with retrieval-augmented LMs such as ATLAS that require extensive retrieval-specific pre-training, demonstrating the effectiveness of the lightweight tuning approach.
- Analyses show that LM-ft and R-ft each offer significant gains individually, and combining them leads to further improvements. These benefits are consistent across LM sizes.
- Overall, RA-DIT provides an effective way to imbue any pre-trained LM with retrieval abilities and knowledge utilization skills using simple fine-tuning, without expensive architectural changes or pre-training.
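To make the LM-ft step concrete, here is a minimal sketch of how retrieval-augmented fine-tuning examples could be built. The `retriever.search` interface and the `Background:` prompt template are illustrative assumptions, not the paper's actual API or template.

```python
def build_lm_ft_examples(example, retriever, k=3):
    """Build retrieval-augmented fine-tuning instances for one
    (instruction, completion) pair.

    `retriever` is assumed to expose a `search(query, k)` method that
    returns the top-k text chunks; both the method name and the prompt
    template below are hypothetical.
    """
    chunks = retriever.search(example["instruction"], k=k)
    # One training instance per retrieved chunk, so the LM sees both
    # helpful and distracting contexts during fine-tuning.
    return [
        {
            "prompt": f"Background: {chunk}\n\n{example['instruction']}",
            "completion": example["completion"],
        }
        for chunk in chunks
    ]
```

Creating a separate instance per retrieved chunk is what exposes the model to unhelpful contexts, training it to fall back on its parametric knowledge when retrieval misses.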
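The R-ft step can likewise be sketched as a KL objective that pulls the retriever's distribution over the top-k chunks toward an LM-derived target, in the spirit of LSR. Below is a minimal PyTorch sketch for a single query; the temperature value and the use of log-likelihoods in the target are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def lsr_loss(retriever_scores, lm_log_likelihoods, tau=0.1):
    """LM-Supervised Retrieval loss for one query.

    retriever_scores:   (k,) similarity scores s(x, c_i); gradients
                        flow back into the retriever through these.
    lm_log_likelihoods: (k,) log p_LM(y | c_i, x) from the frozen LM.
    tau:                softmax temperature (illustrative value).
    """
    # Retriever distribution p_R over the retrieved chunks (in log space).
    log_p_r = F.log_softmax(retriever_scores / tau, dim=-1)
    # Target p_LSR: chunks under which the LM assigns higher likelihood
    # to the correct answer receive more probability mass.
    with torch.no_grad():
        p_lsr = F.softmax(lm_log_likelihoods / tau, dim=-1)
    # KL(p_LSR || p_R); minimizing it aligns the retriever with the
    # language model's preferences.
    return F.kl_div(log_p_r, p_lsr, reduction="sum")

# Usage: only the retriever receives gradients in this step.
scores = torch.randn(5, requires_grad=True)  # s(x, c_i)
lls = torch.randn(5)                         # log p_LM(y | c_i, x)
lsr_loss(scores, lls).backward()
```

Note that the language model stays frozen in this step: it only supplies the supervision signal, which is part of what keeps R-ft lightweight.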