Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.

🏆 Best Paper Nomination (2%)

Download the full text


Abstract

Model editing is a technique that updates large language models (LLMs) with new knowledge to alleviate hallucinations without resource-intensive retraining. This paper systematically analyzes the side effects of model editing methods and proposes a regularization method to address the underlying overfitting. Our experiments show that it is challenging for current editing methods to improve factuality while maintaining the general abilities of LLMs. We propose RECT (RElative Change in weighT), which significantly mitigates these side effects while retaining editing performance.
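As a rough illustration of the idea behind RECT (not the paper's exact algorithm), the sketch below regularizes an edit by keeping only the weight-update entries with the largest relative change and reverting the rest, which limits how far the edited weights drift from the original model. The function name, the `top_k_ratio` value, and the thresholding details are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of a RECT-style regularizer: keep only the entries of the
# edit update whose relative change |delta| / |w_orig| is largest, and revert
# all other entries to the original weights.
import torch

def rect_regularize(w_orig: torch.Tensor, w_edited: torch.Tensor,
                    top_k_ratio: float = 0.02) -> torch.Tensor:
    """Return edited weights where only the top-k% entries (by relative change)
    are kept; the remaining entries fall back to the original weights."""
    delta = w_edited - w_orig
    rel_change = delta.abs() / (w_orig.abs() + 1e-8)  # element-wise relative change
    k = max(1, int(top_k_ratio * rel_change.numel()))
    threshold = rel_change.flatten().topk(k).values.min()  # k-th largest relative change
    mask = rel_change >= threshold                          # keep only the largest edits
    return torch.where(mask, w_edited, w_orig)

# Usage (hypothetical): apply to a weight matrix produced by an editing method.
# w_new = rect_regularize(w_before_edit, w_after_edit)
```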


Bib Entry

@inproceedings{gu2024model,
  author = {Gu, Jia-Chen and Xu, Hao-Xiang and Ma, Jun-Yu and Lu, Pan and Ling, Zhen-Hua and Chang, Kai-Wei and Peng, Nanyun},
  title = {Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue},
  booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2024}
}

Related Publications