Are Large Language Models Capable of Generating Human-Level Narratives?

Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, and Nanyun Peng, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.

🏆 Outstanding Paper Award (<0.4%)


Abstract

This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression. We introduce a novel computational framework to analyze narratives through three discourse-level aspects: i) story arcs, ii) turning points, and iii) affective dimensions, including arousal and valence. By leveraging expert and automatic annotations, we uncover significant discrepancies between LLM- and human-written stories. While human-written stories are suspenseful, arousing, and diverse in narrative structures, LLM stories are homogeneously positive and lack tension. Next, we measure narrative reasoning skills as a precursor to generative capacities, concluding that most LLMs fall short of human abilities in discourse understanding. Finally, we show that explicit integration of the aforementioned discourse features can enhance storytelling, as demonstrated by over 40% improvement in neural storytelling in terms of diversity, suspense, and arousal.


Bib Entry

@inproceedings{tian2024are,
  author = {Tian, Yufei and Huang, Tenghao and Liu, Miri and Jiang, Derek and Spangher, Alexander and Chen, Muhao and May, Jonathan and Peng, Nanyun},
  title = {Are Large Language Models Capable of Generating Human-Level Narratives?},
  booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2024}
}

Related Publications