PhonologyBench: Evaluating Phonological Skills of Large Language Models
Ashima Suvarna, Harshita Khandelwal, and Nanyun Peng, in Workshop Towards Knowledgeable Language Models at The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
Abstract
Phonological competence, which covers grapheme-to-phoneme mapping, syllable counting, and rhyme generation, is under-explored in LLM research. PhonologyBench provides three English diagnostic tasks targeting these skills. Despite having no access to speech data, leading LLMs show promising performance, yet they still trail humans by 17% on rhyme generation and 45% on syllable counting. These results call for greater attention to phonology when deploying LLMs in speech-related applications.
Bib Entry
@inproceedings{suvarna2024phonologybench,
  author = {Suvarna, Ashima and Khandelwal, Harshita and Peng, Nanyun},
  title = {PhonologyBench: Evaluating Phonological Skills of Large Language Models},
  booktitle = {Workshop Towards Knowledgeable Language Models at The 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
  year = {2024}
}