My Google Scholar

Download the bibfile


Preprint

    2024

    1. Adaptable Logical Control for Large Language Models

      Honghua Zhang, Po-Nien Kung, Masahiro Yoshida, Guy Van den Broeck, and Nanyun Peng, in Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024.
      Abstract BibTeX Details
      Despite the success of Large Language Models (LLMs) in performing various tasks with provided instructions, controlling model generation during inference poses a persistent challenge. In this paper, we introduce Ctrl-G, an adaptable framework that facilitates tractable and flexible control over LLM generation. Ctrl-G can combine any production-ready LLM with a Hidden Markov Model (HMM), enabling output generation that adheres to logical constraints represented as deterministic finite automata (DFAs), including keyword control, length control, and insertion. Our study demonstrates that Ctrl-G, coupled with a TULU-2-7B model, outperforms GPT3.5 and GPT4 in human evaluations of interactive text editing by a 30% higher overall satisfaction rate, and exhibits high-quality generation with 100% constraint satisfaction. Additionally, our experiment on the Grade School Math (GSM) dataset highlights the potential of applying Ctrl-G beyond natural language generation (NLG) tasks. By guiding the reasoning process with logical constraints, we achieved a 3.4% improvement on the GSM subset, underscoring Ctrl-G’s broader applicability.
      @inproceedings{zhang2024adaptable,
        title = {Adaptable Logical Control for Large Language Models},
        author = {Zhang, Honghua and Kung, Po-Nien and Yoshida, Masahiro and Van den Broeck, Guy and Peng, Nanyun},
        year = {2024},
        booktitle = {Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)}
      }
      
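      A note on the entry above: a "keyword constraint represented as a DFA" is simply an automaton whose accepting state becomes reachable only once the keyword has appeared. The Python sketch below is purely illustrative and is not the Ctrl-G implementation (which pairs such automata with an HMM to steer generation, rather than merely checking finished text); the class name and the character-level alphabet are assumptions made for the example.

      from dataclasses import dataclass

      @dataclass(frozen=True)
      class KeywordDFA:
          """Tiny DFA that accepts a string iff it contains `keyword` (illustrative only)."""
          keyword: str

          def step(self, state: int, ch: str) -> int:
              # State i means the most recently read characters match keyword[:i].
              if state == len(self.keyword):
                  return state  # the accepting state is absorbing
              s = self.keyword[:state] + ch
              # Next state = length of the longest suffix of s that is a prefix of keyword.
              for k in range(min(len(s), len(self.keyword)), 0, -1):
                  if self.keyword[:k] == s[-k:]:
                      return k
              return 0

          def accepts(self, text: str) -> bool:
              state = 0
              for ch in text:
                  state = self.step(state, ch)
              return state == len(self.keyword)

      if __name__ == "__main__":
          dfa = KeywordDFA("moon")
          print(dfa.accepts("the moonlight was bright"))  # True
          print(dfa.accepts("the sun was bright"))        # False
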
    2. SafeWorld: Geo-Diverse Safety Alignment

      Da Yin, Haoyi Qiu, Kung-Hsiang Huang, Kai-Wei Chang, and Nanyun Peng, in Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024.
      Abstract BibTeX Details
      In the rapidly evolving field of Large Language Models (LLMs), ensuring safety is a crucial and widely discussed topic. However, existing works often overlook the geo-diversity of cultural and legal standards across the world. To reveal the challenges posed by geo-diverse safety standards, we introduce SafeWorld, a novel benchmark specifically designed to evaluate LLMs’ ability to generate responses that are not only helpful but also culturally sensitive and legally compliant across diverse global contexts. SafeWorld encompasses 2,775 test user queries, each grounded in high-quality, human-verified cultural norms and legal policies from 50 countries and 493 regions/races. On top of it, we propose a multi-dimensional automatic safety evaluation framework that assesses the contextual appropriateness, accuracy, and comprehensiveness of responses. Our evaluations reveal that current LLMs struggle to meet these criteria effectively. To enhance LLMs’ alignment with geo-diverse safety standards, we synthesize helpful preference pairs for Direct Preference Optimization (DPO) alignment. The preference pair construction aims to encourage LLMs to behave appropriately and provide precise references to relevant cultural norms and policies when necessary. Our trained SafeWorldLM outperforms all competing models, including GPT-4o on all the three evaluation dimensions by a large margin. Global human evaluators also note a nearly 20% higher winning rate in helpfulness and harmfulness evaluation.
      @inproceedings{yin2024safeworld,
        title = {SafeWorld: Geo-Diverse Safety Alignment},
        author = {Yin, Da and Qiu, Haoyi and Huang, Kung-Hsiang and Chang, Kai-Wei and Peng, Nanyun},
        year = {2024},
        booktitle = {Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)}
      }
      
    3. Matryoshka Query Transformer for Large Vision-Language Models

      Wenbo Hu, Zi-Yi Dou, Liunian Harold Li, Amita Kamath, Nanyun Peng, and Kai-Wei Chang, in Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024.
      Full Text Code Abstract BibTeX Details
      Large Vision-Language Models (LVLMs) typically encode an image into a fixed number of visual tokens (e.g., 576) and process these tokens with a language model. Despite their strong performance, LVLMs face challenges in adapting to varying computational constraints. This raises the question: can we achieve flexibility in the number of visual tokens to suit different tasks and computational resources? We answer this with an emphatic yes. Inspired by Matryoshka Representation Learning, we introduce the Matryoshka Query Transformer (MQT), capable of encoding an image into m visual tokens during inference, where m can be any number up to a predefined maximum. This is achieved by employing a query transformer with M latent query tokens to compress the visual embeddings. During each training step, we randomly select m ≤ M latent query tokens and train the model using only these first m tokens, discarding the rest. Combining MQT with LLaVA, we train a single model once, and flexibly and drastically reduce the number of inference-time visual tokens while maintaining similar or better performance compared to training independent models for each number of tokens. Our model, MQT-LLaVA, matches LLaVA-1.5 performance across 11 benchmarks using a maximum of 256 tokens instead of LLaVA’s fixed 576. Reducing to 16 tokens (8x fewer TFLOPs) only sacrifices performance by 2.4 points on MMBench. On certain tasks such as ScienceQA and MMMU, we can even go down to only 2 visual tokens with performance drops of just 3% and 6%, respectively. Our exploration of the trade-off between the accuracy and computational cost brought about by the number of visual tokens facilitates future research to achieve the best of both worlds.
      @inproceedings{hu2024mqt,
        title = {Matryoshka Query Transformer for Large Vision-Language Models},
        author = {Hu, Wenbo and Dou, Zi-Yi and Li, Liunian Harold and Kamath, Amita and Peng, Nanyun and Chang, Kai-Wei},
        year = {2024},
        booktitle = {Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS)},
        project_website = {https://gordonhu608.github.io/mqtllava/}
      }
      
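      To make the training trick described in the entry above concrete (keep only a random prefix of the M latent query tokens at each step, so the inference-time budget m can be chosen freely), here is a minimal PyTorch sketch. It is not the authors' implementation; the module name, dimensions, and the single cross-attention layer are assumptions made for illustration.

      from typing import Optional

      import torch
      import torch.nn as nn

      class MatryoshkaQueryPooler(nn.Module):
          """Toy sketch of Matryoshka-style query compression (not the MQT-LLaVA code)."""

          def __init__(self, max_queries: int = 256, dim: int = 768, num_heads: int = 8):
              super().__init__()
              # M learnable latent query tokens that cross-attend to image features.
              self.queries = nn.Parameter(torch.randn(max_queries, dim) * 0.02)
              self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

          def forward(self, image_feats: torch.Tensor, m: Optional[int] = None) -> torch.Tensor:
              # image_feats: (batch, num_patches, dim), e.g. 576 patch embeddings.
              if m is None:  # training: sample a prefix length m <= M
                  m = int(torch.randint(1, self.queries.size(0) + 1, (1,)).item())
              q = self.queries[:m].unsqueeze(0).expand(image_feats.size(0), -1, -1)
              visual_tokens, _ = self.cross_attn(q, image_feats, image_feats)
              return visual_tokens  # (batch, m, dim), handed to the language model

      if __name__ == "__main__":
          pooler = MatryoshkaQueryPooler()
          feats = torch.randn(2, 576, 768)
          print(pooler(feats, m=16).shape)  # torch.Size([2, 16, 768])
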
    4. DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation

      Xueqing Wu, Rui Zheng, Jingzhen Sha, Te-Lin Wu, Hanyu Zhou, Tang Mohan, Kai-Wei Chang, Nanyun Peng, and Haoran Huang, in Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track, 2024.
      Full Text Code Abstract BibTeX Details
      Data analysis is a crucial analytical process essential for deriving insights from real-world databases. As shown in Figure 1, the need for data analysis typically arises from specific application scenarios, and requires diverse reasoning skills including mathematical reasoning, logical reasoning, and strategic reasoning. Existing work often focuses on simple factual retrieval or arithmetic resolution and is thus insufficient for addressing complex real-world queries. This work aims to propose new resources and benchmarks on this crucial yet challenging and under-explored task. Due to the prohibitively high cost of collecting expert annotations, we use large language models (LLMs) enhanced by code generation to automatically generate high-quality data analysis, which will later be refined by human annotators. We construct the DACO dataset, containing (1) 440 databases (of tabular data) collected from real-world scenarios, (2) 2k automatically generated query-answer pairs that can serve as weak supervision for model training, and (3) a concentrated but high-quality test set with human-refined annotations that serves as our main evaluation benchmark. Experiments show that while LLMs like GPT-4 exhibit promising data analysis capabilities, they are still evaluated as less helpful than human-written analysis in 58.1% of cases. Leveraging our weak supervision data, we experiment with various fine-tuning methods, including supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Our trained model outperforms existing baselines for table question answering, and RLHF further boosts the helpfulness of generated analysis in 58.5% of cases.
      @inproceedings{wu2024daco,
        title = {DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation},
        author = {Wu, Xueqing and Zheng, Rui and Sha, Jingzhen and Wu, Te-Lin and Zhou, Hanyu and Mohan, Tang and Chang, Kai-Wei and Peng, Nanyun and Huang, Haoran},
        year = {2024},
        booktitle = {Proceedings of The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), Datasets and Benchmarks Track}
      }
      
    5. Measuring Psychological Depth in Language Models

      Fabrice Y. Harel-Canada, Hanyu Zhou, Sreya Muppalla, Zeynep Senahan Yildiz, Miryung Kim, Amit Sahai, and Nanyun Peng, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Full Text Code Abstract BibTeX Details 🏆 Outstanding Paper Award (<0.4%)
      Evaluations of creative stories generated by large language models (LLMs) often focus on objective properties of the text, such as its style, coherence, and diversity. While these metrics are indispensable, they do not speak to a story’s subjective, psychological impact from a reader’s perspective. We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM’s ability to produce authentic and narratively complex stories that provoke emotion, empathy, and engagement. We empirically validate our framework by showing that humans can consistently evaluate stories based on PDS (0.72 Krippendorff’s alpha). We also explore techniques for automating the PDS to easily scale future analyses. GPT-4o, combined with a novel Mixture-of-Personas (MoP) prompting strategy, achieves an average Spearman correlation of 0.51 with human judgment while Llama-3-70B with constrained decoding scores as high as 0.68 for empathy. Finally, we compared the depth of stories authored by both humans and LLMs. Surprisingly, GPT-4 stories either surpassed or were statistically indistinguishable from highly-rated human-written stories sourced from Reddit. By shifting the focus from text to reader, the Psychological Depth Scale is a validated, automated, and systematic means of measuring the capacity of LLMs to connect with humans through the stories they tell.
      @inproceedings{harel2024measuring,
        author = {Harel-Canada, Fabrice Y and Zhou, Hanyu and Muppalla, Sreya and Yildiz, Zeynep Senahan and Kim, Miryung and Sahai, Amit and Peng, Nanyun},
        title = {Measuring Psychological Depth in Language Models},
        booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    6. Are Large Language Models Capable of Generating Human-Level Narratives?

      Yufei Tian, Tenghao Huang, Miri Liu, Derek Jiang, Alexander Spangher, Muhao Chen, Jonathan May, and Nanyun Peng, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Full Text Code Abstract BibTeX Details 🏆 Outstanding Paper Award (<0.4%)
      This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression. We introduce a novel computational framework to analyze narratives through three discourse-level aspects: i) story arcs, ii) turning points, and iii) affective dimensions, including arousal and valence. By leveraging expert and automatic annotations, we uncover significant discrepancies between the LLM- and human-written stories. While human-written stories are suspenseful, arousing, and diverse in narrative structures, LLM stories are homogeneously positive and lack tension. Next, we measure narrative reasoning skills as a precursor to generative capacities, concluding that most LLMs fall short of human abilities in discourse understanding. Finally, we show that explicit integration of the aforementioned discourse features can enhance storytelling, as is demonstrated by over 40% improvement in neural storytelling in terms of diversity, suspense, and arousal.
      @inproceedings{tian2024are,
        author = {Tian, Yufei and Huang, Tenghao and Liu, Miri and Jiang, Derek and Spangher, Alexander and Chen, Muhao and May, Jonathan and Peng, Nanyun},
        title = {Are Large Language Models Capable of Generating Human-Level Narratives?},
        booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    7. Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs

      Alexander Spangher, Nanyun Peng, Sebastian Gehrmann, and Mark Dredze, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Abstract BibTeX Details 🏆 Outstanding Paper Award (<0.4%)
      Journalists engage in multiple steps in the news writing process that depend on human creativity, like exploring different “angles” (i.e., story directions). These can potentially be aided by large language models (LLMs). By affecting planning decisions, such interventions can have an outsize impact on creative output. We advocate a careful approach to evaluating these interventions, to ensure alignment with human values, by comparing LLM decisions to previous human decisions. In a case study of journalistic coverage of press releases, we assemble a large dataset of 250k press releases and 650k human-written articles covering them. We develop methods to identify news articles that challenge and contextualize press releases. Finally, we evaluate suggestions made by LLMs for these articles and compare these with decisions made by human journalists.
      @inproceedings{spangher2024llm_planning,
        author = {Spangher, Alexander and Peng, Nanyun and Gehrmann, Sebastian and Dredze, Mark},
        title = {Do LLMs Plan Like Human Writers? Comparing Journalist Coverage of Press Releases with LLMs},
        booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    8. Evaluating LLMs’ Capability in Satisfying Lexical Constraints

      Bingxuan Li, Yiwei Wang, Tao Meng, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Abstract BibTeX Details 🏆 Best Paper Nomination (2%)
      This paper analyzes the performance of LLMs in Lexical Constrained Generation (LCG) tasks, identifying key limitations and proposing the Divide and Conquer Generation strategy. Our approach significantly enhances LLMs’ success rate in satisfying lexical constraints across various tasks, providing insights into improving text generation applications.
      @inproceedings{li2024evaluating,
        title = {Evaluating LLMs' Capability in Satisfying Lexical Constraints},
        author = {Li, Bingxuan and Wang, Yiwei and Meng, Tao and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    9. Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

      Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, and Nanyun Peng, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Full Text Code Abstract BibTeX Details 🏆 Best Paper Nomination (2%)
      Model editing is a technique that edits large language models (LLMs) with updated knowledge to alleviate hallucinations without resource-intensive retraining. This paper systematically analyzes the side effects of model editing methods and proposes a regularization method to address the overfitting. Our experiments show that it is challenging for current editing methods to improve factuality while maintaining general abilities. We propose RECT (RElative Change in weighT) to mitigate side effects, showing significant performance retention.
      @inproceedings{gu2024model,
        author = {Gu, Jia-Chen and Xu, Hao-Xiang and Ma, Jun-Yu and Lu, Pan and Ling, Zhen-Hua and Chang, Kai-Wei and Peng, Nanyun},
        title = {Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue},
        booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    10. Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM

      Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, and Tagyoung Chung, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Abstract BibTeX Details 🏆 Best Paper Nomination (2%)
      Contrastive decoding (CD) improves the next-token distribution of a large expert language model (LM) using a small amateur LM. This paper theoretically explains why CD works well and introduces a new method, Asymptotic Probability Decoding (APD), to overcome its limitations. Experiments show that APD significantly boosts factuality in open-ended text generation and achieves new state-of-the-art results across multiple datasets.
      @inproceedings{chang2024contrastive,
        title = {Explaining and Improving Contrastive Decoding by Extrapolating the Probabilities of a Huge and Hypothetical LM},
        author = {Chang, Haw-Shiuan and Peng, Nanyun and Bansal, Mohit and Ramakrishna, Anil and Chung, Tagyoung},
        booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
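      For background on the setup analyzed in the entry above: standard contrastive decoding scores the next token by the expert LM's log-probability minus the amateur LM's, restricted to tokens the expert itself finds plausible. The sketch below implements only that baseline rule, not the paper's Asymptotic Probability Decoding; the alpha plausibility cutoff follows the original CD formulation and is an assumption here.

      import torch

      def contrastive_decoding_scores(expert_logits: torch.Tensor,
                                      amateur_logits: torch.Tensor,
                                      alpha: float = 0.1) -> torch.Tensor:
          """Vanilla contrastive decoding scores for one decoding step (baseline only)."""
          expert_logp = torch.log_softmax(expert_logits, dim=-1)
          amateur_logp = torch.log_softmax(amateur_logits, dim=-1)
          # Plausibility mask: discard tokens the expert itself considers unlikely.
          cutoff = torch.log(torch.tensor(alpha)) + expert_logp.max(dim=-1, keepdim=True).values
          scores = expert_logp - amateur_logp
          return scores.masked_fill(expert_logp < cutoff, float("-inf"))

      if __name__ == "__main__":
          vocab_size = 8
          expert = torch.randn(1, vocab_size)
          amateur = torch.randn(1, vocab_size)
          print(contrastive_decoding_scores(expert, amateur).argmax(dim=-1))
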
    11. Re-ReST: Reflection-Reinforced Self-Training for Language Agents

      Zi-Yi Dou, Cheng-Fu Yang, Xueqing Wu, Kai-Wei Chang, and Nanyun Peng, in Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Full Text Code Abstract BibTeX Details
      Finetuning language agents with reasoning-action trajectories is effective, but obtaining these trajectories from human annotations or stronger models is costly and sometimes impractical. In this paper, we investigate the use of self-training in language agents, which can generate supervision from the agent itself, offering a promising alternative without relying on human or stronger model demonstrations. Self-training, however, requires high-quality model-generated samples, which are hard to obtain for challenging language agent tasks. To address this, we present Reflection-Reinforced Self-Training (Re-ReST), which uses a reflector to refine low-quality generated samples during self-training. The reflector takes the agent’s output and feedback from an external environment to produce improved samples. We conduct extensive experiments on open-source language agents across tasks, demonstrating the effectiveness of self-training and Re-ReST in language agent tasks.
      @inproceedings{dou2024rerest,
        author = {Dou, Zi-Yi and Yang, Cheng-Fu and Wu, Xueqing and Chang, Kai-Wei and Peng, Nanyun},
        title = {Re-ReST: Reflection-Reinforced Self-Training for Language Agents},
        booktitle = {Proceedings of The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024},
        keywords = {agent}
      }
      
    12. SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness

      Tanmay Parekh, Jeffrey Kwan, Jiarui Yu, Sparsh Johri, Hyosang Ahn, Sreya Muppalla, Kai-Wei Chang, Wei Wang, and Nanyun Peng, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Code Abstract BibTeX Details
      We introduce SPEED++, the first multilingual Event Extraction framework for extracting epidemic-related information from social media. Our framework is capable of providing epidemic warnings in diverse languages and demonstrates the efficacy of zero-shot cross-lingual models trained on English data for extracting information relevant to various diseases.
      @inproceedings{parekh2024speed,
        title = {SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness},
        author = {Parekh, Tanmay and Kwan, Jeffrey and Yu, Jiarui and Johri, Sparsh and Ahn, Hyosang and Muppalla, Sreya and Chang, Kai-Wei and Wang, Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    13. Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

      Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, and Kai-Wei Chang, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024.
      Full Text Code Abstract BibTeX Details
      This paper proposes SynCheck, a lightweight monitor that detects unfaithful sentences in retrieval-augmented language models (RALMs). By integrating fine-grained decoding dynamics, SynCheck outperforms existing baselines in faithfulness detection. We also introduce FOD, a faithfulness-oriented decoding algorithm that significantly improves the faithfulness of long-form generation outputs.
      @inproceedings{wu2024synchronous,
        title = {Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation},
        author = {Wu, Di and Gu, Jia-Chen and Yin, Fan and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2024}
      }
      
    14. QUDSELECT: Selective Decoding for Questions Under Discussion Parsing

      Ashima Suvarna, Xiao Liu, Tanmay Parekh, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), short, 2024.
      Abstract BibTeX Details
      Question Under Discussion (QUD) is a discourse framework that uses implicit questions to reveal discourse relationships between sentences. In QUD parsing, each sentence is viewed as an answer to a question triggered by an anchor sentence in prior context. The resulting QUD structure is required to conform to several theoretical criteria, making QUD parsing a challenging task. We introduce QUDSELECT, a joint-training framework that selectively decodes the QUD dependency structures considering the QUD criteria. Our method outperforms state-of-the-art baseline models by 9% in human evaluation and 4% in automatic evaluation, demonstrating the effectiveness of our framework.
      @inproceedings{suvarna2024qudselect,
        title = {QUDSELECT: Selective Decoding for Questions Under Discussion Parsing},
        author = {Suvarna, Ashima and Liu, Xiao and Parekh, Tanmay and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), short},
        year = {2024}
      }
      
    15. Detecting Machine-Generated Long-Form Content with Latent-Space Variables

      Yufei Tian, Zeyu Pan, and Nanyun Peng, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Full Text Abstract BibTeX Details
      We propose a robust method to detect machine-generated long-form text by incorporating abstract elements as key deciding factors, leading to a 31% improvement over existing baselines.
      @inproceedings{tian2024detecting,
        author = {Tian, Yufei and Pan, Zeyu and Peng, Nanyun},
        title = {Detecting Machine-Generated Long-Form Content with Latent-Space Variables},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
    16. VDebugger: Harnessing Execution Feedback for Debugging Visual Programs

      Xueqing Wu, Zongyu Lin, Songyan Zhao, Te-Lin Wu, Pan Lu, Nanyun Peng, and Kai-Wei Chang, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Full Text Code Abstract BibTeX Details
      Visual programs are executable code generated by large language models to address visual reasoning problems. They decompose complex questions into multiple reasoning steps and invoke specialized models for each step to solve the problems. However, these programs are prone to logic errors, with our preliminary evaluation showing that 58% of the total errors are caused by program logic errors. Debugging complex visual programs remains a major bottleneck for visual reasoning. To address this, we introduce VDebugger, a novel critic-refiner framework trained to localize and debug visual programs by tracking execution step by step. VDebugger identifies and corrects program errors leveraging detailed execution feedback, improving interpretability and accuracy. The training data is generated through an automated pipeline that injects errors into correct visual programs using a novel mask-best decoding technique. Evaluations on six datasets demonstrate VDebugger’s effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy. Further studies show VDebugger’s ability to generalize to unseen tasks, bringing a notable improvement of 2.3% on the unseen COVR task.
      @inproceedings{wu2024vdebugger,
        author = {Wu, Xueqing and Lin, Zongyu and Zhao, Songyan and Wu, Te-Lin and Lu, Pan and Peng, Nanyun and Chang, Kai-Wei},
        title = {VDebugger: Harnessing Execution Feedback for Debugging Visual Programs},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
    17. LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning

      Silin Meng, Yiwei Wang, Cheng-Fu Yang, Nanyun Peng, and Kai-Wei Chang, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Abstract BibTeX Details
      Path planning is a fundamental scientific problem in robotics and autonomous navigation. We propose LLM-A*, a novel route planning method that combines the precise pathfinding capabilities of A* with the global reasoning capability of large language models (LLMs). This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
      @inproceedings{meng2024llm,
        title = {LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning},
        author = {Meng, Silin and Wang, Yiwei and Yang, Cheng-Fu and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
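      Since LLM-A* (entry above) builds on classical A*, a compact reference implementation of that baseline may help readers unfamiliar with it. The sketch below is the textbook algorithm on a toy grid and does not reproduce how the paper injects LLM guidance into the search; the grid encoding and Manhattan heuristic are assumptions made for the example.

      import heapq
      from typing import Dict, List, Optional, Tuple

      Coord = Tuple[int, int]

      def astar(grid: List[List[int]], start: Coord, goal: Coord) -> Optional[List[Coord]]:
          """Plain A* on a 4-connected grid (0 = free cell, 1 = obstacle)."""
          def h(p: Coord) -> int:  # Manhattan-distance heuristic
              return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

          rows, cols = len(grid), len(grid[0])
          frontier: List[Tuple[int, Coord]] = [(h(start), start)]
          g: Dict[Coord, int] = {start: 0}
          parent: Dict[Coord, Coord] = {}
          while frontier:
              _, cur = heapq.heappop(frontier)
              if cur == goal:  # reconstruct the path back to the start
                  path = [cur]
                  while cur in parent:
                      cur = parent[cur]
                      path.append(cur)
                  return path[::-1]
              for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                  nxt = (cur[0] + dr, cur[1] + dc)
                  if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                      ng = g[cur] + 1
                      if ng < g.get(nxt, float("inf")):
                          g[nxt] = ng
                          parent[nxt] = cur
                          heapq.heappush(frontier, (ng + h(nxt), nxt))
          return None

      if __name__ == "__main__":
          grid = [[0, 0, 0],
                  [1, 1, 0],
                  [0, 0, 0]]
          print(astar(grid, (0, 0), (2, 0)))  # detours around the obstacles in row 1
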
    18. LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints

      Thomas Palmeira Ferraz, Kartik Mehta, Yu-Hsiang Lin, Haw-Shiuan Chang, Shereen Oraby, Sijia Liu, Vivek Subramanian, Tagyoung Chung, Mohit Bansal, and Nanyun Peng, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Abstract BibTeX Details
      We investigate LLMs’ capability in following multi-constrained instructions, introducing the Decompose, Critique, and Refine (DeCRIM) self-correction pipeline. This approach significantly enhances the ability of LLMs to handle complex constraints, and our experiments demonstrate substantial improvements in instruction adherence across multiple evaluation metrics.
      @inproceedings{ferraz2024llm,
        title = {LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints},
        author = {Ferraz, Thomas Palmeira and Mehta, Kartik and Lin, Yu-Hsiang and Chang, Haw-Shiuan and Oraby, Shereen and Liu, Sijia and Subramanian, Vivek and Chung, Tagyoung and Bansal, Mohit and Peng, Nanyun},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
    19. Explaining Mixtures of Sources in News Articles

      Alexander Spangher, James Youn, Matt DeButts, Nanyun Peng, and Jonathan May, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Abstract BibTeX Details
      Human writers plan, then write. For large language models (LLMs) to play a role in longer-form article generation, we must understand the planning steps humans make before writing. We explore one kind of planning, source-selection in news, as a case-study for evaluating plans in long-form generation. We ask: why do specific stories call for specific kinds of sources? We imagine a process where sources are selected to fall into different categories. Learning the article’s plan means predicting the categorization scheme chosen by the journalist. Inspired by latent-variable modeling, we first develop metrics to select the most likely plan underlying a story. Then, working with professional journalists, we adapt five existing approaches to planning and introduce three new ones. We find that two approaches, or schemas (stance and social affiliation), best explain source plans in most documents. However, other schemas like textual entailment explain source plans in factually rich topics like "Science". Finally, we find we can predict the most suitable schema given just the article’s headline with reasonable accuracy. We see this as an important case-study for human planning, one that provides a framework and approach for evaluating other kinds of plans, like discourse or plot-oriented plans. We release a corpus, NewsSources, with schema annotations for 4M articles, for further study.
      @inproceedings{spangher2024source_explaining,
        author = {Spangher, Alexander and Youn, James and DeButts, Matt and Peng, Nanyun and May, Jonathan},
        title = {Explaining Mixtures of Sources in News Articles},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
    20. Uncertainty Calibration for Tool-Using Language Agents

      Hao Liu, Zi-Yi Dou, Yixin Wang, Nanyun Peng, and Yisong Yue, in Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2024.
      Abstract BibTeX Details
      There is increasing interest in equipping language models with the ability to leverage external tools for complex, goal-oriented tasks. However, interacting with external tools introduces inherent uncertainties due to imperfections and misalignments between the tools’ outputs and the agents’ internal models, often leading to suboptimal outcomes. We thus study the problem of tool-use calibration in language agents, and identify prompt design and execution trace selection as two primary areas that suffer from miscalibration. We then propose ProbeCal, which recalibrates the internal probabilities of tool-using language agents to better reflect the actual effectiveness of the tool, and enables a more appropriate selection of prompts and execution paths. We empirically show that ProbeCal can significantly and consistently improve off-the-shelf language models in tool-using applications.
      @inproceedings{liu2024uncertainty_calibration,
        author = {Liu, Hao and Dou, Zi-Yi and Wang, Yixin and Peng, Nanyun and Yue, Yisong},
        title = {Uncertainty Calibration for Tool-Using Language Agents},
        booktitle = {Proceedings of the Findings of ACL at The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2024}
      }
      
    21. Open-Domain Text Evaluation via Contrastive Distribution Methods

      Sidi Lu, Hongyi Liu, Asli Celikyilmaz, Tianlu Wang, and Nanyun Peng, in Proceedings of the Fortieth International Conference on Machine Learning (ICML), 2024.
      Full Text BibTeX Details
      @inproceedings{lu2024cdm,
        title = {Open-Domain Text Evaluation via Contrastive Distribution Methods},
        author = {Lu, Sidi and Liu, Hongyi and Celikyilmaz, Asli and Wang, Tianlu and Peng, Nanyun},
        booktitle = {Proceedings of the Fortieth International Conference on Machine Learning (ICML)},
        year = {2024}
      }
      
    22. On Prompt-Driven Safeguarding for Large Language Models

      Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, and Nanyun Peng, in Proceedings of the Fortieth International Conference on Machine Learning (ICML), 2024.
      Full Text BibTeX Details
      @inproceedings{zheng2024dro,
        title = {On Prompt-Driven Safeguarding for Large Language Models},
        author = {Zheng, Chujie and Yin, Fan and Zhou, Hao and Meng, Fandong and Zhou, Jie and Chang, Kai-Wei and Huang, Minlie and Peng, Nanyun},
        booktitle = {Proceedings of the Fortieth International Conference on Machine Learning (ICML)},
        year = {2024}
      }
      
    23. DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models

      Sidi Lu, Wenbo Zhao, Chenyang Tao, Arpit Gupta, Shanchan Wu, Tagyoung Chung, and Nanyun Peng, in Proceedings of the Fortieth International Conference on Machine Learning (ICML), 2024.
      BibTeX Details
      @inproceedings{lu2024nado2,
        title = {DiNADO: Norm-Disentangled Neurally-Decomposed Oracles for Controlling Language Models},
        author = {Lu, Sidi and Zhao, Wenbo and Tao, Chenyang and Gupta, Arpit and Wu, Shanchan and Chung, Tagyoung and Peng, Nanyun},
        booktitle = {Proceedings of the Fortieth International Conference on Machine Learning (ICML)},
        year = {2024}
      }
      
    24. ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models

      Rohan Wadhawan, Hritik Bansal, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the Fortieth International Conference on Machine Learning (ICML), 2024.
      Full Text BibTeX Details
      @inproceedings{wadhawan2024contextual,
        title = {ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models},
        author = {Wadhawan, Rohan and Bansal, Hritik and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the Fortieth International Conference on Machine Learning (ICML)},
        year = {2024}
      }
      
    25. Improving Event Definition Following For Zero-Shot Event Detection

      Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, and Nanyun Peng, in Proceedings of The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
      BibTeX Details
      @inproceedings{cai2024improving,
        title = {Improving Event Definition Following For Zero-Shot Event Detection},
        author = {Cai, Zefan and Kung, Po-Nien and Suvarna, Ashima and Ma, Mingyu Derek and Bansal, Hritik and Chang, Baobao and Brantingham, P. Jeffrey and Wang, Wei and Peng, Nanyun},
        booktitle = {Proceedings of The 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2024}
      }
      
    26. Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models

      Haoyi Qiu, Wenbo Hu, Zi-Yi Dou, and Nanyun Peng, in Findings of the Association for Computational Linguistics: ACL (ACL-findings), 2024.
      Full Text Code BibTeX Details
      @inproceedings{Qiu2024,
        title = {Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models},
        author = {Qiu, Haoyi and Hu, Wenbo and Dou, Zi-Yi and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: ACL (ACL-findings)},
        year = {2024},
        project_website = {https://gordonhu608.github.io/VALOR-Eval/}
      }
      
    27. Argument-Aware Approach To Event Linking

      I.-Hung Hsu, Zihan Xue, Nilay Pochhi, Sahil Bansal, Prem Natarajan, Jayanth Srinivasa, and Nanyun Peng, in Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL-Findings), 2024.
      BibTeX Details
    28. Tracking the Newsworthiness of Public Documents

      Alexander Spangher, Serdar Tumgoren, Ben Welsh, Nanyun Peng, Emilio Ferrara, and Jonathan May, in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
      BibTeX Details
      @inproceedings{Spangher2024,
        title = {Tracking the Newsworthiness of Public Documents},
        author = {Spangher, Alexander and Tumgoren, Serdar and Welsh, Ben and Peng, Nanyun and Ferrara, Emilio and May, Jonathan},
        booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2024}
      }
      
    29. TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction

      Kuan-Hao Huang, I.-Hung Hsu, Tanmay Parekh, Zhiyu Xie, Zixuan Zhang, Prem Natarajan, Kai-Wei Chang, Nanyun Peng, and Heng Ji, in Findings of the Association for Computational Linguistics: ACL (ACL-findings), 2024.
      BibTeX Details
      @inproceedings{Huang2024,
        title = {TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction},
        author = {Huang, Kuan-Hao and Hsu, I-Hung and Parekh, Tanmay and Xie, Zhiyu and Zhang, Zixuan and Natarajan, Prem and Chang, Kai-Wei and Peng, Nanyun and Ji, Heng},
        booktitle = {Findings of the Association for Computational Linguistics: ACL (ACL-findings)},
        year = {2024}
      }
      
    30. CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation

      I.-Hung Hsu, Zifeng Wang, Long Le, Lesly Miculicich, Nanyun Peng, Chen-Yu Lee, and Tomas Pfister, in Findings of the Association for Computational Linguistics: ACL (ACL-findings), 2024.
      BibTeX Details
      @inproceedings{Hsu2024b,
        title = {CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation},
        author = {Hsu, I-Hung and Wang, Zifeng and Le, Long and Miculicich, Lesly and Peng, Nanyun and Lee, Chen-Yu and Pfister, Tomas},
        booktitle = {Findings of the Association for Computational Linguistics: ACL (ACL-findings)},
        year = {2024}
      }
      
    31. MacGyver: Are Large Language Models Creative Problem Solvers?

      Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, and Faeze Brahman, in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
      Full Text BibTeX Details 🏆 Best Paper Nomination
      @inproceedings{tian2024macgyver,
        title = {MacGyver: Are Large Language Models Creative Problem Solvers?},
        author = {Tian, Yufei and Ravichander, Abhilasha and Qin, Lianhui and Le Bras, Ronan and Marjieh, Raja and Peng, Nanyun and Choi, Yejin and Griffiths, Thomas L. and Brahman, Faeze},
        booktitle = {Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2024}
      }
      
    32. AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation

      Haoyi Qiu, Kung-Hsiang Huang, Jingnong Qu, and Nanyun Peng, in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
      Full Text Code BibTeX Details
      @inproceedings{qiu2024amrfact,
        title = {AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation},
        author = {Qiu, Haoyi and Huang, Kung-Hsiang and Qu, Jingnong and Peng, Nanyun},
        booktitle = {Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2024}
      }
      
    33. Contextual Label Projection for Cross-Lingual Structured Prediction

      Tanmay Parekh, I.-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
      Full Text Code BibTeX Details 🏆 Best Paper Nomination
      @inproceedings{parekh2024clap,
        title = {Contextual Label Projection for Cross-Lingual Structured Prediction},
        author = {Parekh, Tanmay and Hsu, I-Hung and Huang, Kuan-Hao and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2024}
      }
      
    34. Event Detection from Social Media for Epidemic Prediction

      Tanmay Parekh, Anh Mac, Jiarui Yu, Yuxuan Dong, Syed Shahriar, Bonnie Liu, Eric J. Yang, Kuan-Hao Huang, Wei Wang, Nanyun Peng, and Kai-Wei Chang, in Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
      Full Text Code BibTeX Details
      @inproceedings{parekh2024pipp,
        title = {Event Detection from Social Media for Epidemic Prediction},
        author = {Parekh, Tanmay and Mac, Anh and Yu, Jiarui and Dong, Yuxuan and Shahriar, Syed and Liu, Bonnie and Yang, Eric J and Huang, Kuan-Hao and Wang, Wei and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Proceedings of the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2024}
      }
      
    35. Mitigating Bias for Question Answering Models by Tracking Bias Influence

      Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024.
      Full Text BibTeX Details
      @inproceedings{ma2024bias,
        title = {Mitigating Bias for Question Answering Models by Tracking Bias Influence},
        author = {Ma, Mingyu Derek and Kao, Jiun-Yu and Gupta, Arpit and Lin, Yu-Hsiang and Zhao, Wenbo and Chung, Tagyoung and Wang, Wei and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2024}
      }
      
    36. Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking

      Hong Jin Kang*, Fabrice Y. Harel-Canada*, Muhammad Ali Gulzar, Nanyun Peng, and Miryung Kim, in Findings of Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings), 2024.
      Full Text BibTeX Details
      @inproceedings{kang2024hitl,
        title = {Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking},
        author = {Kang*, Hong Jin and Harel-Canada*, Fabrice Y and Gulzar, Muhammad Ali and Peng, Nanyun and Kim, Miryung},
        booktitle = {Findings of Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-Findings)},
        year = {2024}
      }
      
    37. RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

      Kevin Yang, Dan Klein, Asli Celikyilmaz, Nanyun Peng, and Yuandong Tian, in Proceedings of the Twelfth International Conference on Learning Representations (ICLR), 2024.
      Full Text BibTeX Details
      @inproceedings{yang2024rlcd,
        title = {RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment},
        author = {Yang, Kevin and Klein, Dan and Celikyilmaz, Asli and Peng, Nanyun and Tian, Yuandong},
        booktitle = {Proceedings of the Twelfth International Conference on Learning Representations (ICLR)},
        year = {2024}
      }
      
    38. STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models

      Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, and Wei Wang, in Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), 2024.
      Full Text BibTeX Details
      @inproceedings{ma2024star,
        title = {STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models},
        author = {Ma, Mingyu Derek and Wang, Xiaoxuan and Kung, Po-Nien and Brantingham, P. Jeffrey and Peng, Nanyun and Wang, Wei},
        booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI)},
        year = {2024}
      }
      
    39. MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways

      Mingyu Derek Ma, Alexander K. Taylor, Nuan Wen, Yanchen Lin, Po-Nien Kung, Wenna Qin, Shicheng Wen, Azure Zhou, Diyi Yang, Xuezhe Ma, Nanyun Peng, and Wei Wang, in Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), Demonstration Track, 2024.
      Full Text BibTeX Details
      @inproceedings{ma2024middag,
        title = {MIDDAG: Where Does Our News Go? Investigating Information Diffusion via Community-Level Information Pathways},
        author = {Ma, Mingyu Derek and Taylor, Alexander K. and Wen, Nuan and Lin, Yanchen and Kung, Po-Nien and Qin, Wenna and Wen, Shicheng and Zhou, Azure and Yang, Diyi and Ma, Xuezhe and Peng, Nanyun and Wang, Wei},
        booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), Demonstration Track},
        year = {2024}
      }
      

    2023

    1. Harnessing Black-Box Control to Boost Commonsense in LMs’ Generation

      Yufei Tian, Felix Zhang, and Nanyun Peng, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text BibTeX Details
      @inproceedings{tian2023harnessing,
        title = {Harnessing Black-Box Control to Boost Commonsense in LMs’ Generation},
        author = {Tian, Yufei and Zhang, Felix and Peng, Nanyun},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    2. Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

      Po-Nien Kung, Fan Yin, Di Wu, Kai-Wei Chang, and Nanyun Peng, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text Poster BibTeX Details
      @inproceedings{kung2023active,
        title = {Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks},
        author = {Kung, Po-Nien and Yin, Fan and Wu, Di and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    3. Gender Biases in Automatic Evaluation Metrics for Image Captioning

      Haoyi Qiu, Zi-Yi Dou, Tianlu Wang, Asli Celikyilmaz, and Nanyun Peng, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text Code BibTeX Details
      @inproceedings{qiu2023gender,
        title = {Gender Biases in Automatic Evaluation Metrics for Image Captioning},
        author = {Qiu, Haoyi and Dou, Zi-Yi and Wang, Tianlu and Celikyilmaz, Asli and Peng, Nanyun},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    4. Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge

      Te-Lin Wu*, Yu Zhou*, and Nanyun Peng, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text Poster Video Code BibTeX Details
      @inproceedings{wu2023localizing,
        title = {Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge},
        author = {Wu*, Te-Lin and Zhou*, Yu and Peng, Nanyun},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    5. ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

      Te-Lin Wu*, Zi-Yi Dou*, Qingyuan Hu*, Yu Hou, Nischal Reddy Chandra, Marjorie Freedman, Ralph Weischedel, and Nanyun Peng, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text BibTeX Details
      @inproceedings{wu2023acquired,
        title = {ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos},
        author = {Wu*, Te-Lin and Dou*, Zi-Yi and Hu*, Qingyuan and Hou, Yu and Chandra, Nischal Reddy and Freedman, Marjorie and Weischedel, Ralph and Peng, Nanyun},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    6. Evaluating Large Language Models on Controlled Generation Tasks

      Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Frederick Wieting, Nanyun Peng, and Xuezhe Ma, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text BibTeX Details
      @inproceedings{sun2023eval,
        title = {Evaluating Large Language Models on Controlled Generation Tasks},
        author = {Sun, Jiao and Tian, Yufei and Zhou, Wangchunshu and Xu, Nan and Hu, Qian and Gupta, Rahul and Wieting, John Frederick and Peng, Nanyun and Ma, Xuezhe},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    7. Identifying Informational Sources in News Articles

      Alexander Spangher, Nanyun Peng, Emilio Ferrara, and Jonathan May, in The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.
      Full Text BibTeX Details
      @inproceedings{spangher2023identifying,
        title = {Identifying Informational Sources in News Articles},
        author = {Spangher, Alexander and Peng, Nanyun and Ferrara, Emilio and May, Jonathan},
        booktitle = {The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2023}
      }
      
    8. “Kelly is a Warm Person, Joseph is a Role Model”: Gender Biases in LLM-Generated Reference Letters

      Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, and Nanyun Peng, in Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2023.
      Full Text BibTeX Details
      @inproceedings{wan2023kelly,
        title = {“Kelly is a Warm Person, Joseph is a Role Model”: Gender Biases in LLM-Generated Reference Letters},
        author = {Wan, Yixin and Pu, George and Sun, Jiao and Garimella, Aparna and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2023}
      }
      
    9. Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems

      Yixin Wan, Jieyu Zhao, Aman Chadha, Nanyun Peng, and Kai-Wei Chang, in Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2023.
      Full Text BibTeX Details
      @inproceedings{wan2023personalized,
        title = {Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems},
        author = {Wan, Yixin and Zhao, Jieyu and Chadha, Aman and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2023}
      }
      
    10. DesCo: Learning Object Recognition with Rich Language Descriptions

      Liunian Harold Li*, Zi-Yi Dou*, Nanyun Peng, and Kai-Wei Chang, in The 2023 Conference on Neural Information Processing Systems (NeurIPS), 2023.
      Full Text BibTeX Details
      @inproceedings{li2023desco,
        title = {DesCo: Learning Object Recognition with Rich Language Descriptions},
        author = {Li*, Liunian Harold and Dou*, Zi-Yi and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {The 2023 Conference on Neural Information Processing Systems (NeurIPS)},
        year = {2023}
      }
      
    11. Masked Path Modeling for Vision-and-Language Navigation

      Zi-Yi Dou, Feng Gao, and Nanyun Peng, in Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings), 2023.
      Full Text BibTeX Details
      @inproceedings{dou2023mpm,
        title = {Masked Path Modeling for Vision-and-Language Navigation},
        author = {Dou, Zi-Yi and Gao, Feng and Peng, Nanyun},
        booktitle = {Findings of The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP-Findings)},
        year = {2023}
      }
      
    12. Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

      Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, and Nanyun Peng, in Proceedings of INTERSPEECH 2023, 2023.
      Full Text BibTeX Details
      @inproceedings{ma2023parameter,
        title = {Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning},
        author = {Ma, Mingyu Derek and Kao, Jiun-Yu and Gao, Shuyang and Gupta, Arpit and Jin, Di and Chung, Tagyoung and Peng, Nanyun},
        booktitle = {Proceedings of INTERSPEECH 2023},
        year = {2023}
      }
      
    13. LEAF: Linguistically Enhanced Event Temporal Relation Framework

      Stanley Lim, Da Yin, and Nanyun Peng, in Workshop for Pattern-based Approaches to NLP in the Age of Deep Learning (PAN-DL) at EMNLP, 2023.
      BibTeX Details 🏆 Best Paper Award
      @inproceedings{lim2023leaf,
        title = {LEAF: Linguistically Enhanced Event Temporal Relation Framework},
        author = {Lim, Stanley and Yin, Da and Peng, Nanyun},
        booktitle = {Workshop for Pattern-based Approaches to NLP in the Age of Deep Learning (PAN-DL) at EMNLP},
        year = {2023}
      }
      
    14. AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model

      I.-Hung Hsu*, Zhiyu Xie*, Kuan-Hao Huang, Premkumar Natarajan, and Nanyun Peng, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Poster Video Code BibTeX Details
      @inproceedings{hsu2023ampere,
        title = {AMPERE: AMR-Aware Prefix for Generation-Based Event Argument Extraction Model},
        author = {Hsu*, I-Hung and Xie*, Zhiyu and Huang, Kuan-Hao and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    15. ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems

      Sarik Ghazarian*, Yijia Shao*, Rujun Han, Aram Galstyan, and Nanyun Peng, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text BibTeX Details
      @inproceedings{ghazarian2023accent,
        title = {ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems},
        author = {Ghazarian*, Sarik and Shao*, Yijia and Han, Rujun and Galstyan, Aram and Peng, Nanyun},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    16. Learning Action Conditions from Instructional Manuals for Instruction Understanding

      Te-Lin Wu, Caiqi Zhang, Qingyuan Hu, Alex Spangher, and Nanyun Peng, in Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Abstract BibTeX Details
      The ability to infer pre- and postconditions of an action is vital for comprehending complex instructions, and is essential for applications such as autonomous instruction-guided agents and assistive AI that supports humans in performing physical tasks. In this work, we propose a task dubbed action condition inference, which extracts mentions of preconditions and postconditions of actions in instructional manuals. We propose a weakly supervised approach utilizing automatically constructed large-scale training instances from online instructions, and curate a densely human-annotated and validated dataset to study how well current NLP models do on the proposed task. We design two types of models that differ in whether contextualized and global information is leveraged, as well as various combinations of heuristics to construct the weak supervision. Our experiments show a >20% F1-score improvement from considering the entire instruction context and a >6% F1-score benefit from the proposed heuristics. However, the best-performing model still falls well behind human performance.
      @inproceedings{wu2023action,
        title = {Learning Action Conditions from Instructional Manuals for Instruction Understanding},
        author = {Wu, Te-Lin and Zhang, Caiqi and Hu, Qingyuan and Spangher, Alex and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    17. GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles

      Tanmay Parekh, I.-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Slides Code BibTeX Details
      @inproceedings{parekh2023geneva,
        title = {GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles},
        author = {Parekh, Tanmay and Hsu, I-Hung and Huang, Kuan-Hao and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    18. Unsupervised Melody-to-Lyric Generation

      Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Gunnar Sigurdsson, Chenyang Tao, Wenbo Zhao, Tagyoung Chung, Jing Huang, and Nanyun Peng, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Slides BibTeX Details
      @inproceedings{tian2023lyric,
        title = {Unsupervised Melody-to-Lyric Generation},
        author = {Tian, Yufei and Narayan-Chen, Anjali and Oraby, Shereen and Cervone, Alessandra and Sigurdsson, Gunnar and Tao, Chenyang and Zhao, Wenbo and Chung, Tagyoung and Huang, Jing and Peng, Nanyun},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    19. Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning

      Po-Nien Kung and Nanyun Peng, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), short, 2023.
      Full Text Poster BibTeX Details
      @inproceedings{kung2023models,
        title = {Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning},
        author = {Kung, Po-Nien and Peng, Nanyun},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), short},
        year = {2023}
      }
      
      Details
    20. DICE: Data-Efficient Clinical Event Extraction with Generative Models

      Mingyu Derek Ma, Alexander K. Taylor, Wei Wang, and Nanyun Peng, in Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Code BibTeX Details
      @inproceedings{ma2023dice,
        title = {DICE: Data-Efficient Clinical Event Extraction with Generative Models},
        author = {Ma, Mingyu Derek and Taylor, Alexander K. and Wang, Wei and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    21. TAGPRIME: A Unified Framework for Relational Structure Extraction

      I.-Hung Hsu*, Kuan-Hao Huang*, Shuning Zhang, Wenxing Cheng, Premkumar Natarajan, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Code BibTeX Details
      @inproceedings{hsu2023tagprime,
        title = {TAGPRIME: A Unified Framework for Relational Structure Extraction},
        author = {Hsu*, I-Hung and Huang*, Kuan-Hao and Zhang, Shuning and Cheng, Wenxing and Natarajan, Premkumar and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    22. DOC: Improving Long Story Coherence With Detailed Outline Control

      Kevin Yang, Dan Klein, Nanyun Peng, and Yuandong Tian, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text BibTeX Details
      @inproceedings{yang2023doc,
        title = {DOC: Improving Long Story Coherence With Detailed Outline Control},
        author = {Yang, Kevin and Klein, Dan and Peng, Nanyun and Tian, Yuandong},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    23. Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children’s Fairy Tales

      Paulina Toro Isaza, Guangxuan Xu, Toye Oloko, Yufang Hou, Nanyun Peng, and Dakuo Wang, in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text BibTeX Details
      @inproceedings{isaza2023fairytales,
        title = {Are Fairy Tales Fair? Analyzing Gender Bias in Temporal Narrative Event Chains of Children's Fairy Tales},
        author = {Isaza, Paulina Toro and Xu, Guangxuan and Oloko, Toye and Hou, Yufang and Peng, Nanyun and Wang, Dakuo},
        booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    24. SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams

      Te-Lin Wu, Satwik Kottur, Andrea Madotto, Mahmoud Azab, Pedro Rodriguez, Nanyun Peng, Babak Damavandi, and Seungwhan Moon, in Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.
      Full Text Abstract BibTeX Details
      Building an AI assistant that can seamlessly converse and instruct humans, in a user-centric situated scenario, requires several essential abilities: (1) spatial and temporal understanding of the situated and real-time user scenes, (2) capability of grounding the actively perceived visuals of users to conversation contexts, and (3) conversational reasoning over past utterances to perform just-in-time assistance. However, we currently lack a large-scale benchmark that captures user–assistant interactions with all of the aforementioned features. To this end, we propose SIMMC-VR, extending the SIMMC 2.0 dataset, which only concerns static visual scenes, to a video-grounded task-oriented dialog dataset that captures real-world AI-assisted user scenarios in VR. We propose a novel data collection paradigm that involves (1) generating object-centric multimodal dialog flows with egocentric visual streams and visually-grounded templates, and (2) manually paraphrasing the simulated dialogs for naturalness and diversity while preserving multimodal dependencies.  To measure meaningful progress in the field, we propose four tasks to address the new challenges in SIMMC-VR, which require complex spatial-temporal dialog reasoning in active egocentric scenes. We benchmark the proposed tasks with strong multimodal models, and highlight the key capabilities that current models lack for future research directions.
      @inproceedings{wu2023simmcvr,
        title = {SIMMC-VR: A Task-oriented Multimodal Dialog Dataset with Situated and Immersive VR Streams},
        author = {Wu, Te-Lin and Kottur, Satwik and Madotto, Andrea and Azab, Mahmoud and Rodriguez, Pedro and Peng, Nanyun and Damavandi, Babak and Moon, Seungwhan},
        booktitle = {Proceedings of the Conference of the 61st Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2023}
      }
      
      Details
    25. Code-Switching Text Synthesis in Unseen Language Pairs

      I.-Hung Hsu, Avik Ray, Shubham Garg, Nanyun Peng, and Jing Huang, in Findings of the Association for Computational Linguistics: ACL (ACL-findings), 2023.
      Full Text Slides Video BibTeX Details
      @inproceedings{hsu2023codeswitch,
        title = {Code-Switching Text Synthesis in Unseen Language Pairs},
        author = {Hsu, I-Hung and Ray, Avik and Garg, Shubham and Peng, Nanyun and Huang, Jing},
        booktitle = {Findings of the Association for Computational Linguistics: ACL (ACL-findings)},
        year = {2023}
      }
      
      Details
    26. Tractable Control for Autoregressive Language Generation

      Honghua Zhang, Meihua Dang, Nanyun Peng, and Guy Van den Broeck, in Proceedings of the Fortieth International Conference on Machine Learning (ICML), 2023.
      Full Text BibTeX Details Oral Paper (<2%)
      @inproceedings{zhang2023gelato,
        title = {Tractable Control for Autoregressive Language Generation},
        author = {Zhang, Honghua and Dang, Meihua and Peng, Nanyun and Broeck, Guy Van den},
        booktitle = {Proceedings of the Fortieth International Conference on Machine Learning (ICML)},
        year = {2023}
      }
      
      Details
    27. Generalized Decoding for Pixel, Image and Language

      Xueyan Zou*, Zi-Yi Dou*, Jianwei Yang*, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, and Jianfeng Gao, in The Conference on Computer Vision and Pattern Recognition (CVPR-23), 2023.
      Full Text Code BibTeX Details
      @inproceedings{xdecoder,
        title = {Generalized Decoding for Pixel, Image and Language},
        author = {Zou*, Xueyan and Dou*, Zi-Yi and Yang*, Jianwei and Gan, Zhe and Li, Linjie and Li, Chunyuan and Dai, Xiyang and Behl, Harkirat and Wang, Jianfeng and Yuan, Lu and Peng, Nanyun and Wang, Lijuan and Lee, Yong Jae and Gao, Jianfeng},
        booktitle = {The Conference on Computer Vision and Pattern Recognition (CVPR-23)},
        year = {2023}
      }
      
      Details
    28. Where Does Your News Come From? Predicting Information Pathways in Social Media

      Alexander Taylor, Nuan Wen, Po-Nien Kung, Jiaao Chen, Nanyun Peng, and Wei Wang, in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023.
      Full Text BibTeX Details
      @inproceedings{taylor2023pathway,
        title = {Where Does Your News Come From? Predicting Information Pathways in Social Media},
        author = {Taylor, Alexander and Wen, Nuan and Kung, Po-Nien and Chen, Jiaao and Peng, Nanyun and Wang, Wei},
        booktitle = {Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR)},
        year = {2023}
      }
      
      Details
    29. MERCY: Multiple Response Ranking Concurrently in Realistic Open-Domain Conversational Systems

      Sarik Ghazarian, Behnam Hedayatnia, Di Jin, Sijia Liu, Nanyun Peng, Yang Liu, and Dilek Hakkani-Tur, in Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2023.
      Full Text Abstract BibTeX Details
      Automatic Evaluation (AE) and Response Selection (RS) models assign quality scores to various candidate responses and rank them in conversational setups. Prior response ranking research compares various models’ performance on synthetically generated test sets. In this work, we investigate the performance of model-based reference-free AE and RS models on our constructed response ranking datasets that mirror real-case scenarios of ranking candidates during inference time. Metrics’ unsatisfying performance can be interpreted as their low generalizability over more pragmatic conversational domains such as human-chatbot dialogs. To alleviate this issue we propose a novel RS model called MERCY that simulates human behavior in selecting the best candidate by taking into account distinct candidates concurrently and learns to rank them. In addition, MERCY leverages natural language feedback as another component to help the ranking task by explaining why each candidate response is relevant/irrelevant to the dialog context. These feedbacks are generated by prompting large language models in a few-shot setup. Our experiments show the better performance of MERCY over baselines for the response ranking task in our curated realistic datasets.
      @inproceedings{ghazarian-etal-2023-mercy,
        title = {{MERCY}: Multiple Response Ranking Concurrently in Realistic Open-Domain Conversational Systems},
        author = {Ghazarian, Sarik and Hedayatnia, Behnam and Jin, Di and Liu, Sijia and Peng, Nanyun and Liu, Yang and Hakkani-Tur, Dilek},
        booktitle = {Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue},
        year = {2023}
      }
      
      Details
    30. Investigating the Representation of Open Domain Dialogue Context for Transformer Models

      Vishakh Padmakumar, Behnam Hedayatnia, Di Jin, Patrick Lange, Seokhwan Kim, Nanyun Peng, Yang Liu, and Dilek Hakkani-Tur, in Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2023.
      Full Text Abstract BibTeX Details
      The bulk of work adapting transformer models to open-domain dialogue represents dialogue context as the concatenated set of turns in natural language. However, it is unclear if this is the best approach. In this work, we investigate this question by means of an empirical controlled experiment varying the dialogue context format from text-only formats (all recent utterances, summaries, selected utterances) as well as variants that are more structurally different (triples, AMR). We compare these formats based on fine-tuned model performance on two downstream tasks—knowledge selection and response generation. We find that simply concatenating the utterances works as a strong baseline in most cases, but is outperformed in longer contexts by a hybrid approach of combining a summary of the context with recent utterances. Through empirical analysis, our work highlights the need to examine the format of context representation and offers recommendations on adapting general-purpose language models to dialogue tasks.
      @inproceedings{padmakumar-etal-2023-investigating,
        title = {Investigating the Representation of Open Domain Dialogue Context for Transformer Models},
        author = {Padmakumar, Vishakh and Hedayatnia, Behnam and Jin, Di and Lange, Patrick and Kim, Seokhwan and Peng, Nanyun and Liu, Yang and Hakkani-Tur, Dilek},
        booktitle = {Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue},
        year = {2023}
      }
      
      Details

    2022

    1. Character-Centric Story Visualization via Visual Planning and Token Alignment

      Hong Chen, Rujun Han, Te-Lin Wu, Hideki Nakayama, and Nanyun Peng, in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
      Full Text BibTeX Details
      @inproceedings{hong2022Character,
        title = {Character-Centric Story Visualization via Visual Planning and Token Alignment},
        author = {Chen, Hong and Han, Rujun and Wu, Te-Lin and Nakayama, Hideki and Peng, Nanyun},
        booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2022}
      }
      
      Details
    2. ExPUNations: Augmenting Puns with Keywords and Explanations

      Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Tagyoung Chung, Jing Huang, Yang Liu, and Nanyun Peng, in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
      Full Text BibTeX Details
      @inproceedings{sun2022expun,
        title = {ExPUNations: Augmenting Puns with Keywords and Explanations},
        author = {Sun, Jiao and Narayan-Chen, Anjali and Oraby, Shereen and Cervone, Alessandra and Chung, Tagyoung and Huang, Jing and Liu, Yang and Peng, Nanyun},
        booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2022}
      }
      
      Details
    3. Context-Situated Pun Generation

      Jiao Sun, Anjali Narayan-Chen, Shereen Oraby, Shuyang Gao, Tagyoung Chung, Jing Huang, Yang Liu, and Nanyun Peng, in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
      Full Text BibTeX Details
      @inproceedings{sun2022context,
        title = {Context-Situated Pun Generation},
        author = {Sun, Jiao and Narayan-Chen, Anjali and Oraby, Shereen and Gao, Shuyang and Chung, Tagyoung and Huang, Jing and Liu, Yang and Peng, Nanyun},
        booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2022}
      }
      
      Details
    4. Re3: Generating Longer Stories With Recursive Reprompting and Revision

      Kevin Yang, Yuandong Tian, Nanyun Peng, and Dan Klein, in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.
      Full Text BibTeX Details
      @inproceedings{yang2022re3,
        title = {Re3: Generating Longer Stories With Recursive Reprompting and Revision},
        author = {Yang, Kevin and Tian, Yuandong and Peng, Nanyun and Klein, Dan},
        booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2022}
      }
      
      Details
    5. A Unified Framework for Pun Generation with Humor Principles

      Yufei Tian, Divyanshu Arun Sheth, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings), 2022.
      Full Text BibTeX Details
      @inproceedings{tian2022unified,
        title = {A Unified Framework for Pun Generation with Humor Principles},
        author = {Tian, Yufei and Arun Sheth, Divyanshu and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings)},
        year = {2022}
      }
      
      Details
    6. Sequentially Controlled Text Generation

      Alexander Spangher, Yao Ming, Xinyu Hua, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings), 2022.
      Full Text BibTeX Details
      @inproceedings{spangher2022sequentially,
        title = {Sequentially Controlled Text Generation},
        author = {Spangher, Alexander and Ming, Yao and Hua, Xinyu and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings)},
        year = {2022}
      }
      
      Details
    7. Towards Robust NLG Evaluation with Syntactically-diverse Prompts

      Arshiya Aggarwal, Jiao Sun, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings), 2022.
      Full Text BibTeX Details
      @inproceedings{aggarwal2022towards,
        title = {Towards Robust NLG Evaluation with Syntactically-diverse Prompts},
        author = {Aggarwal, Arshiya and Sun, Jiao and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings)},
        year = {2022}
      }
      
      Details
    8. EnDex: Evaluation of Dialogue Engagingness at Scale

      Guangxuan Xu, Nischal Reddy Chandra, Ruibo Liu, Fabrice Harel-Canada, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings), 2022.
      Full Text BibTeX Details
      @inproceedings{xu2022endex,
        title = {EnDex: Evaluation of Dialogue Engagingness at Scale},
        author = {Xu, Guangxuan and Chandra, Nischal Reddy and Liu, Ruibo and Harel-Canada, Fabrice and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP (EMNLP-findings)},
        year = {2022}
      }
      
      Details
    9. InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model

      Sidi Lu, Tao Meng, and Nanyun Peng, in Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
      Full Text BibTeX Details
      @inproceedings{lu2022InsNet,
        title = {InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model},
        author = {Lu, Sidi and Meng, Tao and Peng, Nanyun},
        booktitle = {Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS)},
        year = {2022}
      }
      
      Details
    10. Controllable Text Generation with Neurally-Decomposed Oracle

      Tao Meng, Sidi Lu, Nanyun Peng, and Kai-Wei Chang, in Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
      Full Text BibTeX Details Oral Paper (<2%)
      @inproceedings{meng2022nado,
        title = {Controllable Text Generation with Neurally-Decomposed Oracle},
        author = {Meng, Tao and Lu, Sidi and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS)},
        year = {2022}
      }
      
      Details
    11. Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

      Zi-Yi Dou, Aishwarya Kamath, Zhe Gan, Pengchuan Zhang, Jianfeng Wang, Linjie Li, Zicheng Liu, Ce Liu, Yann LeCun, Nanyun Peng, Jianfeng Gao, and Lijuan Wang, in Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS), 2022.
      Full Text BibTeX Details
      @inproceedings{dou2022fiber,
        title = {Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone},
        author = {Dou, Zi-Yi and Kamath, Aishwarya and Gan, Zhe and Zhang, Pengchuan and Wang, Jianfeng and Li, Linjie and Liu, Zicheng and Liu, Ce and LeCun, Yann and Peng, Nanyun and Gao, Jianfeng and Wang, Lijuan},
        booktitle = {Proceedings of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS)},
        year = {2022}
      }
      
      Details
    12. Controllable Text Generation for Open-Domain Creativity and Fairness

      Nanyun Peng, in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Early Career Track, 2022.
      Full Text BibTeX Details
      @inproceedings{peng2022controllable,
        title = {Controllable Text Generation for Open-Domain Creativity and Fairness},
        author = {Peng, Nanyun},
        booktitle = {Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Early Career Track},
        year = {2022}
      }
      
      Details
    13. NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge

      Alexander Spangher, Xiang Ren, Jonathan May, and Nanyun Peng, in Proceedings of the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022.
      Full Text Code BibTeX Details 🏆 Outstanding Paper Award (<0.4%)
      @inproceedings{spangher2022news,
        title = {NewsEdits: A News Article Revision Dataset and a Novel Document-Level Reasoning Challenge},
        author = {Spangher, Alexander and Ren, Xiang and May, Jonathan and Peng, Nanyun},
        booktitle = {Proceedings of the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2022}
      }
      
      Details
    14. Zero-Shot Sonnet Generation with Discourse-Level Planning and Aesthetics Features

      Yufei Tian and Nanyun Peng, in 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022.
      Full Text Code BibTeX Details
      @inproceedings{tian2022sonnet,
        title = {Zero-Shot Sonnet Generation with Discourse-Level Planning and Aesthetics Features},
        author = {Tian, Yufei and Peng, Nanyun},
        booktitle = {2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2022}
      }
      
      Details
    15. Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction

      Kuan-Hao Huang*, I.-Hung Hsu*, Premkumar Natarajan, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
      Full Text Slides Poster Code Abstract BibTeX Details
      We present a study on leveraging multilingual pre-trained generative language models for zero-shot cross-lingual event argument extraction (EAE). By formulating EAE as a language generation task, our method effectively encodes event structures and captures the dependencies between arguments. We design language-agnostic templates to represent the event argument structures, which are compatible with any language, hence facilitating the cross-lingual transfer. Our proposed model finetunes multilingual pre-trained generative language models to generate sentences that fill in the language-agnostic template with arguments extracted from the input passage. The model is trained on source languages and is then directly applied to target languages for event argument extraction. Experiments demonstrate that the proposed model outperforms the current state-of-the-art models on zero-shot cross-lingual EAE. Comprehensive studies and error analyses are presented to better understand the advantages and the current limitations of using generative language models for zero-shot cross-lingual transfer EAE.
      @inproceedings{huang2022multilingual,
        title = {Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction},
        author = {Huang*, Kuan-Hao and Hsu*, I-Hung and Natarajan, Premkumar and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2022}
      }
      
      Details
    16. Go Back in Time: Generating Flashbacks in Stories with Event Temporal Prompts

      Rujun Han, Hong Chen, Yufei Tian, and Nanyun Peng, in 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022.
      Full Text Code BibTeX Details
      @inproceedings{han2022go,
        title = {Go Back in Time: Generating Flashbacks in Stories with Event Temporal Prompts},
        author = {Han, Rujun and Chen, Hong and Tian, Yufei and Peng, Nanyun},
        booktitle = {2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2022}
      }
      
      Details
    17. FOAM: A Follower-aware Speaker Model for Vision-and-Language Navigation

      Zi-Yi Dou and Nanyun Peng, in Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2022.
      Full Text Code BibTeX Details
      @inproceedings{dou2022foam,
        title = {FOAM: A Follower-aware Speaker Model for Vision-and-Language Navigation},
        author = {Dou, Zi-Yi and Peng, Nanyun},
        booktitle = {Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short},
        year = {2022}
      }
      
      Details
    18. AmbiPun: Generating Humorous Puns with Ambiguous Context

      Anirudh Mittal, Yufei Tian, and Nanyun Peng, in 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2022.
      Full Text Code BibTeX Details
      @inproceedings{Mittal2022ambipun,
        title = {AmbiPun: Generating Humorous Puns with Ambiguous Context},
        author = {Mittal, Anirudh and Tian, Yufei and Peng, Nanyun},
        booktitle = {2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short},
        year = {2022}
      }
      
      Details
    19. Socially Aware Bias Measurements for Hindi Language Representations

      Vijit Malik, Sunipa Dev, Akihiro Nishi, Nanyun Peng, and Kai-Wei Chang, in Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short, 2022.
      Full Text BibTeX Details
      @inproceedings{malik2022socially,
        title = {Socially Aware Bias Measurements for Hindi Language Representations},
        author = {Malik, Vijit and Dev, Sunipa and Nishi, Akihiro and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), short},
        year = {2022}
      }
      
      Details
    20. An Empirical Study of Training End-to-End Vision-and-Language Transformers

      Zi-Yi Dou, Yichong Xu, Zhe Gan, Jianfeng Wang, Shuohang Wang, Lijuan Wang, Chenguang Zhu, Pengchuan Zhang, Lu Yuan, Nanyun Peng, Zicheng Liu, and Michael Zeng, in The Conference on Computer Vision and Pattern Recognition (CVPR-22), 2022.
      Full Text Code Abstract BibTeX Details
      Vision-and-language (VL) pre-training has proven to be highly effective on various VL downstream tasks. While recent work has shown that fully transformer-based VL models can be more efficient than previous region-feature-based methods, their performance on downstream tasks often degrades significantly. In this paper, we present METER, a Multimodal End-to-end TransformER framework, through which we investigate how to design and pre-train a fully transformer-based VL model in an end-to-end manner. Specifically, we dissect the model designs along multiple dimensions: vision encoders (e.g., CLIP-ViT, Swin transformer), text encoders (e.g., RoBERTa, DeBERTa), multimodal fusion module (e.g., merged attention vs. co-attention), architectural design (e.g., encoder-only vs. encoder-decoder), and pre-training objectives (e.g., masked image modeling). We conduct comprehensive experiments and provide insights on how to train a performant VL transformer while maintaining fast inference speed. Notably, our best model achieves an accuracy of 77.64% on the VQAv2 test-std set using only 4M images for pre-training, surpassing the state-of-the-art region-feature-based model by 1.04%, and outperforming the previous best fully transformer-based model by 1.6%.
      @inproceedings{dou2022meter,
        title = {An Empirical Study of Training End-to-End Vision-and-Language Transformers},
        author = {Dou, Zi-Yi and Xu, Yichong and Gan, Zhe and Wang, Jianfeng and Wang, Shuohang and Wang, Lijuan and Zhu, Chenguang and Zhang, Pengchuan and Yuan, Lu and Peng, Nanyun and Liu, Zicheng and Zeng, Michael},
        booktitle = {The Conference on Computer Vision and Pattern Recognition (CVPR-22)},
        year = {2022}
      }
      
      Details
    21. DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations

      Sarik Ghazarian, Nuan Wen, Aram Galstyan, and Nanyun Peng, in Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
      Full Text Abstract BibTeX Details
      Automatic evaluation metrics are essential for the rapid development of open-domain dialogue systems as they facilitate hyper-parameter tuning and comparison between models. Although recently proposed trainable conversation-level metrics have shown encouraging results, the quality of the metrics is strongly dependent on the quality of training data. Prior works mainly resort to heuristic text-level manipulations (e.g. utterances shuffling) to bootstrap incoherent conversations (negative examples) from coherent dialogues (positive examples). Such approaches are insufficient to appropriately reflect the incoherence that occurs in interactions between advanced dialogue models and humans. To tackle this problem, we propose DEAM, a Dialogue coherence Evaluation metric that relies on Abstract Meaning Representation (AMR) to apply semantic-level Manipulations for incoherent (negative) data generation. AMRs naturally facilitate the injection of various types of incoherence sources, such as coreference inconsistency, irrelevancy, contradictions, and decrease engagement, at the semantic level, thus resulting in more natural incoherent samples. Our experiments show that DEAM achieves higher correlations with human judgments compared to baseline methods on several dialog datasets by significant margins. We also show that DEAM can distinguish between coherent and incoherent dialogues generated by baseline manipulations, whereas those baseline models cannot detect incoherent examples generated by DEAM. Our results demonstrate the potential of AMR-based semantic manipulations for natural negative example generation.
      @inproceedings{ghazarian2022deam,
        title = {DEAM: Dialogue Coherence Evaluation using AMR-based Semantic Manipulations},
        author = {Ghazarian, Sarik and Wen, Nuan and Galstyan, Aram and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2022}
      }
      
      Details
    22. DEGREE: A Data-Efficient Generative Event Extraction Model

      I.-Hung Hsu*, Kuan-Hao Huang*, Elizabeth Boschee, Scott Miller, Premkumar Natarajan, Kai-Wei Chang, and Nanyun Peng, in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2022.
      Full Text Slides Video Code Abstract BibTeX Details
      Event extraction requires high-quality expert human annotations, which are usually expensive. Therefore, learning a data-efficient event extraction model that can be trained with only a few labeled examples has become a crucial challenge. In this paper, we focus on low-resource end-to-end event extraction and propose DEGREE, a data-efficient model that formulates event extraction as a conditional generation problem. Given a passage and a manually designed prompt, DEGREE learns to summarize the events mentioned in the passage into a natural sentence that follows a predefined pattern. The final event predictions are then extracted from the generated sentence with a deterministic algorithm. DEGREE has three advantages that allow it to learn well with less training data. First, our designed prompts provide semantic guidance that helps DEGREE better capture the event arguments. Moreover, DEGREE is capable of using additional weakly-supervised information, such as the description of events encoded in the prompts. Finally, DEGREE learns triggers and arguments jointly in an end-to-end manner, which encourages the model to better utilize the shared knowledge and dependencies among them. Our experimental results demonstrate the strong performance of DEGREE for low-resource event extraction.
      @inproceedings{hsu2022degree,
        title = {DEGREE: A Data-Efficient Generative Event Extraction Model},
        author = {Hsu*, I-Hung and Huang*, Kuan-Hao and Boschee, Elizabeth and Miller, Scott and Natarajan, Premkumar and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)},
        year = {2022}
      }
      
      Details
    23. Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals

      Te-Lin Wu, Alex Spangher, Pegah Alipoormolabashi, Marjorie Freedman, Ralph Weischedel, and Nanyun Peng, in Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
      Full Text Abstract BibTeX Details
      The ability to sequence unordered events is evidence of comprehension and reasoning about real world tasks/procedures, and is essential for applications such as task planning and multi-source instruction summarization. It often requires thorough understanding of temporal common sense and multimodal information, since these procedures are often conveyed by a combination of texts and images. While humans are capable of reasoning about and sequencing unordered procedural instructions,  the extent to which the current machine learning methods possess such a capability is still an open question. In this work, we benchmark models’ capability of reasoning over and sequencing unordered multimodal instructions by curating datasets from online instructional manuals and collecting comprehensive human annotations. We find current state-of-the-art models not only perform significantly worse than humans but also seem incapable of efficiently utilizing  multimodal information. To improve machines’ performance on multimodal event sequencing, we propose sequence-aware pretraining techniques exploiting the sequential alignment properties of both texts and images, resulting in >5% improvements on perfect match ratio.
      @inproceedings{wu2022procedural,
        title = {Understanding Multimodal Procedural Knowledge by Sequencing Multimodal Instructional Manuals},
        author = {Wu, Te-Lin and Spangher, Alex and Alipoormolabashi, Pegah and Freedman, Marjorie and Weischedel, Ralph and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2022}
      }
      
      Details
    24. Fantastic Questions and Where to Find Them: FairytaleQA–An Authentic Dataset for Narrative Comprehension

      Ying Xu, Dakuo Wang, Mo Yu, Daniel Ritchie, Bingsheng Yao, Tongshuang Wu, Zheng Zhang, Toby Jia-Jun Li, Nora Bradford, Branda Sun, Tran Hoang, Yisi Sang, Yufang Hou, Xiaojuan Ma, Diyi Yang, Nanyun Peng, Zhou Yu, and Mark Warschauer, in Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
      BibTeX Details
      @inproceedings{xu2022fairy,
        title = {Fantastic Questions and Where to Find Them: FairytaleQA--An Authentic Dataset for Narrative Comprehension},
        author = {Xu, Ying and Wang, Dakuo and Yu, Mo and Ritchie, Daniel and Yao, Bingsheng and Wu, Tongshuang and Zhang, Zheng and Li, Toby Jia-Jun and Bradford, Nora and Sun, Branda and Hoang, Tran and Sang, Yisi and Hou, Yufang and Ma, Xiaojuan and Yang, Diyi and Peng, Nanyun and Yu, Zhou and Warschauer, Mark},
        booktitle = {Proceedings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2022}
      }
      
      Details
    25. Sibylvariant Transformations for Robust Text Classification

      Fabrice Y. Harel-Canada, Muhammad Ali Gulzar, Nanyun Peng, and Miryung Kim, in Findings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL-findings), 2022.
      BibTeX Details
      @inproceedings{harel-canada2022sibyl,
        title = {Sibylvariant Transformations for Robust Text Classification},
        author = {Harel-Canada, Fabrice Y and Gulzar, Muhammad Ali and Peng, Nanyun and Kim, Miryung},
        booktitle = {Findings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL-findings)},
        year = {2022}
      }
      
      Details
    26. On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark

      Hao Sun, Guangxuan Xu, Jiawen Deng, Jiale Cheng, Chujie Zheng, Hao Zhou, Nanyun Peng, Xiaoyan Zhu, and Minlie Huang, in Findings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL-findings), 2022.
      Full Text Abstract BibTeX Details
      Dialogue safety problems severely limit the real-world deployment of neural conversational models and have attracted great research interests recently. However, dialogue safety problems remain under-defined and the corresponding dataset is scarce. We propose a taxonomy for dialogue safety specifically designed to capture unsafe behaviors in human-bot dialogue settings, with focuses on context-sensitive unsafety, which is under-explored in prior works. To spur research in this direction, we compile DiaSafety, a dataset with rich context-sensitive unsafe examples. Experiments show that existing safety guarding tools fail severely on our dataset. As a remedy, we train a dialogue safety classifier to provide a strong baseline for context-sensitive dialogue unsafety detection. With our classifier, we perform safety evaluations on popular conversational models and show that existing dialogue systems still exhibit concerning context-sensitive safety problems.
      @inproceedings{sun2022safe,
        title = {On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark},
        author = {Sun, Hao and Xu, Guangxuan and Deng, Jiawen and Cheng, Jiale and Zheng, Chujie and Zhou, Hao and Peng, Nanyun and Zhu, Xiaoyan and Huang, Minlie},
        booktitle = {Findings of the Conference of the 60th Annual Meeting of the Association for Computational Linguistics (ACL-findings)},
        year = {2022}
      }
      
      Details
    27. Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization

      Zi-Yi Dou and Nanyun Peng, in The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI), 2022.
      Full Text Code Abstract BibTeX Details
      Commonsense question answering (CQA) aims to test if models can answer questions regarding commonsense knowledge that everyone knows. Prior works that incorporate external knowledge bases have shown promising results, but knowledge bases are expensive to construct and are often limited to a fixed set of relations. In this paper, we instead focus on better utilizing the implicit knowledge stored in pre-trained language models. While researchers have found that the knowledge embedded in pre-trained language models can be extracted by having them fill in the blanks of carefully designed prompts for relation extraction and text classification, it remains unclear if we can adopt this paradigm in CQA where the inputs and outputs take much more flexible forms. To this end, we investigate four translation methods that can translate natural questions into cloze-style sentences to better solicit commonsense knowledge from language models, including a syntactic-based model, an unsupervised neural model, and two supervised neural models. In addition, to combine the different translation methods, we propose to encourage consistency among model predictions on different translated questions with unlabeled data. We demonstrate the effectiveness of our methods on three CQA datasets in zero-shot settings. We show that our methods are complementary to a knowledge base improved model, and combining them can lead to state-of-the-art zero-shot performance. Analyses also reveal distinct characteristics of the different cloze translation methods and provide insights on why combining them can lead to great improvements.
      @inproceedings{dou2022improving,
        title = {Zero-shot Commonsense Question Answering with Cloze Translation and Consistency Optimization},
        author = {Dou, Zi-Yi and Peng, Nanyun},
        booktitle = {The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI)},
        year = {2022}
      }
      
      Details
    28. Discourse-level Relation Extraction via Graph Pooling

      I.-Hung Hsu, Xiao Guo, Premkumar Natarajan, and Nanyun Peng, in The Thirty-Sixth AAAI Conference On Artificial Intelligence Workshop on Deep Learning on Graphs: Method and Applications (DLG-AAAI), 2022.
      BibTeX Details 🏆 Best Paper Award
      @inproceedings{hsu2021discourse,
        title = {Discourse-level Relation Extraction via Graph Pooling},
        author = {Hsu, I-Hung and Guo, Xiao and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {The Thirty-Sixth AAAI Conference On Artificial Intelligence Workshop on Deep Learning on Graphs: Method and Applications (DLG-AAAI)},
        year = {2022}
      }
      
      Details

    2021

    1. Document-level Entity-based Extraction as Template Generation

      Kung-Hsiang Huang, Sam Tang, and Nanyun Peng, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Code Abstract BibTeX Details
      Document-level entity-based extraction (EE), aiming at extracting entity-centric information such as entity roles and entity relations, is key to automatic knowledge acquisition from text corpora for various domains. Most document-level EE systems build extractive models, which struggle to model long-term dependencies among entities at the document level. To address this issue, we propose a generative framework for two document-level EE tasks: role-filler entity extraction (REE) and relation extraction (RE). We first formulate them as a template generation problem, allowing models to efficiently capture cross-entity dependencies, exploit label semantics, and avoid the exponential computation complexity of identifying N-ary relations. A novel cross-attention guided copy mechanism, TopK Copy, is incorporated into a pre-trained sequence-to-sequence model to enhance the capabilities of identifying key information in the input document. Experiments done on the MUC-4 and SciREX dataset show new state-of-the-art results on REE (+3.26%), binary RE (+4.8%), and 4-ary RE (+2.7%) in F1 score.
      @inproceedings{huang2021tempgen,
        title = {Document-level Entity-based Extraction as Template Generation},
        author = {Huang, Kung-Hsiang and Tang, Sam and Peng, Nanyun},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
    2. AESOP: Paraphrase Generation with Adaptive Syntactic Control

      Jiao Sun, Xuezhe Ma, and Nanyun Peng, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Code Abstract BibTeX Details
      We propose to control paraphrase generation through carefully chosen target syntactic structures to generate more proper and higher quality paraphrases. Our model, AESOP, leverages a pretrained language model and adds deliberately chosen syntactical control via a retrieval-based selection module to generate fluent paraphrases. Experiments show that AESOP achieves state-of-the-art performances on semantic preservation and syntactic conformation on two benchmark datasets with ground-truth syntactic control from human-annotated exemplars. Moreover, with the retrieval-based target syntax selection module, AESOP generates paraphrases with even better qualities than the current best model using human-annotated target syntactic parses according to human evaluation. We further demonstrate the effectiveness of AESOP to improve classification models’ robustness to syntactic perturbation by data augmentation on two GLUE tasks.
      @inproceedings{sun2021aesop,
        title = {AESOP: Paraphrase Generation with Adaptive Syntactic Control},
        author = {Sun, Jiao and Ma, Xuezhe and Peng, Nanyun},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
    3. ESTER: A Machine Reading Comprehension Dataset for Event Semantic Relation Reasoning

      Rujun Han, I.-Hung Hsu, Jiao Sun, Julia Baylon, Qiang Ning, Dan Roth, and Nanyun Peng, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Code Abstract BibTeX Details
      Understanding how events are semantically related to each other is the essence of reading comprehension. Recent event-centric reading comprehension datasets focus mostly on event arguments or temporal relations. While these tasks partially evaluate machines’ ability of narrative understanding, human-like reading comprehension requires the capability to process event-based information beyond arguments and temporal reasoning. For example, to understand causality between events, we need to infer motivation or purpose; to establish event hierarchy, we need to understand the composition of events. To facilitate these tasks, we introduce ESTER, a comprehensive machine reading comprehension (MRC) dataset for Event Semantic Relation Reasoning. The dataset leverages natural language queries to reason about the five most common event semantic relations, provides more than 6K questions, and captures 10.1K event relation pairs. Experimental results show that the current SOTA systems achieve 22.1%, 63.3% and 83.5% for token-based exact-match (EM), F1 and event-based HIT@1 scores, which are all significantly below human performances (36.0%, 79.6%, 100% respectively), highlighting our dataset as a challenging benchmark.
      @inproceedings{han2021ester,
        title = {ESTER: A Machine Reading Comprehension Dataset for Event Semantic Relation Reasoning},
        author = {Han, Rujun and Hsu, I-Hung and Sun, Jiao and Baylon, Julia and Ning, Qiang and Roth, Dan and Peng, Nanyun},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
    4. ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning

      Rujun Han, Xiang Ren, and Nanyun Peng, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Code Abstract BibTeX Details
      While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle for tasks that require event temporal reasoning, which is essential for event-centric applications. We present a continual pre-training approach that equips PTLMs with targeted knowledge about event temporal relations. We design self-supervised learning objectives to recover masked-out event and temporal indicators and to discriminate sentences from their corrupted counterparts (where event or temporal indicators got replaced). By further pre-training a PTLM with these objectives jointly, we reinforce its attention to event and temporal information, yielding enhanced capability on event temporal reasoning. This Effective CONtinual pre-training framework for Event Temporal reasoning (ECONET) improves the PTLMs’ fine-tuning performances across five relation extraction and question answering tasks and achieves new or on-par state-of-the-art performances in most of our downstream tasks.
      @inproceedings{han2021econet,
        title = {ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning},
        author = {Han, Rujun and Ren, Xiang and Peng, Nanyun},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
    5. Improving Pre-trained Vision-and-Language Embeddings for Phrase Grounding

      Zi-Yi Dou and Nanyun Peng, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), short, 2021.
      Full Text Code Abstract BibTeX Details
      Phrase grounding aims to map textual phrases to their associated image regions, which can be a prerequisite for multimodal reasoning and can benefit tasks requiring identifying objects based on language. With pre-trained vision-and-language models achieving impressive performance across tasks, it remains unclear if we can directly utilize their learned embeddings for phrase grounding without fine-tuning. To this end, we propose a method to extract matched phrase-region pairs from pre-trained vision-and-language embeddings and propose four fine-tuning objectives to improve the model phrase grounding ability using image-caption data without any supervised grounding signals. Experiments on two representative datasets demonstrate the effectiveness of our objectives, outperforming baseline models in both weakly-supervised and supervised phrase grounding settings. In addition, we evaluate the aligned embeddings on several other downstream tasks and show that we can achieve better phrase grounding without sacrificing representation generality.
      @inproceedings{dou2021improving,
        title = {Improving Pre-trained Vision-and-Language Embeddings for Phrase Grounding},
        author = {Dou, Zi-Yi and Peng, Nanyun},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), short},
        year = {2021}
      }
      
      Details
    6. Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

      Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, and Kai-Wei Chang, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Code Abstract BibTeX Details
      Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these multilingual encoders do not precisely align words and phrases across languages. Especially, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to be obtained for low-resource languages. An alternative is to make the multilingual encoders more robust; when fine-tuning the encoder using downstream task, we train the encoder to tolerate noise in the contextual embedding spaces such that even if the representations of different languages are not aligned well, the model can still achieve good performance on zero-shot cross-lingual transfer. In this work, we propose a learning strategy for training robust models by drawing connections between adversarial examples and the failure cases of zero-shot cross-lingual transfer. We adopt two widely used robust training methods, adversarial training and randomized smoothing, to train the desired robust model. The experimental results demonstrate that robust training improves zero-shot cross-lingual transfer on text classification tasks. The improvement is more significant in the generalized cross-lingual transfer setting, where the pair of input sentences belong to two different languages.
      @inproceedings{huang2021improving,
        title = {Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training},
        author = {Huang, Kuan-Hao and Ahmad, Wasi Uddin and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
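
      The following is a minimal, hypothetical sketch of the noise-tolerant fine-tuning idea described in the abstract above; it is not the authors' released implementation. It assumes a HuggingFace-style multilingual encoder whose output exposes last_hidden_state, and the NoisyClassifier name and sigma parameter are illustrative choices rather than details from the paper.

      import torch
      import torch.nn as nn

      class NoisyClassifier(nn.Module):
          """Classification head that perturbs the pooled sentence embedding with
          Gaussian noise during training, so the model learns to tolerate small
          misalignments in the multilingual embedding space (illustrative only)."""

          def __init__(self, encoder, hidden_size, num_labels, sigma=0.1):
              super().__init__()
              self.encoder = encoder        # e.g., a multilingual BERT / XLM-R encoder
              self.head = nn.Linear(hidden_size, num_labels)
              self.sigma = sigma            # scale of the injected Gaussian noise

          def forward(self, input_ids, attention_mask):
              hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
              pooled = hidden[:, 0]         # [CLS]-token representation
              if self.training:
                  # Randomized-smoothing-style perturbation of the contextual embedding.
                  pooled = pooled + self.sigma * torch.randn_like(pooled)
              return self.head(pooled)
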
    7. Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning

      Da Yin, Liunian Harold Li, Ziniu Hu, Nanyun Peng, and Kai-Wei Chang, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
      Full Text Video Code Abstract BibTeX Details
      Commonsense is defined as the knowledge on which everyone agrees. However, certain types of commonsense knowledge are correlated with culture and geographic locations and they are only shared locally. For example, the scenes of wedding ceremonies vary across regions due to different customs influenced by historical and religious factors. Such regional characteristics, however, are generally omitted in prior work. In this paper, we construct a Geo-Diverse Visual Commonsense Reasoning dataset (GD-VCR) to test vision-and-language models’ ability to understand cultural and geo-location-specific commonsense. In particular, we study two state-of-the-art Vision-and-Language models, VisualBERT and ViLBERT, trained on VCR, a standard benchmark with images primarily from Western regions. We then evaluate how well the trained models can generalize to answering the questions in GD-VCR. We find that the performance of both models for non-Western regions, including East Asia, South Asia, and Africa, is significantly lower than that for Western regions. We analyze the reasons behind the performance disparity and find that the performance gap is larger on QA pairs that: 1) are concerned with culture-related scenarios, e.g., weddings, religious activities, and festivals; 2) require high-level geo-diverse commonsense reasoning rather than low-order perception and recognition.
      @inproceedings{yin2021broaden,
        title = {Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning},
        author = {Yin, Da and Li, Liunian Harold and Hu, Ziniu and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2021}
      }
      
      Details
    8. HypoGen: Hyperbole Generation with Commonsense and Counterfactual Knowledge

      Yufei Tian, Arvind Krishna Sridhar, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP, 2021.
      Full Text Video Code Abstract BibTeX Details
      A hyperbole is an intentional and creative exaggeration not to be taken literally. Despite its ubiquity in daily life, the computational explorations of hyperboles are scarce. In this paper, we tackle the under-explored and challenging task: sentence-level hyperbole generation. We start with a representative syntactic pattern for intensification and systematically study the semantic (commonsense and counterfactual) relationships between each component in such hyperboles. We then leverage commonsense and counterfactual inference to generate hyperbole candidates based on our findings from the pattern, and train neural classifiers to rank and select high-quality hyperboles. Automatic and human evaluations show that our generation method is able to generate hyperboles creatively with a high success rate and intensity.
      @inproceedings{tian2021hypogen,
        title = {HypoGen: Hyperbole Generation with Commonsense and Counterfactual Knowledge},
        author = {Tian, Yufei and Sridhar, Arvind Krishna and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP},
        year = {2021}
      }
      
      Details
    9. HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning

      Mingyu Derek Ma, Muhao Chen, Te-Lin Wu, and Nanyun Peng, in Findings of the Association for Computational Linguistics: EMNLP, 2021.
      Full Text Slides Video Code Abstract BibTeX Details
      Taxonomies are valuable resources for many applications, but the limited coverage due to the expensive manual curation process hinders their general applicability. Prior works attempt to automatically expand existing taxonomies to improve their coverage by learning concept embeddings in Euclidean space, while taxonomies, inherently hierarchical, more naturally align with the geometric properties of a hyperbolic space. In this paper, we present HyperExpan, a taxonomy expansion algorithm that seeks to preserve the structure of a taxonomy in a more expressive hyperbolic embedding space and learn to represent concepts and their relations with a Hyperbolic Graph Neural Network (HGNN). Specifically, HyperExpan leverages position embeddings to exploit the structure of the existing taxonomies, and characterizes the concept profile information to support the inference on unseen concepts during training. Experiments show that our proposed HyperExpan outperforms baseline models with representation learning in a Euclidean feature space and achieves state-of-the-art performance on the taxonomy expansion benchmarks.
      @inproceedings{ma2021hyperexpan,
        title = {HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning},
        author = {Ma, Mingyu Derek and Chen, Muhao and Wu, Te-Lin and Peng, Nanyun},
        booktitle = {Findings of the Association for Computational Linguistics: EMNLP},
        year = {2021}
      }
      
      Details
    10. Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia

      Jiao Sun and Nanyun Peng, in Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
      Full Text Code Abstract BibTeX Details 🏆 Best Paper Nomination
      Human activities can be seen as sequences of events, which are crucial to understanding societies. Disproportional event distribution for different demographic groups can manifest and amplify social stereotypes, and potentially jeopardize the ability of members in some groups to pursue certain goals. In this paper, we present the first event-centric study of gender biases in a Wikipedia corpus. To facilitate the study, we curate a corpus of career and personal life descriptions with demographic information consisting of 7,854 fragments from 10,412 celebrities. Then we detect events with a state-of-the-art event detection model, calibrate the results using strategically generated templates, and extract events that have asymmetric associations with genders. Our study discovers that Wikipedia pages tend to intermingle personal life events with professional events for females but not for males, which calls for the awareness of the Wikipedia community to formalize guidelines and train the editors to mind the implicit biases that contributors carry. Our work also lays the foundation for future works on quantifying and discovering event biases at the corpus level.
      @inproceedings{sun2021men,
        title = {Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia},
        author = {Sun, Jiao and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2021}
      }
      
      Details
    11. Societal Biases in Language Generation: Progress and Challenges

      Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng, in Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
      Full Text Abstract BibTeX Details
      Technology for language generation has advanced rapidly, spurred by advancements in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner. While techniques can effectively generate fluent text, they can also produce undesirable societal biases that can have a disproportionately negative impact on marginalized populations. Language generation presents unique challenges for biases in terms of direct user interaction and the structure of decoding techniques. To better understand these challenges, we present a survey on societal biases in language generation, focusing on how data and techniques contribute to biases and progress towards reducing biases. Motivated by a lack of studies on biases from decoding techniques, we also conduct experiments to quantify the effects of these techniques. By further discussing general trends and open challenges, we call to attention promising directions for research and the importance of fairness and inclusivity considerations for language generation applications.
      @inproceedings{sheng2021societal,
        title = {Societal Biases in Language Generation: Progress and Challenges},
        author = {Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2021}
      }
      
      Details
    12. Metaphor Generation with Conceptual Mappings

      Kevin Stowe, Tuhin Chakrabarty, Nanyun Peng, Smaranda Muresan, and Iryna Gurevych, in Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021.
      Full Text Code Abstract BibTeX Details
      Generating metaphors is a difficult task as it requires understanding nuanced relationships between abstract concepts. In this paper, we aim to generate a metaphoric sentence given a literal expression by replacing relevant verbs. Guided by conceptual metaphor theory, we propose to control the generation process by encoding conceptual mappings between cognitive domains to generate meaningful metaphoric expressions. To achieve this, we develop two methods: 1) using FrameNet-based embeddings to learn mappings between domains and applying them at the lexical level (CM-Lex), and 2) deriving source/target pairs to train a controlled seq-to-seq generation model (CM-BART). We assess our methods through automatic and human evaluation for basic metaphoricity and conceptual metaphor presence. We show that the unsupervised CM-Lex model is competitive with recent deep learning metaphor generation systems, and CM-BART outperforms all other models both in automatic and human evaluations.
      @inproceedings{stowe2021metaphor,
        title = {Metaphor Generation with Conceptual Mappings},
        author = {Stowe, Kevin and Chakrabarty, Tuhin and Peng, Nanyun and Muresan, Smaranda and Gurevych, Iryna},
        booktitle = {Proceedings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2021}
      }
      
      Details
    13. COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences

      Shikhar Singh, Nuan Wen, Yu Hou, Pegah Alipoormolabashi, Te-lin Wu, Xuezhe Ma, and Nanyun Peng, in Proceedings of Findings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL-Findings), 2021.
      Full Text Code Abstract BibTeX Details
      Commonsense reasoning is intuitive for humans but has been a long-term challenge for artificial intelligence (AI). Recent advancements in pretrained language models have shown promising results on several commonsense benchmark datasets. However, the reliability and comprehensiveness of these benchmarks towards assessing a model’s commonsense reasoning ability remain unclear. To this end, we introduce a new commonsense reasoning benchmark dataset comprising natural language true/false statements, with each sample paired with its complementary counterpart, resulting in 4k sentence pairs. We propose a pairwise accuracy metric to reliably measure an agent’s ability to perform commonsense reasoning over a given situation. The dataset is crowdsourced and enhanced with an adversarial model-in-the-loop setup to incentivize challenging samples. To facilitate a systematic analysis of commonsense capabilities, we design our dataset along the dimensions of knowledge domains, reasoning scenarios and numeracy. Experimental results demonstrate that our strongest baseline (UnifiedQA-3B), after fine-tuning, achieves ~71% standard accuracy and ~51% pairwise accuracy, well below human performance (~95% for both metrics).
      @inproceedings{sw2021com,
        title = {COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences},
        author = {Singh, Shikhar and Wen, Nuan and Hou, Yu and Alipoormolabashi, Pegah and Wu, Te-lin and Ma, Xuezhe and Peng, Nanyun},
        booktitle = {Proceedings of Findings of the Conference of the 59th Annual Meeting of the Association for Computational Linguistics (ACL-Findings)},
        year = {2021}
      }
      
      Details
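
      As a rough illustration of the pairwise accuracy metric mentioned in the COM2SENSE abstract above (my reading of the metric, not the authors' evaluation script): a complementary pair only counts as correct when both of its statements are judged correctly, which makes the metric stricter than standard accuracy.

      def standard_accuracy(preds, golds):
          # Fraction of individual statements labeled correctly.
          return sum(p == g for p, g in zip(preds, golds)) / len(golds)

      def pairwise_accuracy(pair_preds, pair_golds):
          # Fraction of complementary pairs where BOTH statements are labeled correctly.
          return sum(p == g for p, g in zip(pair_preds, pair_golds)) / len(pair_golds)

      # Toy example: two complementary pairs; the second pair has one wrong prediction.
      gold = [(True, False), (False, True)]
      pred = [(True, False), (False, False)]
      print(standard_accuracy([x for pair in pred for x in pair],
                              [x for pair in gold for x in pair]))  # 0.75
      print(pairwise_accuracy(pred, gold))                          # 0.5
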
    14. "Nice Try, Kiddo": Ad Hominems in Dialogue Systems

      Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng, in The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
      Full Text Video Code Abstract BibTeX Details
      Ad hominem attacks are those that attack some feature of a person’s character instead of the position the person is maintaining. As a form of toxic and abusive language, ad hominems contain harmful language that could further amplify the skew of power inequality for marginalized populations. Since dialogue systems are designed to respond directly to user input, it is important to study ad hominems in these system responses. In this work, we propose categories of ad hominems that allow us to analyze human and dialogue system responses to Twitter posts. We specifically compare responses to Twitter posts about marginalized communities (#BlackLivesMatter, #MeToo) and other topics (#Vegan, #WFH). Furthermore, we propose a constrained decoding technique that uses salient n-gram similarity to apply soft constraints to top-k sampling and can decrease the amount of ad hominems generated by dialogue systems. Our results indicate that 1) responses composed by both humans and DialoGPT contain more ad hominems for discussions around marginalized communities versus other topics, 2) different amounts of ad hominems in the training data can influence the likelihood of the model generating ad hominems, and 3) we can thus carefully choose training data and use constrained decoding techniques to decrease the amount of ad hominems generated by dialogue systems.
      @inproceedings{sheng2021nice,
        title = {"Nice Try, Kiddo": Ad Hominems in Dialogue Systems},
        author = {Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        publisher = {Association for Computational Linguistics},
        pages = {750--767},
        year = {2021}
      }
      
      Details
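
      Below is a hypothetical sketch of how a salient-n-gram soft constraint could be folded into top-k sampling, in the spirit of the constrained decoding described in the entry above; the n-gram list, penalty weight, and function names are made up for illustration and are not taken from the paper's code.

      import math
      import random

      def bigrams(tokens):
          return {tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)}

      def constrained_topk_step(topk_candidates, context, salient_ngrams, penalty=2.0):
          """topk_candidates: list of (token, probability) from a language model.
          Candidates whose continuation overlaps with flagged (e.g., ad hominem)
          bigrams are softly down-weighted before sampling."""
          reweighted = []
          for token, prob in topk_candidates:
              overlap = len(bigrams(context + [token]) & salient_ngrams)
              reweighted.append((token, prob * math.exp(-penalty * overlap)))
          total = sum(p for _, p in reweighted)
          draw, cumulative = random.random() * total, 0.0
          for token, prob in reweighted:
              cumulative += prob
              if cumulative >= draw:
                  return token
          return reweighted[-1][0]

      salient = {("nice", "try"), ("you", "idiot")}
      print(constrained_topk_step([("try", 0.5), ("point", 0.5)], ["nice"], salient))
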
    15. Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation

      Sarik Ghazarian, Zixi Liu, Akash S. M, Ralph Weischedel, Aram Galstyan, and Nanyun Peng, in The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
      Full Text Slides Code Abstract BibTeX Details
      With the recent advances of open-domain story generation models, the lack of reliable automatic evaluation metrics becomes an increasingly imperative issue that hinders the development of such models. A critical bottleneck of obtaining a trustworthy learnable evaluation metric is the lack of high-quality training data for learning classifiers to efficiently distinguish between plausible and implausible machine-generated stories. Previous works relied on heuristically manipulating plausible examples to mimic possible system drawbacks such as repetition, contradiction, or irrelevant content at the text level, which can be unnatural and oversimplify the characteristics of implausible machine-generated stories. We propose to tackle these issues by generating a more comprehensive set of implausible stories using plots, which are structured representations of controllable factors used to generate stories. Since these plots are compact and structured, it is easier to manipulate them to generate text with targeted undesirable properties, while at the same time maintaining the naturalness of the generation. To improve the quality of incoherent stories, we further apply an adversarial filtering procedure to select a more nuanced set of implausible texts. We find that the evaluation metrics trained on our generated data result in more reliable automatic assessments that correlate remarkably better with human judgments than other baselines.
      @inproceedings{ghazarian2021plot,
        title = {Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation},
        author = {Ghazarian, Sarik and Liu, Zixi and M, Akash S and Weischedel, Ralph and Galstyan, Aram and Peng, Nanyun},
        booktitle = {The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        publisher = {Association for Computational Linguistics},
        pages = {4334--4344},
        year = {2021}
      }
      
      Details
    16. MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding

      Tuhin Chakrabarty, Xurui Zhang, Smaranda Muresan, and Nanyun Peng, in The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.
      Full Text Poster Code Abstract BibTeX Details
      Generating metaphors is a challenging task as it requires a proper understanding of abstract concepts, making connections between unrelated concepts, and deviating from the literal meaning. In this paper, we aim to generate a metaphoric sentence given a literal expression by replacing relevant verbs. Based on a theoretically-grounded connection between metaphors and symbols, we propose a method to automatically construct a parallel corpus by transforming a large number of metaphorical sentences from the Gutenberg Poetry corpus to their literal counterpart using recent advances in masked language modeling coupled with commonsense inference. For the generation task, we incorporate a metaphor discriminator to guide the decoding of a sequence to sequence model fine-tuned on our parallel data to generate high-quality metaphors. Human evaluation on an independent test set of literal statements shows that our best model generates metaphors better than three well-crafted baselines 66% of the time on average. A task-based evaluation shows that human-written poems enhanced with metaphors proposed by our model are preferred 68% of the time compared to poems without metaphors.
      @inproceedings{chakrabarty2021mermaid,
        title = {MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding},
        author = {Chakrabarty, Tuhin and Zhang, Xurui and Muresan, Smaranda and Peng, Nanyun},
        booktitle = {The 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        talk_url = {https://underline.io/events/122/sessions/4240/lecture/19642-mermaid-metaphor-generation-with-symbolism-and-discriminative-decoding},
        year = {2021}
      }
      
      Details
    17. DiSCoL: Toward Engaging Dialogue Systems through Conversational Line Guided Response Generation

      Sarik Ghazarian, Zixi Liu, Tuhin Chakrabarty, Xuezhe Ma, Aram Galstyan, and Nanyun Peng, in 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Demonstrations Track, 2021.
      Full Text Code Abstract BibTeX Details
      Having engaging and informative conversations with users is the utmost goal for open-domain conversational systems. Recent advances in transformer-based language models and their applications to dialogue systems have succeeded in generating fluent and human-like responses. However, they still lack control over the generation process towards producing contentful responses and achieving engaging conversations. To achieve this goal, we present DiSCoL (Dialogue Systems through Conversational Line guided response generation). DiSCoL is an open-domain dialogue system that leverages conversational lines (briefly, convlines) as controllable and informative content-planning elements to guide the generation model to produce engaging and informative responses. Two primary modules in DiSCoL’s pipeline are conditional generators trained for 1) predicting relevant and informative convlines for dialogue contexts and 2) generating high-quality responses conditioned on the predicted convlines. Users can also change the returned convlines to control the direction of the conversations towards topics that are more interesting to them. Through automatic and human evaluations, we demonstrate the efficiency of the convlines in producing engaging conversations.
      @inproceedings{ghazarian2021discol,
        title = {DiSCoL: Toward Engaging Dialogue Systems through Conversational Line Guided Response Generation},
        author = {Ghazarian, Sarik and Liu, Zixi and Chakrabarty, Tuhin and Ma, Xuezhe and Galstyan, Aram and Peng, Nanyun},
        booktitle = {2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Demonstrations Track},
        pages = {26--34},
        publisher = {Association for Computational Linguistics},
        year = {2021}
      }
      
      Details
    18. EventPlus: A Temporal Event Understanding Pipeline

      Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, and Nanyun Peng, in 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Demonstrations Track, 2021.
      Full Text Slides Poster Video Code Abstract BibTeX Details
      We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction. Event information, especially event temporal knowledge, is a type of common sense knowledge that helps people understand how stories evolve and provides predictive hints for future events. EventPlus, as the first comprehensive temporal event understanding pipeline, provides a convenient tool for users to quickly obtain annotations about events and their temporal information for any user-provided document. Furthermore, we show EventPlus can be easily adapted to other domains (e.g., the biomedical domain). We make EventPlus publicly available to facilitate event-related information extraction and downstream applications.
      @inproceedings{ma2021eventplus,
        title = {EventPlus: A Temporal Event Understanding Pipeline},
        author = {Ma, Mingyu Derek and Sun, Jiao and Yang, Mu and Huang, Kung-Hsiang and Wen, Nuan and Singh, Shikhar and Han, Rujun and Peng, Nanyun},
        booktitle = {2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Demonstrations Track},
        year = {2021}
      }
      
      Details
    19. Identifying Distributional Perspective Differences from Colingual Groups

      Yufei Tian, Tuhin Chakrabarty, Fred Morstatter, and Nanyun Peng, in NAACL 2021 Workshop of Social NLP, 2021.
      Full Text Code Abstract BibTeX Details
      Perspective differences exist among different cultures or languages. A lack of mutual understanding among different groups about their perspectives on specific values or events may lead to uninformed decisions or biased opinions. Automatically understanding group perspectives can provide essential background for many downstream applications of natural language processing techniques. In this paper, we study colingual groups and use language corpora as a proxy to identify their distributional perspectives. We present a novel computational approach to learning shared understandings, and benchmark our method by building culturally-aware models for the English, Chinese, and Japanese languages. On a held-out set of diverse topics including marriage, corruption, and democracy, our model achieves high correlation with human judgements regarding intra-group values and inter-group differences.
      @inproceedings{tian2021identifying,
        title = {Identifying Distributional Perspective Differences from Colingual Groups},
        author = {Tian, Yufei and Chakrabarty, Tuhin and Morstatter, Fred and Peng, Nanyun},
        booktitle = {NAACL 2021 Workshop of Social NLP},
        year = {2021}
      }
      
      Details
    20. Document-level Event Extraction with Efficient End-to-end Learning of Cross-event Dependencies

      Kung-Hsiang Huang and Nanyun Peng, in The 3rd Workshop on Narrative Understanding (NAACL 2021), 2021.
      Full Text Abstract BibTeX Details
      Fully understanding narratives often requires identifying events in the context of whole documents and modeling the event relations. However, document-level event extraction is a challenging task, as it requires the extraction of event and entity coreference and capturing arguments that span across different sentences. Existing works on event extraction usually confine themselves to extracting events from single sentences, which fails to capture the relationships between event mentions at the scale of a document, as well as event arguments that appear in a different sentence than the event trigger. In this paper, we propose an end-to-end model leveraging Deep Value Networks (DVN), a structured prediction algorithm, to efficiently capture cross-event dependencies for document-level event extraction. Experimental results show that our approach achieves comparable performance to CRF-based models on ACE05, while enjoying significantly higher computational efficiency.
      @inproceedings{huang2021document,
        title = {Document-level Event Extraction with Efficient End-to-end Learning of Cross-event Dependencies},
        author = {Huang, Kung-Hsiang and Peng, Nanyun},
        booktitle = {The 3rd Workshop on Narrative Understanding (NAACL 2021)},
        year = {2021}
      }
      
      Details
    21. Discourse Tagging for Scientific Evidence Extraction

      Xiangci Li, Gully Burns, and Nanyun Peng, in The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021.
      Full Text Code Abstract BibTeX Details
      Evidence plays a crucial role in any biomedical research narrative, providing justification for some claims and refutation for others. We seek to build models of scientific argument using information extraction methods applied to full-text papers. We present the capability of automatically extracting text fragments from primary research papers that describe the evidence presented in that paper’s figures, which arguably provides the raw material of any scientific argument made within the paper. We apply richly contextualized deep representation learning, pre-trained on a biomedical domain corpus, to the analysis of scientific discourse structures and the extraction of "evidence fragments" (i.e., the text in the results section describing data presented in a specified subfigure) from a set of biomedical experimental research articles. We first demonstrate our state-of-the-art scientific discourse tagger on two scientific discourse tagging datasets and its transferability to new datasets. We then show the benefit of leveraging scientific discourse tags for downstream tasks such as claim extraction and evidence fragment detection. Our work demonstrates the potential of using evidence fragments derived from figure spans for improving the quality of scientific claims by cataloging, indexing and reusing evidence fragments as independent documents.
      @inproceedings{li2021discourse,
        title = {Discourse Tagging for Scientific Evidence Extraction},
        author = {Li, Xiangci and Burns, Gully and Peng, Nanyun},
        booktitle = {The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL)},
        year = {2021}
      }
      
      Details
    22. MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification

      Te-Lin Wu, Shikhar Singh, Sayan Paul, Gully Burns, and Nanyun Peng, in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021.
      Full Text Code Abstract BibTeX Details
      We introduce a new dataset, MELINDA, for Multimodal Biomedical Experiment Method Classification. The dataset is collected in a fully automated distant supervision manner, where the labels are obtained from an existing curated database, and the actual contents are extracted from papers associated with each of the records in the database. We benchmark various state-of-the-art NLP and computer vision models, including unimodal models which only take either caption texts or images as inputs, and multimodal models. Our extensive experimental results show that multimodal models, despite outperforming other benchmarked models, require certain improvements, especially a less-supervised way of grounding visual concepts with languages, and better transfer learning for low-resource tasks. We release our dataset and the benchmarks to facilitate future research in multimodal learning, especially to motivate targeted improvements for applications in scientific domains.
      @inproceedings{wu2021melinda,
        title = {MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification},
        author = {Wu, Te-Lin and Singh, Shikhar and Paul, Sayan and Burns, Gully and Peng, Nanyun},
        booktitle = {The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)},
        year = {2021}
      }
      
      Details
    23. GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction

      Wasi Ahmad, Nanyun Peng, and Kai-Wei Chang, in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021.
      Full Text Code Abstract BibTeX Details
      Prevalent approaches in cross-lingual relation and event extraction use graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic representations such that models trained on one language can be applied to other languages. However, GCNs fall short in modeling long-range dependencies or disconnected words in the dependency tree. To address this challenge, we propose to utilize the self-attention mechanism where we explicitly fuse structural information to learn the dependencies between words at different syntactic distances. We introduce GATE, a Graph Attention Transformer Encoder, and test its cross-lingual transferability on relation and event extraction tasks. We perform rigorous experiments on the widely used ACE05 dataset that includes three typologically different languages: English, Chinese, and Arabic. The evaluation results show that GATE outperforms three recently proposed methods by a large margin. Our detailed analysis reveals that due to the reliance on syntactic dependencies, GATE produces robust representations that facilitate transfer across languages.
      @inproceedings{ahmad2021gate,
        author = {Ahmad, Wasi and Peng, Nanyun and Chang, Kai-Wei},
        title = {GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction},
        booktitle = {The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)},
        year = {2021}
      }
      
      Details
    24. A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification

      Xiangci Li, Gully Burns, and Nanyun Peng, in Scientific Document Understanding Workshop at the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021.
      Full Text Code Abstract BibTeX Details
      Even for domain experts, it is a non-trivial task to verify a scientific claim by providing supporting or refuting evidence rationales. The situation worsens as misinformation proliferates on social media and news websites, manually or programmatically, at every moment. As a result, an automatic fact-verification tool becomes crucial for combating the spread of misinformation. In this work, we propose a novel, paragraph-level, multi-task learning model for the SciFact task by directly computing a sequence of contextualized sentence embeddings from a BERT model and jointly training the model on rationale selection and stance prediction.
      @inproceedings{li2021paragraph,
        title = {A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification},
        author = {Li, Xiangci and Burns, Gully and Peng, Nanyun},
        booktitle = {Scientific Document Understanding Workshop at the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)},
        year = {2021}
      }
      
      Details

    2020

    1. Content Planning for Neural Story Generation with Aristotelian Rescoring

      Seraphina Goldfarb-Tarrant, Tuhin Chakrabarty, Ralph Weischedel, and Nanyun Peng, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
      Full Text Slides Code Abstract BibTeX Details
      Long-form narrative text generated from large language models manages a fluent impersonation of human writing, but only at the local sentence level, and lacks structure or global cohesion. We posit that many of the problems of story generation can be addressed via high-quality content planning, and present a system that focuses on how to learn good plot structures to guide story generation. We utilize a plot-generation language model along with an ensemble of rescoring models that each implement an aspect of good story-writing as detailed in Aristotle’s Poetics. We find that stories written with our more principled plot structure are both more relevant to a given prompt and higher quality than baselines that do not content plan, or that plan in an unprincipled way.
      @inproceedings{goldfarb2020content,
        title = {Content Planning for Neural Story Generation with Aristotelian Rescoring},
        author = {Goldfarb-Tarrant, Seraphina and Chakrabarty, Tuhin and Weischedel, Ralph and Peng, Nanyun},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        pages = {4319--4338},
        slideslive_id = {38939240},
        year = {2020}
      }
      
      Details
    2. Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation

      Tuhin Chakrabarty, Smaranda Muresan, and Nanyun Peng, in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
      Full Text Slides Code Abstract BibTeX Details
      Literary tropes, from poetry to stories, are at the crux of human imagination and communication. Figurative language, such as a simile, goes beyond plain expressions to give readers new insights and inspirations. We tackle the problem of simile generation. Generating a simile requires proper understanding for effective mapping of properties between two concepts. To this end, we first propose a method to automatically construct a parallel corpus by transforming a large number of similes collected from Reddit to their literal counterpart using structured common sense knowledge. We then fine-tune a pretrained sequence to sequence model, BART (Lewis et al., 2019), on the literal-simile pairs to generate novel similes given a literal sentence. Experiments show that our approach generates 88% novel similes that do not share properties with the training data. Human evaluation on an independent set of literal statements shows that our model generates similes better than two literary experts 37% of the time, and better than three baseline systems, including a recent metaphor generation model, 71% of the time when compared pairwise. We also show how replacing literal sentences with similes from our best model in machine generated stories improves evocativeness and leads to better acceptance by human judges.
      @inproceedings{chakrabarty-etal-2020-generating,
        title = {Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation},
        author = {Chakrabarty, Tuhin and Muresan, Smaranda and Peng, Nanyun},
        booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        pages = {6455--6469},
        publisher = {Association for Computational Linguistics},
        slideslive_id = {38938962},
        year = {2020}
      }
      
      Details
    3. Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction

      Rujun Han, Yichao Zhou, and Nanyun Peng, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
      Full Text Slides Code Abstract BibTeX Details
      Extracting event temporal relations is a critical task for information extraction and plays an important role in natural language understanding. Prior systems leverage deep learning and pre-trained language models to improve the performance of the task. However, these systems often suffer from two shortcomings: 1) when performing maximum a posteriori (MAP) inference based on neural models, previous systems only used structured knowledge that is assumed to be absolutely correct, i.e., hard constraints; 2) biased predictions on dominant temporal relations when training with a limited amount of data. To address these issues, we propose a framework that enhances deep neural networks with distributional constraints constructed by probabilistic domain knowledge. We solve the constrained inference problem via Lagrangian Relaxation and apply it to end-to-end event temporal relation extraction tasks. Experimental results show our framework is able to improve the baseline neural network models with strong statistical significance on two widely used datasets in news and clinical domains.
      @inproceedings{han2020knowledge,
        title = {Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction},
        author = {Han, Rujun and Zhou, Yichao and Peng, Nanyun},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        publisher = {Association for Computational Linguistics},
        pages = {5717--5729},
        slideslive_id = {38939236},
        year = {2020}
      }
      
      Details
    4. TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions

      Qiang Ning, Hao Wu, Rujun Han, Nanyun Peng, Matt Gardner, and Dan Roth, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
      Full Text Code Abstract BibTeX Details
      A critical part of reading is being able to understand the temporal relationships between events described in a passage of text, even when those relationships are not explicitly stated. However, current machine reading comprehension benchmarks have practically no questions that test temporal phenomena, so systems trained on these benchmarks have no capacity to answer questions such as "what happened before/after [some event]?" We introduce TORQUE, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships. Results show that RoBERTa-large achieves an exact-match score of 51% on the test set of TORQUE, about 30% behind human performance.
      @inproceedings{ning2020torque,
        title = {TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions},
        author = {Ning, Qiang and Wu, Hao and Han, Rujun and Peng, Nanyun and Gardner, Matt and Roth, Dan},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        publisher = {Association for Computational Linguistics},
        pages = {1158--1172},
        slideslive_id = {38938807},
        year = {2020}
      }
      
      Details
    5. STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation

      Nader Akoury, Shufan Wang, Josh Whiting, Stephen Hood, Nanyun Peng, and Mohit Iyyer, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.
      Full Text Code Abstract BibTeX Details
      Systems for story generation are asked to produce plausible and enjoyable stories given an input context. This task is underspecified, as a vast number of diverse stories can originate from a single input. The large output space makes it difficult to build and evaluate story generation models, as (1) existing datasets lack rich enough contexts to meaningfully guide models, and (2) existing evaluations (both crowdsourced and automatic) are unreliable for assessing long-form creative text. To address these issues, we introduce a dataset and evaluation platform built from STORIUM, an online collaborative storytelling community. Our author-generated dataset contains 6K lengthy stories (125M tokens) with fine-grained natural language annotations (e.g., character goals and attributes) interspersed throughout each narrative, forming a robust source for guiding models. We evaluate language models fine-tuned on our dataset by integrating them onto STORIUM, where real authors can query a model for suggested story continuations and then edit them. Automatic metrics computed over these edits correlate well with both user ratings of generated stories and qualitative feedback from semi-structured user interviews. We release both the STORIUM dataset and evaluation platform to spur more principled research into story generation.
      @inproceedings{akoury2020storium,
        title = {STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation},
        author = {Akoury, Nader and Wang, Shufan and Whiting, Josh and Hood, Stephen and Peng, Nanyun and Iyyer, Mohit},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        slideslive_id = {38939010},
        year = {2020}
      }
      
      Details
    6. Towards Controllable Biases in Language Generation

      Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings, long, 2020.
      Full Text Poster Code Abstract BibTeX Details
      We present a general approach towards controllable societal biases in natural language generation (NLG). Building upon the idea of adversarial triggers, we develop a method to induce societal biases in generated text when input prompts contain mentions of specific demographic groups. We then analyze two scenarios: 1) inducing negative biases for one demographic and positive biases for another demographic, and 2) equalizing biases between demographics. The former scenario enables us to detect the types of biases present in the model. Specifically, we show the effectiveness of our approach at facilitating bias analysis by finding topics that correspond to demographic inequalities in generated text and comparing the relative effectiveness of inducing biases for different demographics. The second scenario is useful for mitigating biases in downstream applications such as dialogue generation. In our experiments, the mitigation technique proves to be effective at equalizing the amount of biases across demographics while simultaneously generating less negatively biased text overall.
      @inproceedings{sheng2020towards,
        title = {Towards Controllable Biases in Language Generation},
        author = {Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings, long},
        year = {2020}
      }
      
      Details
    7. Biomedical Event Extraction with Hierarchical Knowledge Graphs

      Kung-Hsiang Huang, Mu Yang, and Nanyun Peng, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings, short, 2020.
      Full Text Slides Code Abstract BibTeX Details
      Biomedical event extraction is critical in understanding biomolecular interactions described in scientific corpora. One of the main challenges is to identify nested structured events that are associated with non-indicative trigger words. We propose to incorporate domain knowledge from the Unified Medical Language System (UMLS) into a pre-trained language model via a hierarchical graph representation encoded by the proposed Graph Edge-conditioned Attention Networks (GEANet). To better recognize the trigger words, each sentence is first grounded to a sentence graph based on a jointly modeled hierarchical knowledge graph from UMLS. The grounded graphs are then propagated by GEANet, a novel graph neural network with enhanced capabilities for inferring complex events. On the BioNLP 2011 GENIA Event Extraction task, our approach achieved 1.41% F1 and 3.19% F1 improvements on all events and complex events, respectively. Ablation studies confirm the importance of GEANet and the hierarchical KG.
      @inproceedings{huang2020event,
        title = {Biomedical Event Extraction with Hierarchical Knowledge Graphs},
        author = {Huang, Kung-Hsiang and Yang, Mu and Peng, Nanyun},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings, short},
        slideslive_id = {38940169},
        year = {2020}
      }
      
      Details
    8. Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering

      Peifeng Wang, Nanyun Peng, Filip Ilievski, Pedro Szekely, and Xiang Ren, in the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings, 2020.
      Full Text Code Abstract BibTeX Details
      Commonsense question answering (QA) requires background knowledge which is not explicitly stated in a given context. Prior works use commonsense knowledge graphs (KGs) to obtain this knowledge for reasoning. However, relying entirely on these KGs may not suffice, considering their limited coverage and the contextual dependence of their knowledge. In this paper, we augment a general commonsense QA framework with a knowledgeable path generator. By extrapolating over existing paths in a KG with a state-of-the-art language model, our generator learns to connect a pair of entities in text with a dynamic, and potentially novel, multi-hop relational path. Such paths can provide structured evidence for solving commonsense questions without fine-tuning the path generator. Experiments on two datasets show the superiority of our method over previous works which fully rely on knowledge from KGs (with up to 6% improvement in accuracy), across various amounts of training data. Further evaluation suggests that the generated paths are typically interpretable, novel, and relevant to the task.
      @inproceedings{wang2020connecting,
        title = {Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering},
        author = {Wang, Peifeng and Peng, Nanyun and Ilievski, Filip and Szekely, Pedro and Ren, Xiang},
        booktitle = {the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)-Findings},
        pages = {4129--4140},
        year = {2020}
      }
      
      Details
    9. R3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge

      Tuhin Chakrabarty, Debanjan Ghosh, Smaranda Muresan, and Nanyun Peng, in the 2020 Annual Conference of the Association for Computational Linguistics (ACL), 2020.
      Full Text Code BibTeX Details
      @inproceedings{chakrabarty2020r,
        title = {R3: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge},
        author = {Chakrabarty, Tuhin and Ghosh, Debanjan and Muresan, Smaranda and Peng, Nanyun},
        booktitle = {the 2020 Annual Conference of the Association for Computational Linguistics (ACL)},
        year = {2020}
      }
      
      Details
    10. Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

      Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, and Nanyun Peng, in The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020.
      Full Text Code Abstract BibTeX Details
      User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement by using heuristically constructed features such as the number of turns and the total time of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement on assessing utterance-level engagement scores; (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that the utterance-level engagement scores can be learned from data. These scores can be incorporated into automatic evaluation metrics for open-domain dialogue systems to improve the correlation with human judgements. This suggests that predictive engagement can be used as a real-time feedback for training better dialogue models.
      @inproceedings{ghazarian2020predictive,
        title = {Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems},
        author = {Ghazarian, Sarik and Weischedel, Ralph and Galstyan, Aram and Peng, Nanyun},
        booktitle = {The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)},
        pages = {7789--7796},
        year = {2020}
      }
      
      Details
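
      A toy sketch of the aggregation idea in the abstract above: a conversation-level engagement score predicted from utterance-level scores. The utterance scores here are given directly and simple averaging stands in for the learned aggregation; neither is the paper's actual model.

      from statistics import mean

      def conversation_engagement(utterance_scores):
          """Aggregate utterance-level engagement scores (in [0, 1]) into a single
          conversation-level score; mean aggregation is used for illustration."""
          return mean(utterance_scores)

      print(round(conversation_engagement([0.2, 0.8, 0.6]), 3))  # 0.533
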
    11. Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training

      Alexander Spangher, Nanyun Peng, Jonathan May, and Emilio Ferrara, in ACL 2020 Workshop on Natural Language Processing for COVID-19 (NLP-COVID), 2020.
      Full Text BibTeX Details
      @inproceedings{spangher2020enabling,
        title = {Enabling Low-Resource Transfer Learning across COVID-19 Corpora by Combining Event-Extraction and Co-Training},
        author = {Spangher, Alexander and Peng, Nanyun and May, Jonathan and Ferrara, Emilio},
        booktitle = {ACL 2020 Workshop on Natural Language Processing for COVID-19 (NLP-COVID)},
        year = {2020}
      }
      
      Details
    12. Man is to person as woman is to location: Measuring gender bias in named entity recognition

      Ninareh Mehrabi, Thamme Gowda, Fred Morstatter, Nanyun Peng, and Aram Galstyan, in 31st ACM Conference on Hypertext and Social Media (HT’20), 2020.
      Full Text BibTeX Details
      @inproceedings{mehrabi2020man,
        title = {Man is to person as woman is to location: Measuring gender bias in named entity recognition},
        author = {Mehrabi, Ninareh and Gowda, Thamme and Morstatter, Fred and Peng, Nanyun and Galstyan, Aram},
        booktitle = {31st ACM Conference on Hypertext and Social Media (HT’20)},
        year = {2020}
      }
      
      Details

    2019

    1. Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction

      Rujun Han, Qiang Ning, and Nanyun Peng, in 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
      Full Text Poster Code BibTeX Details
      @inproceedings{han2019joint,
        title = {Joint Event and Temporal Relation Extraction with Shared Representations and Structured Prediction},
        author = {Han, Rujun and Ning, Qiang and Peng, Nanyun},
        booktitle = {2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2019}
      }
      
      Details
    2. The Woman Worked as a Babysitter: On Biases in Language Generation

      Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, and Nanyun Peng, in 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), short, 2019.
      Full Text BibTeX Details
      @inproceedings{sheng2019woman,
        title = {The Woman Worked as a Babysitter: On Biases in Language Generation},
        author = {Sheng, Emily and Chang, Kai-Wei and Natarajan, Premkumar and Peng, Nanyun},
        booktitle = {2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), short},
        year = {2019}
      }
      
      Details
    3. Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing

      Tao Meng, Nanyun Peng, and Kai-Wei Chang, in 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
      Full Text BibTeX Details
      @inproceedings{meng2019target,
        title = {Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing},
        author = {Meng, Tao and Peng, Nanyun and Chang, Kai-Wei},
        booktitle = {2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
        year = {2019}
      }
      
      Details
    4. What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

      Xiaolei Huang, Jonathan May, and Nanyun Peng, in 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), short, 2019.
      Full Text BibTeX Details
      @inproceedings{huang2019matters,
        title = {What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis},
        author = {Huang, Xiaolei and May, Jonathan and Peng, Nanyun},
        booktitle = {2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), short},
        year = {2019}
      }
      
      Details
    5. Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects

      James Mullenbach, Jonathan Gordon, Nanyun Peng, and Jonathan May, in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), short, 2019.
      Full Text BibTeX Details
      @inproceedings{mullenbach2019nuclear,
        title = {Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects},
        author = {Mullenbach, James and Gordon, Jonathan and Peng, Nanyun and May, Jonathan},
        booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), short},
        pages = {6054--6060},
        year = {2019}
      }
      
      Details
    6. Deep Structured Neural Network for Event Temporal Relation Extraction

      Rujun Han, I-Hung Hsu, Mu Yang, Aram Galstyan, Ralph Weischedel, and Nanyun Peng, in The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2019.
      Full Text Code BibTeX Details
      @inproceedings{han2019deep,
        title = {Deep Structured Neural Network for Event Temporal Relation Extraction},
        author = {Han, Rujun and Hsu, I-Hung and Yang, Mu and Galstyan, Aram and Weischedel, Ralph and Peng, Nanyun},
        booktitle = {The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL)},
        year = {2019}
      }
      
      Details
    7. Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages

      Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, and Nanyun Peng, in The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2019.
      Full Text BibTeX Details
      @inproceedings{ahmad2019cross,
        title = {Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages},
        author = {Ahmad, Wasi Uddin and Zhang, Zhisong and Ma, Xuezhe and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL)},
        year = {2019}
      }
      
      Details
    8. Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation

      Xiao Huang, Li Dong, Elizabeth Boschee, and Nanyun Peng, in The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2019.
      Full Text Code Abstract BibTeX Details
      Named entity recognition (NER) identifies typed entity mentions in raw text. While the task is well-established, there is no universally used tagset: often, datasets are annotated for use in downstream applications and accordingly only cover a small set of entity types relevant to a particular task. For instance, in the biomedical domain, one corpus might annotate genes, another chemicals, and another diseases—despite the texts in each corpus containing references to all three types of entities. In this paper, we propose a deep structured model to integrate these “partially annotated” datasets to jointly identify all entity types appearing in the training corpora. By leveraging multiple datasets, the model can learn robust input representations; by building a joint structured model, it avoids potential conflicts caused by combining several models’ predictions at test time. Experiments show that the proposed model significantly outperforms strong multi-task learning baselines when training on multiple, partially annotated datasets and testing on datasets that contain tags from more than one of the training corpora.
      @inproceedings{huang2019learning,
        title = {Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation},
        author = {Huang, Xiao and Dong, Li and Boschee, Elizabeth and Peng, Nanyun},
        booktitle = {The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL)},
        year = {2019}
      }
      
      Details
    9. Pun Generation with Surprise

      He He, Nanyun Peng, and Percy Liang, in 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), 2019.
      Full Text BibTeX Details
      @inproceedings{he2019pun,
        title = {Pun Generation with Surprise},
        author = {He, He and Peng, Nanyun and Liang, Percy},
        booktitle = {2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019)},
        volume = {1},
        year = {2019}
      }
      
      Details
    10. On difficulties of cross-lingual transfer with order differences: A case study on dependency parsing

      Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, and Nanyun Peng, in Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
      Full Text BibTeX Details
      @inproceedings{ahmad2019difficulties,
        title = {On difficulties of cross-lingual transfer with order differences: A case study on dependency parsing},
        author = {Ahmad, Wasi Uddin and Zhang, Zhisong and Ma, Xuezhe and Hovy, Eduard and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
        year = {2019}
      }
      
      Details
    11. Plan-And-Write: Towards Better Automatic Storytelling

      Lili Yao, Nanyun Peng, Ralph Weischedel, Kevin Knight, Dongyan Zhao, and Rui Yan, in The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.
      Full Text BibTeX Details
      @inproceedings{yao2019plan,
        title = {Plan-And-Write: Towards Better Automatic Storytelling},
        author = {Yao, Lili and Peng, Nanyun and Weischedel, Ralph and Knight, Kevin and Zhao, Dongyan and Yan, Rui},
        booktitle = {The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)},
        year = {2019}
      }
      
      Details
    12. Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation

      Seraphina Goldfarb-Tarrant, Haining Feng, and Nanyun Peng, in 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), Demonstrations Track, 2019.
      Full Text Video Code Abstract BibTeX Details
      Story composition is a challenging problem for machines and even for humans. We present a neural narrative generation system that interacts with humans to generate stories. Our system has different levels of human interaction, which enables us to understand at what stage of story-writing human collaboration is most productive, both to improving story quality and human engagement in the writing process. We compare different varieties of interaction in story-writing, story-planning, and diversity controls under time constraints, and show that increased types of human collaboration at both planning and writing stages results in a 10-50% improvement in story quality as compared to less interactive baselines. We also show an accompanying increase in user engagement and satisfaction with stories as compared to our own less interactive systems and to previous turn-taking approaches to interaction. Finally, we find that humans tasked with collaboratively improving a particular characteristic of a story are in fact able to do so, which has implications for future uses of human-in-the-loop systems.
      @inproceedings{goldfarb2019plan,
        title = {Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation},
        author = {Goldfarb-Tarrant, Seraphina and Feng, Haining and Peng, Nanyun},
        booktitle = {2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), Demonstrations Track},
        volume = {4},
        pages = {89--97},
        year = {2019}
      }
      
      Details
    13. Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings

      Sarik Ghazarian, Johnny Tian-Zheng Wei, Aram Galstyan, and Nanyun Peng, in 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), NeuralGen Workshop, 2019.
      Full Text BibTeX Details
      @inproceedings{ghazarian2019better,
        title = {Better Automatic Evaluation of Open-Domain Dialogue Systems with Contextualized Embeddings},
        author = {Ghazarian, Sarik and Wei, Johnny Tian-Zheng and Galstyan, Aram and Peng, Nanyun},
        booktitle = {2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), NeuralGen Workshop},
        year = {2019}
      }
      
      Details
    14. Contextualized Word Embeddings Enhanced Event Temporal Relation Extraction for Story Understanding

      Rujun Han, Mengyue Liang, Bashar Alhafni, and Nanyun Peng, in 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), Workshop on Narrative Understanding, 2019.
      Full Text BibTeX Details
      @inproceedings{han2019contextualized,
        title = {Contextualized Word Embeddings Enhanced Event Temporal Relation Extraction for Story Understanding},
        author = {Han, Rujun and Liang, Mengyue and Alhafni, Bashar and Peng, Nanyun},
        booktitle = {2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2019), Workshop on Narrative Understanding},
        year = {2019}
      }
      
      Details
    15. Building deep learning models for evidence classification from the open access biomedical literature

      Gully A. Burns, Xiangci Li, and Nanyun Peng, Database, 2019.
      Full Text BibTeX Details
      @article{burns2019building,
        title = {Building deep learning models for evidence classification from the open access biomedical literature},
        author = {Burns, Gully A and Li, Xiangci and Peng, Nanyun},
        journal = {Database},
        year = {2019},
        publisher = {Oxford University Press}
      }
      
      Details
    16. Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

      Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, and Sanjeev Khudanpur, in The 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019.
      Full Text BibTeX Details
      @inproceedings{wang2019espresso,
        title = {Espresso: A Fast End-to-end Neural Speech Recognition Toolkit},
        author = {Wang, Yiming and Chen, Tongfei and Xu, Hainan and Ding, Shuoyang and Lv, Hang and Shao, Yiwen and Peng, Nanyun and Xie, Lei and Watanabe, Shinji and Khudanpur, Sanjeev},
        booktitle = {The 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
        year = {2019}
      }
      
      Details
    17. Evaluating and Enhancing the Robustness of Retrieval-Based Dialogue Systems with Adversarial Examples

      Jia Li, Chongyang Tao, Nanyun Peng, Wei Wu, Dongyan Zhao, and Rui Yan, in CCF International Conference on Natural Language Processing and Chinese Computing, 2019.
      Full Text BibTeX Details
      @inproceedings{li2019evaluating,
        title = {Evaluating and Enhancing the Robustness of Retrieval-Based Dialogue Systems with Adversarial Examples},
        author = {Li, Jia and Tao, Chongyang and Peng, Nanyun and Wu, Wei and Zhao, Dongyan and Yan, Rui},
        booktitle = {CCF International Conference on Natural Language Processing and Chinese Computing},
        pages = {142--154},
        year = {2019},
        organization = {Springer}
      }
      
      Details
    18. Debiasing Community Detection: The Importance of Lowly-Connected Nodes

      Ninareh Mehrabi, Fred Morstatter, Nanyun Peng, and Aram Galstyan, in The 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2019), 2019.
      Full Text BibTeX Details
      @inproceedings{mehrabi2019debiasing,
        title = {Debiasing Community Detection: The Importance of Lowly-Connected Nodes},
        author = {Mehrabi, Ninareh and Morstatter, Fred and Peng, Nanyun and Galstyan, Aram},
        booktitle = {The 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2019)},
        year = {2019}
      }
      
      Details

    2018

    1. Style Transfer in Text: Exploration and Evaluation

      Zhenxin Fu, Xiaoye Tan, Nanyun Peng, Dongyan Zhao, and Rui Yan, in Proceedings of The Thirty-Second Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI), 2018.
      Full Text BibTeX Details
      @inproceedings{fu2018style,
        title = {Style Transfer in Text: Exploration and Evaluation},
        author = {Fu, Zhenxin and Tan, Xiaoye and Peng, Nanyun and Zhao, Dongyan and Yan, Rui},
        booktitle = {Proceedings of The Thirty-Second Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence (AAAI)},
        year = {2018}
      }
      
      Details
    2. Towards controllable story generation

      Nanyun Peng, Marjan Ghazvininejad, Jonathan May, and Kevin Knight, in NAACL Workshop, 2018.
      Full Text BibTeX Details
      @inproceedings{peng2018towards,
        title = {Towards controllable story generation},
        author = {Peng, Nanyun and Ghazvininejad, Marjan and May, Jonathan and Knight, Kevin},
        booktitle = {NAACL Workshop},
        year = {2018}
      }
      
      Details
    3. Learning to Converse with Noisy Data: Generation with Calibration.

      Mingyue Shang, Zhenxin Fu, Nanyun Peng, Yansong Feng, Dongyan Zhao, and Rui Yan, in IJCAI, 2018.
      Full Text BibTeX Details
      @inproceedings{shang2018learning,
        title = {Learning to Converse with Noisy Data: Generation with Calibration.},
        author = {Shang, Mingyue and Fu, Zhenxin and Peng, Nanyun and Feng, Yansong and Zhao, Dongyan and Yan, Rui},
        booktitle = {IJCAI},
        pages = {4338--4344},
        year = {2018}
      }
      
      Details
    4. Scalable Construction and Reasoning of Massive Knowledge Bases

      Xiang Ren, Nanyun Peng, and William Yang Wang, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts, 2018.
      Full Text BibTeX Details
      @inproceedings{ren2018scalable,
        title = {Scalable Construction and Reasoning of Massive Knowledge Bases},
        author = {Ren, Xiang and Peng, Nanyun and Wang, William Yang},
        booktitle = {Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorial Abstracts},
        pages = {10--16},
        year = {2018}
      }
      
      Details
    5. Stack-pointer networks for dependency parsing

      Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, and Eduard Hovy, in The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), 2018.
      Full Text BibTeX Details
      @inproceedings{ma2018stack,
        title = {Stack-pointer networks for dependency parsing},
        author = {Ma, Xuezhe and Hu, Zecong and Liu, Jingzhou and Peng, Nanyun and Neubig, Graham and Hovy, Eduard},
        booktitle = {The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018)},
        volume = {1},
        year = {2018}
      }
      
      Details

    2017

    1. A multi-task learning approach to adapting bilingual word embeddings for cross-lingual named entity recognition

      Dingquan Wang, Nanyun Peng, and Kevin Duh, in Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2017.
      Full Text BibTeX Details
      @inproceedings{wang2017multi,
        title = {A multi-task learning approach to adapting bilingual word embeddings for cross-lingual named entity recognition},
        author = {Wang, Dingquan and Peng, Nanyun and Duh, Kevin},
        booktitle = {Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)},
        pages = {383--388},
        year = {2017}
      }
      
      Details
    2. Multi-task multi-domain representation learning for sequence tagging

      Nanyun Peng and Mark Dredze, in Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017.
      Full Text BibTeX Details
      @inproceedings{peng2017multi,
        title = {Multi-task multi-domain representation learning for sequence tagging},
        author = {Peng, Nanyun and Dredze, Mark},
        booktitle = {Proceedings of the 2nd Workshop on Representation Learning for NLP},
        year = {2017}
      }
      
      Details
    3. Jointly Learning Representations for Low-Resource Information Extraction

      Nanyun Peng, PhD thesis, Johns Hopkins University, 2017.
      Full Text BibTeX Details
      @phdthesis{peng2017jointly,
        title = {Jointly Learning Representations for Low-Resource Information Extraction},
        author = {Peng, Nanyun},
        year = {2017},
        school = {Johns Hopkins University}
      }
      
      Details
    4. Cross-sentence N-ary Relation Extraction with Graph LSTMs

      Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, and Wen-tau Yih, Transactions of the Association for Computational Linguistics, 2017.
      Full Text BibTeX Details
      @article{peng2017cross,
        title = {Cross-sentence N-ary Relation Extraction with Graph LSTMs},
        author = {Peng, Nanyun and Poon, Hoifung and Quirk, Chris and Toutanova, Kristina and Yih, Wen-tau},
        journal = {Transactions of the Association for Computational Linguistics},
        year = {2017}
      }
      
      Details
    5. Supplementary results for named entity recognition on Chinese social media with an updated dataset

      Nanyun Peng and Mark Dredze, technical report, 2017.
      Full Text BibTeX Details
      @techreport{peng2017supplementary,
        title = {Supplementary results for named entity recognition on Chinese social media with an updated dataset},
        author = {Peng, Nanyun and Dredze, Mark},
        year = {2017}
      }
      
      Details

    2016

    1. Graph long short term memory for syntactic relationship discovery

      Christopher Brian Quirk, Kristina Nikolova Toutanova, Wen-tau Yih, Hoifung Poon, and Nanyun Peng, 2016.
      BibTeX Details
      @misc{quirk2016graph,
        title = {Graph long short term memory for syntactic relationship discovery},
        author = {Quirk, Christopher Brian and Toutanova, Kristina Nikolova and Yih, Wen-tau and Poon, Hoifung and Peng, Nanyun},
        year = {2016}
      }
      
      Details
    2. Improving named entity recognition for Chinese social media with word segmentation representation learning

      Nanyun Peng and Mark Dredze, in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016.
      Full Text BibTeX Details
      @inproceedings{peng2016improving,
        title = {Improving named entity recognition for Chinese social media with word segmentation representation learning},
        author = {Peng, Nanyun and Dredze, Mark},
        booktitle = {Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics},
        year = {2016}
      }
      
      Details

    2015

    1. HLTCOE Participation in TAC KBP 2015: Cold Start and TEDL

      Taneeya Satyapanich, Tim Finin, Paul McNamee, James Mayfield, Doug Oard, Nanyun Peng, Ning Gao, Yiu-Chang Lin, Joshi MacKin, and Tim Dowd, UMBC Faculty Collection, 2015.
      BibTeX Details
      @article{satyapanich2015hltcoe,
        title = {HLTCOE Participation in TAC KBP 2015: Cold Start and TEDL},
        author = {Satyapanich, Taneeya and Finin, Tim and McNamee, Paul and Mayfield, James and Oard, Doug and Peng, Nanyun and Gao, Ning and Lin, Yiu-Chang and MacKin, Joshi and Dowd, Tim},
        journal = {UMBC Faculty Collection},
        year = {2015},
        publisher = {National Institute of Standards and Technology}
      }
      
      Details
    2. Named entity recognition for Chinese social media with jointly trained embeddings

      Nanyun Peng and Mark Dredze, in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015.
      Full Text BibTeX Details
      @inproceedings{peng2015named,
        title = {Named entity recognition for Chinese social media with jointly trained embeddings},
        author = {Peng, Nanyun and Dredze, Mark},
        booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
        pages = {548--554},
        year = {2015}
      }
      
      Details
    3. An Empirical Study of Chinese Name Matching and Applications

      Nanyun Peng, Mo Yu, and Mark Dredze, in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL), 2015.
      BibTeX Details
      @inproceedings{peng2015empirical,
        title = {An Empirical Study of Chinese Name Matching and Applications},
        author = {Peng, Nanyun and Yu, Mo and Dredze, Mark},
        booktitle = {Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL)},
        year = {2015}
      }
      
      Details
    4. Modeling word forms using latent underlying morphs and phonology

      Ryan Cotterell, Nanyun Peng, and Jason Eisner, Transactions of the Association for Computational Linguistics, 2015.
      Full Text BibTeX Details
      @article{cotterell2015modeling,
        title = {Modeling word forms using latent underlying morphs and phonology},
        author = {Cotterell, Ryan and Peng, Nanyun and Eisner, Jason},
        journal = {Transactions of the Association for Computational Linguistics},
        volume = {3},
        number = {1},
        year = {2015}
      }
      
      Details
    5. A Concrete Chinese NLP pipeline

      Nanyun Peng, Francis Ferraro, Mo Yu, Nicholas Andrews, Jay DeYoung, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, Benjamin Van Durme, and others, in Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, 2015.
      BibTeX Details
      @inproceedings{peng2015concrete,
        title = {A Concrete Chinese NLP pipeline},
        author = {Peng, Nanyun and Ferraro, Francis and Yu, Mo and Andrews, Nicholas and DeYoung, Jay and Thomas, Max and Gormley, Matthew R and Wolfe, Travis and Harman, Craig and Van Durme, Benjamin and others},
        booktitle = {Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations},
        pages = {86--90},
        year = {2015}
      }
      
      Details
    6. A Concrete Chinese NLP pipeline

      Nanyun Peng, Francis Ferraro, Mo Yu, Nicholas Andrews, Jay DeYoung, Max Thomas, Matt Gormley, Travis Wolfe, Craig Harman, Benjamin Van Durme, and others, North American Chapter of the Association for Computational Linguistics (NAACL), Demonstration Session, 2015.
      BibTeX Details
      @article{peng2015chinese,
        title = {A Concrete Chinese NLP pipeline},
        author = {Peng, Nanyun and Ferraro, Francis and Yu, Mo and Andrews, Nicholas and DeYoung, Jay and Thomas, Max and Gormley, Matt and Wolfe, Travis and Harman, Craig and Van Durme, Benjamin and others},
        journal = {North American Chapter of the Association for Computational Linguistics (NAACL), Demonstration Session},
        year = {2015}
      }
      
      Details
    7. HLTCOE participation in TAC KBP 2015: Cold start and TEDL

      Tim Finin, Dawn Lawrie, Paul McNamee, James Mayfield, Doug Oard, Nanyun Peng, Ning Gao, Yiu-Chang Lin, Joshi MacKin, Tim Dowd, and others, in Eighth Text Analysis Conference, 2015.
      BibTeX Details
      @inproceedings{finin2015hltcoe,
        title = {HLTCOE participation in TAC KBP 2015: Cold start and TEDL},
        author = {Finin, Tim and Lawrie, Dawn and McNamee, Paul and Mayfield, James and Oard, Doug and Peng, Nanyun and Gao, Ning and Lin, Yiu-Chang and MacKin, Joshi and Dowd, Tim and others},
        booktitle = {Eighth Text Analysis Conference},
        year = {2015}
      }
      
      Details
    8. Dual decomposition inference for graphical models over strings

      Nanyun Peng, Ryan Cotterell, and Jason Eisner, in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015.
      Full Text BibTeX Details
      @inproceedings{peng2015dual,
        title = {Dual decomposition inference for graphical models over strings},
        author = {Peng, Nanyun and Cotterell, Ryan and Eisner, Jason},
        booktitle = {Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing},
        pages = {917--927},
        year = {2015}
      }
      
      Details

    2014

    1. Stochastic Contextual Edit Distance and Probabilistic FSTs

      Ryan Cotterell, Nanyun Peng, and Jason Eisner, in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014.
      Full Text BibTeX Details
      @inproceedings{cotterell2014stochastic,
        title = {Stochastic Contextual Edit Distance and Probabilistic FSTs},
        author = {Cotterell, Ryan and Peng, Nanyun and Eisner, Jason},
        booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics},
        year = {2014}
      }
      
      Details
    2. Learning polylingual topic models from code-switched social media documents

      Nanyun Peng, Yiming Wang, and Mark Dredze, in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2014.
      Full Text BibTeX Details
      @inproceedings{peng2014learning,
        title = {Learning polylingual topic models from code-switched social media documents},
        author = {Peng, Nanyun and Wang, Yiming and Dredze, Mark},
        booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
        pages = {674--679},
        year = {2014}
      }
      
      Details

    2012

    1. On convergence rate of concave-convex procedure

      Ian E. H. Yen, Nanyun Peng, Po-Wei Wang, and Shou-De Lin, in Proceedings of the NIPS 2012 Optimization Workshop, 2012.
      BibTeX Details
      @inproceedings{yen2012convergence,
        title = {On convergence rate of concave-convex procedure},
        author = {Yen, Ian EH and Peng, Nanyun and Wang, Po-Wei and Lin, Shou-De},
        booktitle = {Proceedings of the NIPS 2012 Optimization Workshop},
        pages = {31--35},
        year = {2012}
      }
      
      Details
    2. Online Plagiarism Detection Through Exploiting Lexical, Syntax, and Semantic Information

      Wan-Yu Lin, Nanyun Peng, Chun-Chao Yen, and Shou-de Lin, in Proceedings of the ACL 2012 System Demonstrations, 2012.
      BibTeX Details
      @inproceedings{lin2012online,
        title = {Online Plagiarism Detection Through Exploiting Lexical, Syntax, and Semantic Information},
        author = {Lin, Wan-Yu and Peng, Nanyun and Yen, Chun-Chao and Lin, Shou-de},
        booktitle = {Proceedings of the ACL 2012 System Demonstrations},
        pages = {145--150},
        year = {2012}
      }
      
      Details
    3. Exploiting latent information to predict diffusions of novel topics on social networks

      Tsung-Ting Kuo, San-Chuan Hung, Wei-Shih Lin, Nanyun Peng, Shou-De Lin, and Wei-Fen Lin, in Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2012.
      BibTeX Details
      @inproceedings{kuo2012exploiting,
        title = {Exploiting latent information to predict diffusions of novel topics on social networks},
        author = {Kuo, Tsung-Ting and Hung, San-Chuan and Lin, Wei-Shih and Peng, Nanyun and Lin, Shou-De and Lin, Wei-Fen},
        booktitle = {Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
        pages = {344--348},
        year = {2012}
      }
      
      Details