Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, and Kai-Wei Chang, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Download the full text


Abstract

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these multilingual encoders do not precisely align words and phrases across languages. In particular, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to obtain for low-resource languages. An alternative is to make the multilingual encoders more robust: when fine-tuning the encoder on a downstream task, we train it to tolerate noise in the contextual embedding space so that even if the representations of different languages are not well aligned, the model can still achieve good performance on zero-shot cross-lingual transfer. In this work, we propose a learning strategy for training robust models by drawing connections between adversarial examples and the failure cases of zero-shot cross-lingual transfer. We adopt two widely used robust training methods, adversarial training and randomized smoothing, to train the desired robust model. The experimental results demonstrate that robust training improves zero-shot cross-lingual transfer on text classification tasks. The improvement is more significant in the generalized cross-lingual transfer setting, where the pair of input sentences belongs to two different languages.
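
To make the training strategy concrete, the following is a minimal sketch, not the authors' released code, of robust fine-tuning in the embedding space: Gaussian noise, in the spirit of randomized smoothing, is added to the input embeddings of a multilingual encoder during fine-tuning so that the classifier learns to tolerate cross-lingual misalignment; a gradient-based adversarial perturbation could be substituted. The model name, noise scale, and classification head are illustrative assumptions.

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# Multilingual encoder plus a simple classification head (illustrative choices).
encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
classifier = nn.Linear(encoder.config.hidden_size, 2)  # e.g., binary text classification
optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(classifier.parameters()), lr=2e-5
)
loss_fn = nn.CrossEntropyLoss()
noise_std = 0.1  # assumed perturbation scale

def robust_training_step(texts, labels):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    # Look up the input embeddings and perturb them with Gaussian noise so the
    # model learns to tolerate shifts in the contextual embedding space.
    embeds = encoder.get_input_embeddings()(batch["input_ids"])
    noisy_embeds = embeds + noise_std * torch.randn_like(embeds)
    outputs = encoder(inputs_embeds=noisy_embeds,
                      attention_mask=batch["attention_mask"])
    cls_repr = outputs.last_hidden_state[:, 0]  # [CLS] representation
    loss = loss_fn(classifier(cls_repr), torch.tensor(labels))
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Example usage on source-language (English) training data.
robust_training_step(["The movie was fantastic.", "A dull, predictable plot."], [1, 0])

At test time on a target language no noise is added; the intuition from the abstract is that target-language representations that are slightly misaligned with the source language fall within the noise the classifier was trained to tolerate.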


Bib Entry

@inproceedings{huang2021improving,
  title = {Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training},
  author = {Huang, Kuan-Hao and Ahmad, Wasi Uddin and Peng, Nanyun and Chang, Kai-Wei},
  booktitle = {The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year = {2021}
}

Related Publications

  1. Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

    Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, and Kai-Wei Chang, in The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.
  2. GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction

    Wasi Ahmad, Nanyun Peng, and Kai-Wei Chang, in The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21), 2021.
    Prevalent approaches in cross-lingual relation and event extraction use graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic representations such that models trained on one language can be applied to other languages. However, GCNs fall short in modeling long-range dependencies and disconnected words in the dependency tree. To address this challenge, we propose to utilize the self-attention mechanism, explicitly fusing structural information to learn the dependencies between words at different syntactic distances. We introduce GATE, a Graph Attention Transformer Encoder, and test its cross-lingual transferability on relation and event extraction tasks. We perform rigorous experiments on the widely used ACE05 dataset, which includes three typologically different languages: English, Chinese, and Arabic. The evaluation results show that GATE outperforms three recently proposed methods by a large margin. Our detailed analysis reveals that, due to its reliance on syntactic dependencies, GATE produces robust representations that facilitate transfer across languages. (A toy sketch of such a syntactic-distance attention bias appears after this publication list.)
    @inproceedings{ahmad2021gate,
      author = {Ahmad, Wasi and Peng, Nanyun and Chang, Kai-Wei},
      title = {GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction},
      booktitle = {The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)},
      year = {2021}
    }
    
  3. Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing

    Tao Meng, Nanyun Peng, and Kai-Wei Chang, in 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019.
    @inproceedings{meng2019target,
      title = {Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing},
      author = {Meng, Tao and Peng, Nanyun and Chang, Kai-Wei},
      booktitle = {2019 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
      year = {2019}
    }
    
  4. Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages

    Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, and Nanyun Peng, in The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2019.
    @inproceedings{ahmad2019cross,
      title = {Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages},
      author = {Ahmad, Wasi Uddin and Zhang, Zhisong and Ma, Xuezhe and Chang, Kai-Wei and Peng, Nanyun},
      booktitle = {The 2019 SIGNLL Conference on Computational Natural Language Learning (CoNLL)},
      year = {2019}
    }
    
  5. On Difficulties of Cross-lingual Transfer with Order Differences: A Case Study on Dependency Parsing

    Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Eduard Hovy, Kai-Wei Chang, and Nanyun Peng, in Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019.
    @inproceedings{ahmad2019difficulties,
      title = {On Difficulties of Cross-lingual Transfer with Order Differences: A Case Study on Dependency Parsing},
      author = {Ahmad, Wasi Uddin and Zhang, Zhisong and Ma, Xuezhe and Hovy, Eduard and Chang, Kai-Wei and Peng, Nanyun},
      booktitle = {Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
      year = {2019}
    }
    
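
The following is a toy sketch, not the released GATE implementation, of the syntactic-distance idea summarized in item 2 above: a single self-attention layer whose scores are biased by the pairwise shortest-path distance between words in the dependency tree, so that tokens far apart in the sentence but close in the tree can still attend to each other. The distance clipping and the scalar per-distance bias are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SyntaxBiasedAttention(nn.Module):
    """Self-attention with an additive bias indexed by syntactic distance."""

    def __init__(self, dim: int, max_dist: int = 8):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable scalar bias per (clipped) tree distance.
        self.dist_bias = nn.Embedding(max_dist + 1, 1)
        self.max_dist = max_dist
        self.scale = dim ** -0.5

    def forward(self, x, syn_dist):
        # x: (batch, seq, dim); syn_dist: (batch, seq, seq) shortest-path
        # distances between tokens in the dependency tree.
        scores = self.q(x) @ self.k(x).transpose(-2, -1) * self.scale
        bias = self.dist_bias(syn_dist.clamp(max=self.max_dist)).squeeze(-1)
        attn = F.softmax(scores + bias, dim=-1)
        return attn @ self.v(x)

# Tiny usage example with random tensors standing in for encoder states and a
# dependency-tree distance matrix.
x = torch.randn(1, 5, 16)
dist = torch.randint(1, 6, (1, 5, 5))
print(SyntaxBiasedAttention(16)(x, dist).shape)  # torch.Size([1, 5, 16])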