Stochastic Contextual Edit Distance and Probabilistic FSTs

Ryan Cotterell, Nanyun Peng, and Jason Eisner, in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014.


Abstract

Almost 30 years ago, Zhang and Shasha (1989) published a seminal paper describing an efficient dynamic programming algorithm for computing the tree edit distance, that is, the minimum number of node deletions, insertions, and replacements necessary to transform one tree into another. Since then, the tree edit distance has been widely applied, for example in biology and intelligent tutoring systems. However, the original paper of Zhang and Shasha can be challenging for newcomers to read, and it does not describe how to efficiently infer the optimal edit script. In this contribution, we provide a comprehensive tutorial on the tree edit distance algorithm of Zhang and Shasha. We further prove metric properties of the tree edit distance and describe efficient algorithms to infer the cheapest edit script, as well as a summary of all cheapest edit scripts between two trees.


Bib Entry

@inproceedings{cotterell2014stochastic,
  title = {Stochastic Contextual Edit Distance and Probabilistic FSTs},
  author = {Cotterell, Ryan and Peng, Nanyun and Eisner, Jason},
  booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics},
  year = {2014}
}

Related Publications